
Planet Python

Last update: December 24, 2025 07:44 AM UTC

December 23, 2025


PyCoder’s Weekly

Issue #714: Narwhals, Selenium, Testing Conundrum, and More (Dec. 23, 2025)

December 23, 2025 07:30 PM UTC


Reuven Lerner

Reuven’s 2025 in review

Can you believe that 2025 is almost over? It was full of big events for me, and yet it also whizzed past at breakneck speed. And so, before we start 2026, I want to share a whole bunch of updates on what I’ve done over the last 12 months — and where I plan to […]


December 23, 2025 02:44 PM UTC


Real Python

Reading User Input From the Keyboard With Python

Master taking user input in Python to build interactive terminal apps with clear prompts, solid error handling, and smooth multi-step flows.

December 23, 2025 02:00 PM UTC


Hugo van Kemenade

And now for something completely different

Starting in 2019, Python 3.8 and 3.9 release manager Łukasz Langa added a new section to the release notes called “And now for something completely different”, with a transcript of a Monty Python sketch.

For Python 3.10 and 3.11, the next release manager Pablo Galindo Salgado continued the section but included astrophysics facts.

For Python 3.12, the next RM Thomas Wouters shared poems (and took a break for 3.13).

And for Python 3.14, I’m doing all things π, pie and [mag]pie.

Here’s a collection of my different things for the first year (and a bit) of Python 3.14.

alpha 1

2024-10-15

π (or pi) is a mathematical constant, approximately 3.14, for the ratio of a circle’s circumference to its diameter. It is an irrational number, which means it cannot be written as a simple fraction of two integers. When written as a decimal, its digits go on forever without ever repeating a pattern.

Here are 76 digits of π:

3.141592653589793238462643383279502884197169399375105820974944592307816406286

Piphilology is the creation of mnemonics to help remember digits of π.

In a pi-poem, or “piem”, the number of letters in each word equals the corresponding digit. This covers 9 digits, 3.14159265:

How I wish I could recollect pi easily today!
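As a quick check, here’s a little Python to decode a piem (my snippet, not from the release notes; it assumes no word stands for a 0 or encodes a two-digit count):

import string

def piem_digits(piem: str) -> str:
    """Each word's letter count gives one digit of π."""
    counts = [
        sum(ch in string.ascii_letters for ch in word)
        for word in piem.split()
    ]
    return f"{counts[0]}." + "".join(str(c) for c in counts[1:])

print(piem_digits("How I wish I could recollect pi easily today!"))
# 3.14159265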

One of the most well-known covers 15 digits, 3.14159265358979:

How I want a drink, alcoholic of course, after the heavy chapters involving quantum mechanics!

Here’s a 35-word piem in the shape of a circle, 3.1415926535897932384626433832795728:

It’s a fact A ratio immutable Of circle round and width, Produces geometry’s deepest conundrum. For as the numerals stay random, No repeat lets out its presence, Yet it forever stretches forth. Nothing to eternity.

The Guinness World Record for memorising the most digits is held by Rajveer Meena, who recited 70,000 digits blindfolded in 2015. The unofficial record is held by Akira Haraguchi, who recited 100,000 digits in 2006.

alpha 2

2024-11-19

Ludolph van Ceulen (1540-1610) was a fencing and mathematics teacher in Leiden, Netherlands, and spent around 25 years calculating π (or pi), using essentially the same methods Archimedes had employed some eighteen hundred years earlier.

Archimedes estimated π by calculating the perimeters of polygons that fit just inside and outside of a circle, reasoning that the circumference of the circle lies between these two values. Archimedes went up to polygons with 96 sides, for a value between 3.1408 and 3.1428, which is accurate to two decimal places.

Van Ceulen used a polygon with half a billion sides. He published a 20-decimal value in his 1596 book Vanden Circkel (“On the Circle”), and later expanded it to 35 decimals:

3.14159265358979323846264338327950288

Van Ceulen’s 20 digits is more than enough precision for any conceivable practical purpose. For example, even if a printed circle were perfect down to the atomic scale, the thermal vibrations of the molecules of ink would make most of those digits physically meaningless. NASA Jet Propulsion Laboratory’s highest-accuracy calculations, for interplanetary navigation, use 15 decimals: 3.141592653589793.

At Van Ceulen’s request, his upper and lower bounds for π were engraved on his tombstone in Leiden. The tombstone was eventually lost but restored in 2000. In the Netherlands and Germany, π is sometimes referred to as the “Ludolphine number”, after Van Ceulen.

alpha 3

2024-12-17

A mince pie is a small, round covered tart filled with “mincemeat”, usually eaten during the Christmas season – the UK consumes some 800 million each Christmas. Mincemeat is a mixture of things like apple, dried fruits, candied peel and spices; originally it would have contained meat chopped small, but rarely does nowadays. They are often served warm with brandy butter.

According to the Oxford English Dictionary, the earliest mention of Christmas mince pies is by Thomas Dekker, writing in the aftermath of the 1603 London plague, in Newes from Graues-end: Sent to Nobody (1604):

Ten thousand in London swore to feast their neighbors with nothing but plum-porredge, and mince-pyes all Christmas.

Here’s a meaty recipe from Rare and Excellent Receipts, Experienc’d and Taught by Mrs Mary Tillinghast and now Printed for the Use of her Scholars Only (1678):

XV. How to make Mince-pies.

To every pound of Meat, take two pound of beef Suet, a pound of Corrants, and a quarter of an Ounce of Cinnamon, one Nutmeg, a little beaten Mace, some beaten Colves, a little Sack & Rose-water, two large Pippins, some Orange and Lemon peel cut very thin, and shred very small, a few beaten Carraway-seeds, if you love them the Juyce of half a Lemon squez’d into this quantity of meat; for Sugar, sweeten it to your relish; then mix all these together and fill your Pie. The best meat for Pies is Neats-Tongues, or a leg of Veal; you may make them of a leg of Mutton if you please; the meat must be parboyl’d if you do not spend it presently; but if it be for present use, you may do it raw, and the Pies will be the better.

alpha 4

2025-01-14

In Python, you can use Greek letters in identifiers, such as constants. For example:

from math import pi as π

def circumference(radius: float) -> float:
    return 2 * π * radius

print(circumference(6378.137)) # 40075.016685578485

alpha 5

2025-02-11

2025-01-29 marked the start of a new lunar year, the Year of the Snake 🐍 (and the Year of Python?).

For centuries, π was often approximated as 3 in China. Some time between the years 1 and 5 CE, astronomer, librarian, mathematician and politician Liu Xin (劉歆) calculated π as 3.154.

Around 130 CE, mathematician, astronomer, and geographer Zhang Heng (張衡, 78–139) compared the celestial circle with the diameter of the earth as 736:232 to get 3.1724. He also came up with a formula for the ratio between a cube and inscribed sphere as 8:5, implying the ratio of a square’s area to an inscribed circle is √8:√5. From this, he calculated π as √10 (~3.162).

Third century mathematician Liu Hui (刘徽) came up with an algorithm for calculating π iteratively: calculate the area of a polygon inscribed in a circle, then as the number of sides of the polygon is increased, the area becomes closer to that of the circle, from which you can approximate π.

This algorithm is similar to the method used by Archimedes in the 3rd century BCE and Ludolph van Ceulen in the 16th century CE (see 3.14.0a2 release notes), but Archimedes only went up to a 96-sided polygon (96-gon). Liu Hui went up to a 192-gon to approximate π as 157/50 (3.14) and later a 3072-gon for 3.14159.

Liu Hui wrote a commentary on the book The Nine Chapters on the Mathematical Art which included his π approximations.

In the fifth century, astronomer, inventor, mathematician, politician, and writer Zu Chongzhi (祖沖之, 429–500) used Liu Hui’s algorithm to inscribe a 12,288-gon to compute π between 3.1415926 and 3.1415927, correct to seven decimal places. This was more accurate than Hellenistic calculations and wouldn’t be improved upon for 900 years.

Happy Year of the Snake!

alpha 6

2025-03-14

March 14 is celebrated as pi day, because 3.14 is an approximation of π. The day is observed by eating pies (savoury and/or sweet) and celebrating π. The first pi day was organised by physicist and tinkerer Larry Shaw of the San Francisco Exploratorium in 1988. It is also the International Day of Mathematics and Albert Einstein’s birthday. Let’s all eat some pie, recite some π, install and test some py, and wish a happy birthday to Albert, Loren and all the other pi day children!

alpha 7

2025-04-08

On Saturday, 5th April, 3.141592653589793 months of the year had elapsed.

beta 1

2025-05-07

The mathematical constant pi, written with the Greek letter π, is the ratio of a circle’s circumference to its diameter. The first person to use π as a symbol for this ratio was the Welsh self-taught mathematician William Jones in 1706. He was a farmer’s son, born in Llanfihangel Tre’r Beirdd on Anglesey (Ynys Môn) in 1675, and only received a basic education at a local charity school. However, the owner of his parents’ farm noticed his mathematical ability and arranged for him to move to London to work in a bank.

By age 20, he was serving at sea in the Royal Navy, teaching sailors mathematics and helping with the ship’s navigation. On his return to London seven years later, he became a maths teacher in coffee houses and a private tutor. In 1706, Jones published Synopsis Palmariorum Matheseos, which used the symbol π for the ratio of a circle’s circumference to diameter (hunt for it on pages 243 and 263 or here). Jones was also the first person to realise π is an irrational number, meaning it can be written as a decimal number that goes on forever, but cannot be written as a fraction of two integers.

But why π? It’s thought Jones used the Greek letter π because it’s the first letter in perimetron or perimeter. Jones was the first to use π for our familiar ratio but wasn’t the first to use it as part of a ratio. William Oughtred, in his 1631 Clavis Mathematicae (The Key of Mathematics), used π/δ to represent what we now call pi. His π was the circumference, not the ratio of circumference to diameter. James Gregory, in his 1668 Geometriae Pars Universalis (The Universal Part of Geometry), used π/ρ instead, where ρ is the radius, making the ratio 6.28… or τ. After Jones, Leonhard Euler used π for 6.28… and p for 3.14… before settling on and popularising π for the famous ratio.

beta 2

2025-05-26

In 1897, the State of Indiana almost passed a bill defining π as 3.2.

Of course, it’s not that simple.

Edwin J. Goodwin, M.D., claimed to have come up with a solution to an ancient geometrical problem called squaring the circle, first proposed in Greek mathematics. It involves trying to draw a circle and a square with the same area, using only a compass and a straight edge. It turns out to be impossible because π is transcendental (and this had been proved just 13 years earlier by Ferdinand von Lindemann), but Goodwin fudged things so the value of π was 3.2 (his writings included at least nine different values of π, including 4, 3.236, 3.232, 3.2325… and even 9.2376…).

Goodwin had copyrighted his proof and offered it to the State of Indiana to use in their educational textbooks without paying royalties, provided they endorsed it. And so Indiana Bill No. 246 was introduced to the House on 18th January 1897. It was not understood and was initially referred to the House Committee on Canals, also called the Committee on Swamp Lands. They then referred it to the Committee on Education, who duly recommended on 2nd February that “said bill do pass”. It passed its second reading on the 5th, and the education chair moved that they suspend the constitutional rule that required bills to be read on three separate days. This passed 72-0, and the bill itself passed 67-0.

The bill was referred to the Senate on 10th February, had its first reading on the 11th, and was referred to the Committee on Temperance, whose chair on the 12th recommended “that said bill do pass”.

A mathematics professor, Clarence Abiathar Waldo, happened to be in the State Capitol on the day the House passed the bill and walked in during the debate to hear an ex-teacher argue:

The case is perfectly simple. If we pass this bill which establishes a new and correct value for pi, the author offers to our state without cost the use of his discovery and its free publication in our school text books, while everyone else must pay him a royalty.

Waldo ensured the senators were “properly coached”; and on the 12th, during the second reading, after an unsuccessful attempt to amend the bill, it was postponed indefinitely. But not before the senators had some fun.

The Indiana News reported on the 13th:

…the bill was brought up and made fun of. The Senators made bad puns about it, ridiculed it and laughed over it. The fun lasted half an hour. Senator Hubbell said that it was not meet for the Senate, which was costing the State $250 a day, to waste its time in such frivolity. He said that in reading the leading newspapers of Chicago and the East, he found that the Indiana State Legislature had laid itself open to ridicule by the action already taken on the bill. He thought consideration of such a proposition was not dignified or worthy of the Senate. He moved the indefinite postponement of the bill, and the motion carried.

beta 3

2025-06-17

If you’re heading out to sea, remember the Maritime Approximation:

π mph = e knots

beta 4

2025-07-08

All this talk of π and yet some say π is wrong. Tau Day (June 28th, 6/28 in the US) celebrates τ, the ratio of a circle’s circumference to its radius (C/r = 6.283185…), as the “true circle constant”. The Tau Manifesto declares π “a confusing and unnatural choice for the circle constant”, in part because “2π occurs with astonishing frequency throughout mathematics”.

If you wish to embrace τ, the good news is that PEP 628 added math.tau to Python 3.6 in 2016:

When working with radians, it is trivial to convert any given fraction of a circle to a value in radians in terms of tau. A quarter circle is tau/4, a half circle is tau/2, seven 25ths is 7*tau/25, etc. In contrast with the equivalent expressions in terms of pi (pi/2, pi, 14*pi/25), the unnecessary and needlessly confusing multiplication by two is gone.
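A tiny demonstration (my snippet, not the PEP’s):

import math

quarter_circle = math.tau / 4         # radians in a quarter circle
print(quarter_circle == math.pi / 2)  # True
print(math.tau == 2 * math.pi)        # True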

release candidate 1

2025-07-22

Today, 22nd July, is Pi Approximation Day, because 22/7 is a common approximation of π and closer to π than 3.14.

22/7 is a Diophantine approximation, named after Diophantus of Alexandria (3rd century CE), which is a way of estimating a real number as a ratio of two integers. 22/7 has been known since antiquity; Archimedes (3rd century BCE) wrote the first known proof that 22/7 overestimates π, by comparing a 96-sided polygon to the circle it circumscribes.

Another approximation is 355/113. In Chinese mathematics, 22/7 and 355/113 are respectively known as Yuelü (约率; yuēlǜ; “approximate ratio”) and Milü (密率; mìlǜ; “close ratio”).
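A quick comparison in Python (my snippet, not from the release notes):

from fractions import Fraction
import math

for approx in (Fraction(22, 7), Fraction(355, 113)):
    error = abs(float(approx) - math.pi)
    print(f"{approx} = {float(approx):.10f}, off by {error:.2e}")

# 22/7 = 3.1428571429, off by 1.26e-03
# 355/113 = 3.1415929204, off by 2.67e-07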

Happy Pi Approximation Day!

release candidate 2

2025-08-14

The magpie, Pica pica in Latin, is a black and white bird in the crow family, known for its chattering call.

The first-known use in English is from a 1589 poem, where magpie is spelled “magpy” and cuckoo is “cookow”:

Th[e]y fly to wood like breeding hauke, And leave old neighbours loue, They pearch themselves in syluane lodge, And soare in th’ aire aboue. There : magpy teacheth them to chat, And cookow soone doth hit them pat.

The name comes from Mag, short for Margery or Margaret (compare robin redbreast, jenny wren, and its corvid relative jackdaw); and pie, a magpie or other bird with black and white (or pied) plumage. The sea-pie (1552) is the oystercatcher; the grey pie (1678) and murdering pie (1688) are the great grey shrike. Other birds include the yellow and black pie, red-billed pie, wandering tree-pie, and river pie. The rain-pie, wood-pie and French pie are woodpeckers.

Pie on its own dates to before 1225, and comes from the Latin name for the bird, pica.

release candidate 3

2025-09-18

According to Pablo Galindo Salgado at PyCon Greece:

There are things that are supercool indeed, like for instance, this is one of the results that I’m more proud about. This equation over here, which you don’t need to understand, you don’t need to be scared about, but this equation here tells what is the maximum time that it takes for a ray of light to fall into a black hole. And as you can see the math is quite complicated but the answer is quite simple: it’s 2π times the mass of the black hole. So if you normalise by the mass of the black hole, the answer is 2π. And because there is nothing specific about your election of things in this formula, this formula is universal. It means it doesn’t depend on anything other than nature itself. Which means that you can use this as a definition of π. This is a valid alternative definition of the number π. It’s literally half the maximum time it takes to fall into a black hole, which is kind of crazy. So next time someone asks you what π means you can just drop this thing and impress them quite a lot. Maybe Hugo could use this information to put it into the release notes of πthon [yes, I can, thank you!].

3.14.0 (final)

2025-10-07

Edgar Allan Poe died on 7th October 1849.

As we all recall from 3.14.0a1, piphilology is the creation of mnemonics to help memorise the digits of π, and the number of letters in each word of a pi-poem (or “piem”) successively corresponds to the digits of π.

In 1995, Mike Keith, an American mathematician and author of constrained writing, retold Poe’s The Raven as a 740-word piem. Here are the first two stanzas of Near A Raven:

Poe, E. Near a Raven

Midnights so dreary, tired and weary. Silently pondering volumes extolling all by-now obsolete lore. During my rather long nap - the weirdest tap! An ominous vibrating sound disturbing my chamber’s antedoor. “This”, I whispered quietly, “I ignore”.

Perfectly, the intellect remembers: the ghostly fires, a glittering ember. Inflamed by lightning’s outbursts, windows cast penumbras upon this floor. Sorrowful, as one mistreated, unhappy thoughts I heeded: That inimitable lesson in elegance - Lenore - Is delighting, exciting…nevermore.

3.14.1

2025-12-02

Seki Takakazu (関 孝和; c. March 1642 – December 5, 1708) was a Japanese mathematician and samurai who laid the foundations of Japanese mathematics, later known as wasan (和算, from wa (“Japanese”) and san (“calculation”)).

Seki was a contemporary of Isaac Newton and Gottfried Leibniz but worked independently. He created a new algebraic system, worked on infinitesimal calculus, and is credited with the discovery of Bernoulli numbers (before Bernoulli’s birth).

Seki also calculated π to 11 decimal places using a polygon with 131,072 sides inscribed within a circle, using an acceleration method now known as Aitken’s delta-squared process, which was rediscovered by Alexander Aitken in 1926.
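Here’s a minimal sketch of the two ingredients in Python (an illustration of the idea, not Seki’s actual computation):

import math

def aitken(seq):
    """Aitken's delta-squared acceleration of a convergent sequence."""
    return [
        a - (b - a) ** 2 / (c - 2 * b + a)
        for a, b, c in zip(seq, seq[1:], seq[2:])
    ]

# Half-perimeters of regular polygons inscribed in a unit circle,
# doubling the number of sides each step (the side-doubling used by
# Archimedes, Liu Hui and Van Ceulen), starting from a hexagon.
estimates = []
side, n = 1.0, 6
for _ in range(8):
    estimates.append(n * side / 2)
    side = math.sqrt(2 - math.sqrt(4 - side * side))
    n *= 2

print(estimates[-1])          # plain 768-gon estimate: 3.14158...
print(aitken(estimates)[-1])  # accelerated: agrees with math.pi to ~9 digits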


Header photo: A scan of Seki Takakazu’s posthumous Katsuyō Sanpō (1712) showing calculations of π.

December 23, 2025 01:03 PM UTC


Real Python

Quiz: Recursion in Python: An Introduction

Test your understanding of recursion in Python, including base cases, recursive structure, performance considerations, and common use cases.

December 23, 2025 12:00 PM UTC


"Michael Kennedy's Thoughts on Technology"

Python Supply Chain Security Made Easy

Maybe you’ve heard that hackers have been trying to take advantage of open source software to inject code into your machine and, in the worst case, even into the machines of the consumers of your libraries or applications. In this quick post, I’ll show you how to integrate Python’s “official” package scanning technology directly into your continuous integration and your project’s unit tests. While pip-audit is maintained in part by Trail of Bits with support from Google, it’s part of the PyPA organization.

Why this matters

Here are 5 recent, high-danger PyPI supply chain attacks where “pip install” can turn into “pip install a backdoor.” Afterwards, we talk about how to scan for these and prevent them from making it to your users.

Compromised ultralytics releases delivered the XMRig coinminer

What happened: A malicious version (8.3.41) of the widely-used ultralytics package was published to PyPI, containing code that downloaded the XMRig coinminer. Follow-on versions also carried the malicious downloader, and the writeup attributes the initial compromise to a GitHub Actions script injection, plus later abuse consistent with a stolen PyPI API token. Source: ReversingLabs

Campaign of fake packages stealing cloud access tokens, 14,100+ downloads before removal

What happened: Researchers reported multiple bogus PyPI libraries (including “time-related utilities”) designed to exfiltrate cloud access tokens, with the campaign exceeding 14,100 downloads before takedown. If those tokens are real, this can turn into cloud account takeover. Source: The Hacker News

Typosquatting and name-confusion targeting colorama, with remote control and data theft payloads

What happened: A campaign uploaded lookalike package names to PyPI to catch developers intending to install colorama, with payloads described as enabling persistent remote access/remote control plus harvesting and exfiltration of sensitive data. High danger mainly because colorama is popular and typos happen. Source: Checkmarx

PyPI credential-phishing led to real account compromise and malicious releases of a legit project (num2words)

What happened: PyPI reported an email phishing campaign using a lookalike domain; 4 accounts were successfully phished, attacker-generated API tokens were revoked, and malicious releases of num2words were uploaded then removed. This is the “steal maintainer creds, ship malware via trusted package name” playbook. Source: Python Package Index Blog

SilentSync RAT delivered via malicious PyPI packages (sisaws, secmeasure)

What happened: Zscaler documented malicious packages (including typosquatting) that deliver a Python-based remote access trojan (RAT) with command execution, file exfiltration, screen capture, and browser data theft (credentials, cookies, etc.). Source: Zscaler

Integrating pip-audit

Those are definitely scary situations. I’m sure you’ve heard about typo squatting and how annoying that can be. Caution will save you there. Where caution will not save you is when a legitimate package has its supply chain taken over. Often it looks like this: a package you use depends on another package whose maintainer was phished, and now everything that uses that library carries the vulnerability forward.

Enter pip-audit.

pip-audit is great because you can just run it on the command line. It will check against PyPA’s official list of vulnerabilities and tell you if anything in your virtual environment or requirements files has a known vulnerability or is known to be malicious.

You could even set up a GitHub Action to do so, and I wouldn’t recommend against that at all. But it’s also valuable to make this check happen on developers’ machines. It’s a simple two-step process to do so:

  1. Add pip-audit to your project’s development dependencies or install it globally with uv tool install pip-audit.
  2. Create a unit test that simply shells out to execute pip-audit and fails the test if an issue is found.

Part one’s easy. Part two takes a little bit more work. That’s okay, because I got it for you. Just download the file here and drop it in your pytest test directory:

test_pypi_security_audit.py

Here’s a small segment to give you a sense of what’s involved.

import subprocess
import sys

import pytest


def test_pip_audit_no_vulnerabilities():
    # setup (locating project_root, etc.) ...
    # Run pip-audit with JSON output for easier parsing
    try:
        result = subprocess.run(
            [
                sys.executable,
                '-m',
                'pip_audit',
                '--format=json',
                '--progress-spinner=off',
                '--ignore-vuln',
                'CVE-2025-53000', # example of skipping an irrelevant cve
                '--skip-editable', # don't test your own package in dev
            ],
            cwd=project_root,
            capture_output=True,
            text=True,
            timeout=120,  # 2 minute timeout
        )
    except subprocess.TimeoutExpired:
        pytest.fail('pip-audit command timed out after 120 seconds')
    except FileNotFoundError:
        pytest.fail('pip-audit not installed or not accessible')

That’s it! When anything runs your unit tests, whether that’s continuous integration, a git hook, or just a developer testing their code, you’ll also get a pip-audit scan of your project.

Let others find out

Now, pip-audit tests whether a malicious package has been installed, in which case, for that poor developer or machine, it may already be too late. If it’s CI, who cares? But one other really nice feature you can combine with this is uv’s ability to put a delay on upgrading your dependencies.

Many developers, myself included, typically run some kind of command that pins versions. Periodically, we also run a command that looks for newer libraries and updates the pinned versions so we’re using the latest code. This way, you upgrade in a stair-step manner, at the time you intend to change versions.

This works great. However, what if the malicious version of a package is released five minutes before you run this command? You’re getting it installed. But pretty soon, the community is going to find out that something is afoot, report it, and it will be yanked from PyPI. Here, bad timing got you hacked.

While it’s not a guaranteed solution, defense in depth would tell us to wait a few days before installing a new package. But you don’t want to review packages manually one by one, do you? For example, for Talk Python Training, we have over 200 packages for that website. It would be an immense hassle to verify the dates of each one and manually pick the versions.

No need! We can just add a simple delay to our uv command:

uv pip compile requirements.piptools --upgrade --output-file requirements.txt --exclude-newer "1 week"

In particular, notice --exclude-newer “1 week”. The exact duration isn’t the important thing. It’s about building a little delay into your workflow so that issues have time to be reported. You can read about the full feature here. This way, we only incorporate packages that have survived in public on PyPI for at least one week.

Hope this helps. Stay safe out there.

December 23, 2025 12:16 AM UTC


Armin Ronacher

Advent of Slop: A Guest Post by Claude

December 23, 2025 12:00 AM UTC

December 22, 2025


EuroPython Society

EPS Board 2025-2026

We’re happy to announce our new board for the 2025-2026 term:

You can read more about them in their nomination post at https://www.europython-society.org/list-of-eps-board-candidates-for-2025-2026/. The minutes and […]

December 22, 2025 11:21 PM UTC


Giampaolo Rodola

C heap introspection in psutil

Memory leaks in Python are often straightforward to diagnose. Just look at RSS, track Python object counts, follow reference graphs. But leaks inside C extension modules are another story. Traditional memory metrics such as RSS and VMS frequently fail to reveal them because Python's memory allocator sits above the platform's …

December 22, 2025 11:00 PM UTC


Real Python

SOLID Design Principles: Improve Object-Oriented Code in Python

Learn how to apply SOLID design principles in Python and build maintainable, reusable, and testable object-oriented code.

December 22, 2025 02:00 PM UTC

Quiz: SOLID Design Principles: Improve Object-Oriented Code in Python

Learn Liskov substitution in Python. Spot Square and Rectangle pitfalls and design safer APIs with polymorphism. Test your understanding now.

December 22, 2025 12:00 PM UTC


Nicola Iarocci

Rediscovering a 2021 podcast on Python, .NET, and open source

Yesterday, the kids came home for the Christmas holidays. Marco surprised me by telling me that on his flight from Brussels, he discovered and listened to “my podcast” on Spotify. I was stunned. I didn’t remember ever recording a podcast, even though I’ve given a few interviews here and there over the years.

During my usual morning walk today, I went to look for it, and there it was, an interview I had done in 2021 that I had completely forgotten about. I got over the initial embarrassment (it’s always strange to hear your own voice) and resisted the temptation to turn it off, listening to it all the way through. I must admit that it captures that moment in my professional life, and much of the content is still relevant, especially regarding my experience as an open-source author and maintainer and my transition from C# to Python and back.

December 22, 2025 09:49 AM UTC


Python Bytes

#463 2025 is @wrapped

Topics include Has the cost of building software just dropped 90%?, How FOSS Won and Why It Matters, and more.

December 22, 2025 08:00 AM UTC


Zato Blog

Modern REST API Tutorial in Python


Great APIs don't win theoretical arguments - they just prefer to work reliably and to make developers' lives easier.

Here's a tutorial on what building production APIs is really about: creating interfaces that are practical in usage, while keeping your systems maintainable for years to come.

Sound intriguing? Read the modern REST API tutorial in Python here.

Modern REST API tutorial in Python

More resources

➤ Python API integration tutorials
What is a Network Packet Broker? How to automate networks in Python?
What is an integration platform?
Python Integration platform as a Service (iPaaS)
What is an Enterprise Service Bus (ESB)? What is SOA?
Open-source iPaaS in Python

December 22, 2025 03:00 AM UTC


Armin Ronacher

A Year Of Vibes

December 22, 2025 12:00 AM UTC


Seth Michael Larson

PEP 770 Software Bill-of-Materials (SBOM) data from PyPI, Fedora, and Red Hat

December 22, 2025 12:00 AM UTC

December 21, 2025


Ned Batchelder

Generating data shapes with Hypothesis

In my last blog post (A testing conundrum), I described trying to test my Hasher class which hashes nested data. I couldn’t get Hypothesis to generate usable data for my test. I wanted to assert that two equal data items would hash equally, but Hypothesis was finding pairs like [0] and [False]. These are equal but hash differently because the hash takes the types into account.

In the blog post I said,

If I had a schema for the data I would be comparing, I could use it to steer Hypothesis to generate realistic data. But I don’t have that schema...

I don’t want a fixed schema for the data Hasher would accept, but tests that compare data generated from the same schema. It shouldn’t compare a list of ints to a list of bools. Hypothesis is good at generating things randomly. Usually it generates data randomly, but we can also use it to generate schemas randomly!

Hypothesis basics

Before describing my solution, I’ll take a quick detour to describe how Hypothesis works.

Hypothesis calls their randomness machines “strategies”. Here is a strategy that will produce random integers between -99 and 1000:

import hypothesis.strategies as st
st.integers(min_value=-99, max_value=1000)

Strategies can be composed:

st.lists(st.integers(min_value=-99, max_value=1000), max_size=50)

This will produce lists of integers from -99 to 1000. The lists will have up to 50 elements.

Strategies are used in tests with the @given decorator, which takes a strategy and runs the test a number of times with different example data drawn from the strategy. In your test you check a desired property that holds true for any data the strategy can produce.

To demonstrate, here’s a test of sum() that checks that summing a list of numbers in two halves gives the same answer as summing the whole list:

from hypothesis import given, strategies as st

@given(st.lists(st.integers(min_value=-99, max_value=1000), max_size=50))
def test_sum(nums):
    # We don't have to test sum(), this is just an example!
    mid = len(nums) // 2
    assert sum(nums) == sum(nums[:mid]) + sum(nums[mid:])

By default, Hypothesis will run the test 100 times, each with a different randomly generated list of numbers.

Schema strategies

The solution to my data comparison problem is to have Hypothesis generate a random schema in the form of a strategy, then use that strategy to generate two examples. Doing this repeatedly will get us pairs of data that have the same “shape” that will work well for our tests.

This is kind of twisty, so let’s look at it in pieces. We start with a list of strategies that produce primitive values:

primitives = [
    st.none(),
    st.booleans(),
    st.integers(min_value=-1000, max_value=10_000_000),
    st.floats(min_value=-100, max_value=100),
    st.text(max_size=10),
    st.binary(max_size=10),
]

Then a list of strategies that produce hashable values, which are all the primitives, plus tuples of any of the primitives:

def tuples_of(elements):
    """Make a strategy for tuples of some other strategy."""
    return st.lists(elements, max_size=3).map(tuple)

# List of strategies that produce hashable data.
hashables = primitives + [tuples_of(s) for s in primitives]

We want to be able to make nested dictionaries with leaves of some other type. This function takes a leaf-making strategy and produces a strategy to make those dictionaries:

def nested_dicts_of(leaves):
    """Make a strategy for recursive dicts with leaves from another strategy."""
    return st.recursive(
        leaves,
        lambda children: st.dictionaries(st.text(max_size=10), children, max_size=3),
        max_leaves=10,
    )

Finally, here’s our strategy that makes schema strategies:

nested_data_schemas = st.recursive(
    st.sampled_from(primitives),
    lambda children: st.one_of(
        children.map(lambda s: st.lists(s, max_size=5)),
        children.map(tuples_of),
        st.sampled_from(hashables).map(lambda s: st.sets(s, max_size=10)),
        children.map(nested_dicts_of),
    ),
    max_leaves=3,
)

For debugging, it’s helpful to generate an example strategy from this strategy, and then an example from that, many times:

for _ in range(50):
    print(repr(nested_data_schemas.example().example()))

Hypothesis is good at making data we’d never think to try ourselves. Here is some of what it made:

[None, None, None, None, None]
{}
[{False}, {False, True}, {False, True}, {False, True}]
{(1.9, 80.64553337755876), (-41.30770818038395, 9.42967906108538, -58.835811641800085), (31.102786990742203,), (28.2724197133397, 6.103515625e-05, -84.35107066147154), (7.436329211943294e-263,), (-17.335739410320514, 1.5029061311609365e-292, -8.17077562035881), (-8.029363284353857e-169, 49.45840191722425, -15.301768150196054), (5.960464477539063e-08, 1.1518373121077722e-213), (), (-0.3262457914511714,)}
[b'+nY2~\xaf\x8d*\xbb\xbf', b'\xe4\xb5\xae\xa2\x1a', b'\xb6\xab\xafEi\xc3C\xab"\xe1', b'\xf0\x07\xdf\xf5\x99', b'2\x06\xd4\xee-\xca\xee\x9f\xe4W']
{'fV': [81.37177374286324, 3.082323424992609e-212, 3.089885728465406e-151, -9.51475773638932e-86, -17.061851038597922], 'J»\x0c\x86肭|\x88\x03\x8aU': [29.549966208819654]}
[{}, -68.48316192397687]
None
['\x85\U0004bf04°', 'pB\x07iQT', 'TRUE', '\x1a5ùZâ\U00048752\U0005fdf8ê', '\U000fe0b9m*¤\U000b9f1e']
(14.232866652585258, -31.193835515904652, 62.29850355163285)
{'': {'': None, '\U000be8de§\nÈ\U00093608u': None, 'Y\U000709e4¥ùU)GE\U000dddc5¬': None}}
[{(), (b'\xe7', b'')}, {(), (b'l\xc6\x80\xdf\x16\x91', b'', b'\x10,')}, {(b'\xbb\xfb\x1c\xf6\xcd\xff\x93\xe0\xec\xed',), (b'g',), (b'\x8e9I\xcdgs\xaf\xd1\xec\xf7', b'\x94\xe6#', b'?\xc9\xa0\x01~$k'), (b'r', b'\x8f\xba\xe6\xfe\x92n\xc7K\x98\xbb', b'\x92\xaa\xe8\xa6s'), (b'f\x98_\xb3\xd7', b'\xf4+\xf7\xbcU8RV', b'\xda\xb0'), (b'D',), (b'\xab\xe9\xf6\xe9', b'7Zr\xb7\x0bl\xb6\x92\xb8\xad', b'\x8f\xe4]\x8f'), (b'\xcf\xfb\xd4\xce\x12\xe2U\x94mt',), (b'\x9eV\x11', b'\xc5\x88\xde\x8d\xba?\xeb'), ()}, {(b'}', b'\xe9\xd6\x89\x8b')}, {(b'\xcb`', b'\xfd', b'w\x19@\xee'), ()}]
((), (), ())

Finally writing the test

Time to use all of this in a test:

@given(nested_data_schemas.flatmap(lambda s: st.tuples(s, s)))
def test_same_schema(data_pair):
    data1, data2 = data_pair
    h1, h2 = Hasher(), Hasher()
    h1.update(data1)
    h2.update(data2)
    if data1 == data2:
        assert h1.digest() == h2.digest()
    else:
        # Strictly speaking, unequal data could produce equal hashes,
        # but it's very unlikely, so test for it anyway.
        assert h1.digest() != h2.digest()

Here I use the .flatmap() method to draw an example from the nested_data_schemas strategy and call the provided lambda with the drawn example, which is itself a strategy. The lambda uses st.tuples to make tuples with two examples drawn from the strategy. So we get one data schema, and two examples from it as a tuple passed into the test as data_pair. The test then unpacks the data, hashes them, and makes the appropriate assertion.

This works great: the tests pass. To check that the test was working well, I made some breaking tweaks to the Hasher class. If Hypothesis is configured to generate enough examples, it finds data examples demonstrating the failures.

I’m pleased with the results. Hypothesis is something I’ve been wanting to use more, so I’m glad I took this chance to learn more about it and get it working for these tests. To be honest, this is way more than I needed to test my Hasher class. But once I got started, I wanted to get it right, and learning is always good.

I’m a bit concerned that the standard setting (100 examples) isn’t enough to find the planted bugs in Hasher. There are many parameters in my strategies that could be tweaked to keep Hypothesis from wandering too broadly, but I don’t know how to decide what to change.
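If you want to turn that dial, Hypothesis exposes it through its settings decorator. A minimal sketch (the example count here is an arbitrary choice, not a recommendation):

from hypothesis import given, settings

@settings(max_examples=2_000)  # the default profile runs 100 examples
@given(nested_data_schemas.flatmap(lambda s: st.tuples(s, s)))
def test_same_schema_thorough(data_pair):
    ...  # same body as test_same_schema above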

Actually

The code in this post is different than the actual code I ended up with. Mostly this is because I was working on the code while I was writing this post, and discovered some problems that I wanted to fix. For example, the tuples_of function makes homogeneous tuples: varying lengths with elements all of the same type. This is not the usual use of tuples (see Lists vs. Tuples). Adapting for heterogeneous tuples added more complexity, which was interesting to learn, but I didn’t want to go back and add it here.

You can look at the final strategies.py to see that and other details, including type hints for everything, which was a journey of its own.

Postscript: AI assistance

I would not have been able to come up with all of this by myself. Hypothesis is very powerful, but requires a new way of thinking about things. It’s twisty to have functions returning strategies, and especially strategies producing strategies. The docs don’t have many examples, so it can be hard to get a foothold on the concepts.

Claude helped me by providing initial code, answering questions, debugging when things didn’t work out, and so on. If you are interested, this is one of the discussions I had with it.

December 21, 2025 04:43 PM UTC

December 19, 2025


Luke Plant

Help my website is too small

How can it be a real website if it’s less than 7k?

December 19, 2025 01:45 PM UTC


Real Python

The Real Python Podcast – Episode #277: Moving Towards Spec-Driven Development

What are the advantages of spec-driven development compared to vibe coding with an LLM? Are these recent trends a move toward declarative programming? This week on the show, Marc Brooker, VP and Distinguished Engineer at AWS, joins us to discuss specification-driven development and Kiro.

December 19, 2025 12:00 PM UTC

December 18, 2025


Django Weblog

Hitting the Home Stretch: Help Us Reach the Django Software Foundation's Year-End Goal!

As we wrap up another strong year for the Django community, we wanted to share an update and a thank you. This year, we raised our fundraising goal from $200,000 to $300,000, and we are excited to say we are now over 88% of the way there. That puts us firmly in the home stretch, and a little more support will help us close the gap and reach 100%.

So why the higher goal this year? We expanded the Django Fellows program to include a third Fellow. In August, we welcomed Jacob Tyler Walls as our newest Django Fellow. That extra capacity gives the team more flexibility and resilience, whether someone is taking parental leave, time off around holidays, or stepping away briefly for other reasons. It also makes it easier for Fellows to attend more Django events and stay connected with the community, all while keeping the project running smoothly without putting too much pressure on any one person.

We are also preparing to raise funds for an executive director role early next year. That work is coming soon, but right now, the priority is finishing this year strong.

We want to say a sincere thank you to our existing sponsors and to everyone who has donated so far. Your support directly funds stable Django releases, security work, community programs, and the long-term health of the framework. If you or your organization have end-of-year matching funds or a giving program, this is a great moment to put them to use and help push us past the finish line.

If you would like to help us reach that final stretch, you can find all the details on our fundraising page.

Other ways to support Django:

Thank you for helping support Django and the people who make it possible. We are incredibly grateful for this community and everything you do to keep Django strong.

December 18, 2025 10:04 PM UTC


Sumana Harihareswara - Cogito, Ergo Sumana

Python Software Foundation, National Science Foundation, And Integrity


December 18, 2025 07:43 PM UTC


Django Weblog

Introducing the 2026 DSF Board

Thank You to Our Outgoing Directors

We extend our gratitude to Thibaud Colas and Sarah Abderemane, who are completing their terms on the board. Their contributions shaped the foundation in meaningful ways, and the following highlights only scratch the surface of their work.

Thibaud served as President in 2025 and Secretary in 2024. He was instrumental in governance improvements, the Django CNA initiative, election administration, and creating our first annual report. He also led our birthday campaign and helped with the creation of several new working groups this year. His thoughtful leadership helped the board navigate complex decisions.

Sarah served as Vice President in 2025 and contributed significantly to our outreach efforts, working group coordination, and membership management. She also served as a point of contact for the Django CNA initiative alongside Thibaud.

Both Thibaud and Sarah did too many things to list here. They were amazing ambassadors for the DSF, representing the board at many conferences and events. They will be deeply missed, and we are happy to have their continued membership and guidance in our many working groups.

On behalf of the board, thank you both for your commitment to Django and the DSF. The community is better for your service.

Thank You to Our 2025 Officers

Thank you to Tom Carrick and Jacob Kaplan-Moss for their service as officers in 2025.

Tom served as Secretary, keeping our meetings organized and our records in order. Jacob served as Treasurer, providing careful stewardship of the foundation's finances. Their dedication helped guide the DSF through another successful year.

Welcome to Our Newly Elected Directors

We welcome Priya Pahwa and Ryan Cheley to the board, and congratulate Jacob Kaplan-Moss on his re-election.

2026 DSF Board Officers

The board unanimously elected our officers for 2026:

I'm honored to serve as President for 2026. The DSF has important work ahead, and I'm looking forward to building on the foundation that previous boards have established.

Our monthly board meeting minutes may be found at dsf-minutes, and December's minutes are available.

If you have a great idea for the upcoming year or feel something needs our attention, please reach out to us via our Contact the DSF page. We're always open to hearing from you.

December 18, 2025 06:50 PM UTC


Ned Batchelder

A testing conundrum

Update: I found a solution which I describe in Generating data shapes with Hypothesis.

In coverage.py, I have a class for computing the fingerprint of a data structure. It’s used to avoid doing duplicate work when re-processing the same data won’t add to the outcome. It’s designed to work for nested data, and to canonicalize things like set ordering. The slightly simplified code looks like this:

import hashlib
from typing import Any

class Hasher:
    """Hashes Python data for fingerprinting."""

    def __init__(self) -> None:
        self.hash = hashlib.new("sha3_256")

    def update(self, v: Any) -> None:
        """Add `v` to the hash, recursively if needed."""
        self.hash.update(str(type(v)).encode("utf-8"))
        match v:
            case None:
                pass
            case str():
                self.hash.update(v.encode("utf-8"))
            case bytes():
                self.hash.update(v)
            case int() | float():
                self.hash.update(str(v).encode("utf-8"))
            case tuple() | list():
                for e in v:
                    self.update(e)
            case dict():
                for k, kv in sorted(v.items()):
                    self.update(k)
                    self.update(kv)
            case set():
                self.update(sorted(v))
            case _:
                raise ValueError(f"Can't hash {v = }")
        self.hash.update(b".")

    def digest(self) -> bytes:
        """Get the full binary digest of the hash."""
        return self.hash.digest()

To test this, I had some basic tests like:

def test_string_hashing():
    # Same strings hash the same.
    # Different strings hash differently.
    h1 = Hasher()
    h1.update("Hello, world!")
    h2 = Hasher()
    h2.update("Goodbye!")
    h3 = Hasher()
    h3.update("Hello, world!")
    assert h1.digest() != h2.digest()
    assert h1.digest() == h3.digest()

def test_dict_hashing():
    # The order of keys doesn't affect the hash.
    h1 = Hasher()
    h1.update({"a": 17, "b": 23})
    h2 = Hasher()
    h2.update({"b": 23, "a": 17})
    assert h1.digest() == h2.digest()

The last line in the update() method adds a dot to the running hash. That was to solve a problem covered by this test:

def test_dict_collision():
    # Nesting matters.
    h1 = Hasher()
    h1.update({"a": 17, "b": {"c": 1, "d": 2}})
    h2 = Hasher()
    h2.update({"a": 17, "b": {"c": 1}, "d": 2})
    assert h1.digest() != h2.digest()

The most recent change to Hasher was to add the set() clause. There (and in dict()), we are sorting the elements to canonicalize them. The idea is that equal values should hash equally and unequal values should not. Sets and dicts are equal regardless of their iteration order, so we sort them to get the same hash.

I added a test of the set behavior:

def test_set_hashing():
    h1 = Hasher()
    h1.update({(1, 2), (3, 4), (5, 6)})
    h2 = Hasher()
    h2.update({(5, 6), (1, 2), (3, 4)})
    assert h1.digest() == h2.digest()
    h3 = Hasher()
    h3.update({(1, 2)})
    assert h1.digest() != h3.digest()

But I wondered if there was a better way to test this class. My small one-off tests weren’t addressing the full range of possibilities. I could read the code and feel confident, but wouldn’t a more comprehensive test be better? This is a pure function: inputs map to outputs with no side-effects or other interactions. It should be very testable.

This seemed like a good candidate for property-based testing. The Hypothesis library would let me generate data, and I could check that the desired properties of the hash held true.

It took me a while to get the Hypothesis strategies wired up correctly. I ended up with this, but there might be a simpler way:

from hypothesis import strategies as st

scalar_types = [
    st.none(),
    st.booleans(),
    st.integers(),
    st.floats(allow_infinity=False, allow_nan=False),
    st.text(),
    st.binary(),
]

scalars = st.one_of(*scalar_types)

def tuples_of(strat):
    return st.lists(strat, max_size=3).map(tuple)

hashable_types = scalar_types + [tuples_of(s) for s in scalar_types]

# Homogeneous sets: all elements same type.
homogeneous_sets = (
    st.sampled_from(hashable_types)
    .flatmap(lambda s: st.sets(s, max_size=5))
)

# Full nested Python data.
python_data = st.recursive(
    scalars,
    lambda children: (
        st.lists(children, max_size=5)
        | tuples_of(children)
        | homogeneous_sets
        | st.dictionaries(st.text(), children, max_size=5)
    ),
    max_leaves=10,
)

This doesn’t make completely arbitrary nested Python data: sets are forced to have elements all of the same type or I wouldn’t be able to sort them. Dictionaries only have strings for keys. But this works to generate data similar to the real data we hash. I wrote this simple test:

from hypothesis import given

@given(python_data)
def test_one(data):
    # Hashing the same thing twice.
    h1 = Hasher()
    h1.update(data)
    h2 = Hasher()
    h2.update(data)
    assert h1.digest() == h2.digest()

This didn’t find any failures, but this is the easy test: hashing the same thing twice produces equal hashes. The trickier test is to get two different data structures, and check that their equality matches their hash equality:

@given(python_data, python_data)
def test_two(data1, data2):
    h1 = Hasher()
    h1.update(data1)
    h2 = Hasher()
    h2.update(data2)

    if data1 == data2:
        assert h1.digest() == h2.digest()
    else:
        assert h1.digest() != h2.digest()

This immediately found problems, but not in my code:

> assert h1.digest() == h2.digest()
E AssertionError: assert b'\x80\x15\xc9\x05...' == b'\x9ap\xebD...'
E
E   At index 0 diff: b'\x80' != b'\x9a'
E
E   Full diff:
E   - (b'\x9ap\xebD...)'
E   + (b'\x80\x15\xc9\x05...)'
E Falsifying example: test_two(
E     data1=(False, False, False),
E     data2=(False, False, 0),
E )

Hypothesis found that (False, False, False) is equal to (False, False, 0), but they hash differently. This is correct. The Hasher class takes the types of the values into account in the hash. False and 0 are equal, but they are different types, so they hash differently. The same problem shows up for 0 == 0.0 and 0.0 == -0.0. The theory of my test was incorrect: some values that are equal should hash differently.
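You can see this directly with the Hasher class, no Hypothesis needed:

h1, h2 = Hasher(), Hasher()
h1.update(False)
h2.update(0)
assert False == 0                   # equal values...
assert h1.digest() != h2.digest()   # ...but different types, so different hashes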

In my real code, this isn’t an issue. I won’t ever be comparing values like this to each other. If I had a schema for the data I would be comparing, I could use it to steer Hypothesis to generate realistic data. But I don’t have that schema, and I’m not sure I want to maintain that schema. This Hasher is useful as it is, and I’ve been able to reuse it in new ways without having to update a schema.

I could write a smarter equality check for use in the tests, but that would roughly approximate the code in Hasher itself. Duplicating product code in the tests is a good way to write tests that pass but don’t tell you anything useful.

I could exclude bools and floats from the test data, but those are actual values I need to handle correctly.

Hypothesis was useful in that it didn’t find any failures other than the ones I described. I can’t leave those tests in the automated test suite because I don’t want to manually examine the failures, but at least this gave me more confidence that the code is good as it is now.

Testing is a challenge unto itself. This brought it home to me again. It’s not easy to know precisely what you want code to do, and it’s not easy to capture that intent in tests. For now, I’m leaving just the simple tests. If anyone has ideas about how to test Hasher more thoroughly, I’m all ears.

December 18, 2025 10:30 AM UTC


Eli Bendersky

Plugins case study: mdBook preprocessors

mdBook is a tool for easily creating books out of Markdown files. It's very popular in the Rust ecosystem, where it's used (among other things) to publish the official Rust book.

mdBook has a simple yet effective plugin mechanism that can be used to modify the book output in arbitrary …

December 18, 2025 10:10 AM UTC


Peter Bengtsson

Autocomplete using PostgreSQL instead of Elasticsearch

Here on my blog I have a site search. Before you search, there's autocomplete. The autocomplete is solved by using downshift in React and on the backend, there's an API /api/v1/typeahead?q=bla. Up until today, that backend was powered by Elasticsearch. Now it's powered by PostgreSQL. Here's how I implemented it.

Indexing

A cron job loops over all titles in all blog posts and extracts runs of the words in the titles as singles, doubles, and triples. For each one, the blog post’s popularity is accumulated onto the extracted keywords and combos.
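Roughly, the extraction could look something like this (a sketch of the idea, not the actual indexing code):

def title_combos(title: str, max_words: int = 3) -> list[str]:
    """All runs of 1-3 consecutive words from a blog post title."""
    words = title.lower().split()
    return [
        " ".join(words[i : i + n])
        for n in range(1, max_words + 1)
        for i in range(len(words) - n + 1)
    ]

# title_combos("Frank Zappa biography") ->
# ['frank', 'zappa', 'biography', 'frank zappa',
#  'zappa biography', 'frank zappa biography']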

These are then inserted into a Django ORM model that looks like this:


from django.contrib.postgres.indexes import GinIndex
from django.db import models


class SearchTerm(models.Model):
    term = models.CharField(max_length=100, db_index=True)
    popularity = models.FloatField(default=0.0)
    add_date = models.DateTimeField(auto_now=True)
    index_version = models.IntegerField(default=0)

    class Meta:
        unique_together = ("term", "index_version")
        indexes = [
            GinIndex(
                name="plog_searchterm_term_gin_idx",
                fields=["term"],
                opclasses=["gin_trgm_ops"],
            ),
        ]

The index_version is used like this, in the indexing code:


from django.db.models import Max

current_index_version = (
    SearchTerm.objects.aggregate(Max("index_version"))["index_version__max"]
    or 0
)
index_version = current_index_version + 1

...

SearchTerm.objects.bulk_create(bulk)

SearchTerm.objects.filter(index_version__lt=index_version).delete()

That means that I don't have to delete previous entries until new ones have been created. So if something goes wrong during the indexing, it doesn't break the API.
Essentially, there are about 13k entries in that model. For a very brief moment there are 2x13k entries and then back to 13k entries when the whole task is done.

The search is done with the LIKE operator.


peterbecom=# select term from plog_searchterm where term like 'za%';
            term
-----------------------------
 zahid
 zappa
 zappa biography
 zappa biography barry
 zappa biography barry miles
 zappa blog
(6 rows)

In Python, it's as simple as:


base_qs = SearchTerm.objects.all()
qs = base_qs.filter(term__startswith=term.lower())

But suppose someone searches for bio. We want it to match things like frank zappa biography, so what it actually does is:


from django.db.models import Q 

qs = base_qs.filter(
    Q(term__startswith=term.lower()) | Q(term__contains=f" {term.lower()}")
)

Typo tolerance

This is done with the % similarity operator, which comes from the pg_trgm extension.


peterbecom=# select term from plog_searchterm where term % 'frenk';
  term
--------
 free
 frank
 freeze
 french
(4 rows)

In the Django ORM it looks like this:


base_qs = SearchTerm.objects.all()
qs = base_qs.filter(term__trigram_similar=term.lower())

And if that doesn't work, it gets even more desperate. It does this using the similarity() function. Looks like this in SQL:


peterbecom=# select term from plog_searchterm where similarity(term, 'zuppa') > 0.14;
       term
-------------------
 frank zappa
 zappa
 zappa biography
 radio frank zappa
 frank zappa blog
 zappa blog
 zurich
(7 rows)

Note on typo tolerance

Most of the time, the most basic query works and yields results, i.e. the .filter(term__startswith=term.lower()) query.
The typo-tolerant queries only run if the basic one yields fewer results than the pagination size. That’s why the fault-tolerance query is only-if-needed. This means it might send 2 SQL select queries from Python to PostgreSQL. In Elasticsearch, you usually don’t do this. You send multiple queries and boost them differently.

It can be done with PostgreSQL too, using a UNION so that you send one, more complex, query.
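In the Django ORM, that could look something like this (a sketch of the idea, not the site’s actual code; page_size is an assumption):

# Combine the exact-prefix and trigram-fuzzy queries into a single
# UNION query, ranked by popularity. Illustrative only: deduplication
# and per-clause boosting are glossed over here.
exact = SearchTerm.objects.filter(term__startswith=term.lower())
fuzzy = SearchTerm.objects.filter(term__trigram_similar=term.lower())
results = exact.union(fuzzy).order_by("-popularity")[:page_size]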

Speed

It's hard to measure the true performance of these things because they're so fast that it's more about the network speed.

On my fast MacBook Pro M4, I ran about 50 realistic queries and measured the time each took with this new PostgreSQL-based solution versus the previous Elasticsearch solution. They both take about 4ms per query. I suspect that 90% of that 4ms is serialization & transmission, and not much time inside the database itself.

The number of rows it searches is only, at the time of writing, 13,000+, so it’s hard to get a feel for how much faster Elasticsearch would be than PostgreSQL. But with a GIN index in PostgreSQL, it would have to scale much, much larger to feel too slow.

About Elasticsearch

Elasticsearch is better than PostgreSQL at full-text search, including n-grams. Elasticsearch is highly optimized for these kinds of things and has powerful ways to make a query’s score a product of how well it matched and each entry’s popularity. With PostgreSQL, that gets difficult.

But PostgreSQL is simple. It's solid and it doesn't take up nearly as much memory as Elasticsearch.

December 18, 2025 09:46 AM UTC