Planet Python
Last update: May 26, 2026 04:43 PM UTC
May 26, 2026
Real Python
Connecting LLMs to Your Data With Python MCP Servers
The Model Context Protocol (MCP) is a new open protocol that allows AI models to interact with external systems in a standardized, extensible way. In this video course, you’ll install MCP, explore its client-server architecture, and work with its core concepts: prompts, resources, and tools. You’ll then build and test a Python MCP server that queries e-commerce data and integrate it with an AI agent in Cursor to see real tool calls in action.
By the end of this video course, you’ll understand:
- What MCP is and why it was created
- What MCP prompts, resources, and tools are
- How to build an MCP server with customized tools
- How to integrate your MCP server with AI agents like Cursor
You’ll get hands-on experience with Python MCP by creating and testing MCP servers and connecting your MCP server to AI tools. To keep the focus on learning MCP rather than building a complex project, you’ll build a simple MCP server that interacts with a simulated e-commerce database. You’ll also use Cursor’s MCP client, which saves you from having to implement your own.
[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]
Quiz: Data Collection & Storage
In this quiz, you’ll revisit the core concepts covered in the Data Collection & Storage learning path:
Learning Path
Data Collection & Storage
9 Resources ⋅ Skills: CSV, JSON, pandas, Excel, SQL, SQLite, SQLAlchemy, AWS S3, Databases
You’ll work through questions on reading and writing CSV files, serializing and parsing JSON data, and interacting with SQL databases from Python. Together, these topics give you the foundation you need to ingest, persist, and query data in your own projects.
Take your time and revisit any topics that feel rusty before moving on.
Quiz: Python Web Scraping
In this quiz, you’ll revisit the core concepts covered in the Python Web Scraping learning path:
You’ll be quizzed on making HTTP requests, parsing HTML with Beautiful Soup, extracting data with Scrapy, working with JSON and CSV response data, and automating browsers with Selenium.
Take your time and revisit any topics that feel rusty before moving on.
Quiz: Data Science With Python Core Skills
In this quiz, you’ll revisit the core concepts covered in the Data Science With Python Core Skills learning path:
Learning Path
Data Science With Python Core Skills
21 Resources ⋅ Skills: Pandas, NumPy, Data Cleaning, Data Visualization, Statistics
You’ll cover reading and writing CSV files, working with JSON data, manipulating pandas DataFrames, and applying NumPy techniques for numerical computing.
Take your time and revisit any topics that feel rusty before moving on.
Quiz: Python Control Flow and Loops
In this quiz, you’ll revisit the core concepts covered in the Python Control Flow and Loops learning path:
Learning Path
Python Control Flow and Loops
15 Resources ⋅ Skills: Python, Control Flow, Conditional Statements, Booleans, for Loops, while Loops, enumerate, Nested Loops, break, continue, pass
The questions span conditional statements, the or Boolean operator, for and while loops, enumerate(), nested loops, and the break and continue keywords, giving you a way to check that you understood the most important ideas.
Take your time and revisit any topics that feel rusty before moving on.
Quiz: DevOps With Python
In this quiz, you’ll revisit the core concepts covered in the DevOps With Python learning path:
Learning Path
DevOps With Python
10 Resources ⋅ Skills: Packaging & Deployment, CI/CD, AWS, Docker, Logging
The 16 questions span running Python scripts, managing dependencies with pip, automating workflows with GitHub Actions, and configuring Python’s logging module.
Take your time and revisit any topics that feel rusty before moving on.
Quiz: Connecting LLMs to Your Data With Python MCP Servers
In this video course quiz, you’ll test your understanding of Connecting LLMs to Your Data With Python MCP Servers.
By working through this quiz, you’ll revisit core MCP concepts like the client-server architecture, tools that LLMs can call, resources that expose static data, and prompts that act as reusable templates.
Quiz: Testing and Continuous Integration
In this quiz, you’ll revisit the core concepts covered in the Testing and Continuous Integration learning path:
Learning Path
Testing and Continuous Integration
10 Resources ⋅ Skills: Unit Testing, Doctest, Mock Object Library, Pytest, Continuous Integration, Docker, Code Quality, GitHub Actions, Software Testing, CI/CD
The 20 questions span testing fundamentals, the unittest framework, mock objects, pytest, code quality tools, and continuous integration with GitHub Actions. They give you a way to check that you understood the most important ideas.
Take your time and revisit any topics that feel rusty before moving on.
Quiz: I/O Operations and String Formatting
In this quiz, you’ll revisit the core concepts covered in the I/O Operations and String Formatting learning path:
Learning Path
I/O Operations and String Formatting
10 Resources ⋅ Skills: Python, Fundamentals, I/O, String Formatting, f-strings, print()
The 20 questions span reading keyboard input, controlling print(), stripping characters from strings, the format mini-language, and f-strings, giving you a way to check that you understood the most important ideas.
Take your time and revisit any topics that feel rusty before moving on.
Quiz: Functions and Scopes
In this quiz, you’ll revisit the core concepts covered in the Functions and Scopes learning path:
Learning Path
Functions and Scopes
12 Resources ⋅ Skills: Python, Functions, Scope, Arguments, Parameters, Return, Globals
The 20 questions span defining functions, positional and keyword arguments, default values, *args and **kwargs, return statements, inner functions, the LEGB rule, namespaces, and the global and nonlocal keywords.
Take your time and revisit any topics that feel rusty before moving on to the next learning path.
Quiz: Files and File Streams
In this quiz, you’ll revisit the core concepts covered in the Files and File Streams learning path:
Learning Path
Files and File Streams
13 Resources ⋅ Skills: Python, Pathlib, File I/O, Serialization, Encoding, Unicode, PDF, WAV, Context Managers, ZIP Files
You’ll check your understanding of opening and reading files, navigating the file system with pathlib, managing resources with context managers and the with statement, and reading or writing WAV audio files.
Take your time and revisit any topics that feel rusty before moving on.
Graham Dumpleton
WSGISwitchInterval in mod_wsgi 6.0.0
The first two posts in this series covered new directives in mod_wsgi 6.0.0 that change the concurrency model the interpreter runs under. WSGIPerInterpreterGIL opts a sub-interpreter into its own GIL. WSGIFreeThreading opts a process into PEP 703 free-threaded mode. This third directive, WSGISwitchInterval, is a different sort of thing. It does not change the concurrency model. It exposes a Python tuning knob that has existed since Python 3.2 and that almost nobody touches, but that I have come to think is worth touching for a meaningful class of WSGI workloads.
The post is partly about what the directive does. Mostly though it is about a measurement story, and about why having telemetry to drive tuning decisions matters more than the directive itself.
What the switch interval is
The Python GIL is the lock that serialises bytecode execution across threads in a CPython process. Only one thread at a time holds it. For other threads to make progress on Python code, the holder has to release the lock. Some releases are voluntary, for instance during I/O calls that drop the GIL while they wait. Voluntary releases are not enough on their own to schedule cleanly between several CPU-busy threads though, so the interpreter also has a scheduler that nudges the holder to give the lock up periodically. That scheduler is what the switch interval controls.
In CPython 2 the scheduler was bytecode-count based. After every N bytecodes the interpreter would check for pending signals, drop the lock, and reacquire it. The setting was sys.setcheckinterval(N), default 100 ticks. The problem with bytecode counting was that bytecodes are not equal-cost. Some operations completed in a fraction of a microsecond. Others, like calling out into a slow built-in, took milliseconds. So the actual wall-clock interval between handoffs varied widely depending on what code was running.
Python 3.2 replaced this with a time-based scheduler. Antoine Pitrou's new GIL implementation moved the handoff trigger from "after N bytecodes" to "after T seconds since the last release", controlled by sys.setswitchinterval() with a default of 5 milliseconds. That default was a reasonable compromise on the hardware that existed in 2010. It has not changed since. Fifteen years on, on hardware that runs Python several times faster per cycle, the same 5 ms can be a much larger amount of Python work than it used to be. That is the rationale for considering whether the default is still the right value for your workload.
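As a minimal sketch of the knob itself, the snippet below reads the current switch interval, applies a new one, and restores the original. The helper name with_switch_interval is hypothetical and only exists for illustration; sys.getswitchinterval() and sys.setswitchinterval() are the standard library calls.

```python
import math
import sys


def with_switch_interval(seconds):
    """Set the switch interval, report what the interpreter stored, then restore it."""
    previous = sys.getswitchinterval()
    sys.setswitchinterval(seconds)
    try:
        return previous, sys.getswitchinterval()
    finally:
        # Put the process back the way we found it.
        sys.setswitchinterval(previous)


previous, applied = with_switch_interval(0.002)
print(f"previous={previous:.4f}s applied={applied:.4f}s")
```

The value is process-wide for a given GIL, which is why a server-level directive is a more natural home for it than application code.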
What WSGISwitchInterval does
The directive calls sys.setswitchinterval() after interpreter initialisation, so the setting takes effect for the rest of that interpreter's life. The simplest form is at server scope.
WSGISwitchInterval 0.002
This applies to the embedded mode interpreter in Apache child processes. For daemon mode the equivalent is the switch-interval= option on WSGIDaemonProcess.
WSGIDaemonProcess my-app processes=2 threads=5 switch-interval=0.002
The directive can also appear inside the <WSGIInterpreterOptions> container introduced in the per-interpreter GIL post. If the matched sub-interpreter has its own GIL via WSGIPerInterpreterGIL, you can tune that one sub-interpreter's switch interval separately from the others in the same process.
<WSGIInterpreterOptions process-group="my-app" application-group="cpu-heavy">
WSGIPerInterpreterGIL On
WSGISwitchInterval 0.001
</WSGIInterpreterOptions>
If the matched sub-interpreter does not have its own GIL, the switch interval cannot be scoped to that sub-interpreter, because the GIL is shared across the process and tuning it for one sub-interpreter would silently affect all of them. mod_wsgi rejects that configuration with a warning rather than silently scoping wider than the operator asked for.
Under free-threading the directive is a no-op. There is no GIL to schedule.
The default is to leave Python's own default alone. You opt in to tune.
You cannot tune what you cannot measure
The case for adjusting the switch interval rests on being able to see what happens when you change it. Python itself does not expose any direct measure of GIL contention. There is no counter you can read to ask "how much time was spent waiting for the GIL". The interpreter knows in some sense, but it does not surface the information.
mod_wsgi exposes a partial measure, surfaced as gil_wait_time. It is the time the worker thread was held up acquiring the GIL at points where mod_wsgi is doing work on the application's behalf: request dispatch, request body reads, response writes, logging. It does not see contention while the application's own Python code is running, and it cannot see contention inside C extensions that release and reacquire the GIL on their own schedule. So the value is a lower bound, not an absolute measure of contention.
That is enough to drive tuning decisions, though. The metric moves directionally with actual contention. Together with throughput and response time, which come from the same telemetry stream, it gives you three numbers that are enough to tell you whether a switch interval change helped or hurt.
The rest of the post is a worked example that uses exactly those three signals.
A benchmark to make the case
The workload is a synthetic WSGI handler. Each request spends approximately 3 ms running Python code on the CPU, plus a 1 ms simulated wait standing in for a small bit of I/O, and returns a 1 KB response body. The load generator drives concurrency 10, more than enough to saturate the available workers in every configuration shown below. The workload is deliberately idealised, with no real I/O and no C extension calls, because the point is to surface the effect of GIL scheduling on pure-Python compute as clearly as possible.
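The post does not show the handler's source, so the following is only a sketch of what a handler with that shape might look like. The ITERATIONS constant is a placeholder assumption that would need calibrating on each host until the loop costs roughly 3 ms, and the names burn_cpu and BODY are hypothetical.

```python
import time

# Hypothetical tuning constants. ITERATIONS is a placeholder: calibrate it on
# the target host so the loop takes roughly 3 ms of CPU time.
ITERATIONS = 30_000
WAIT_SECONDS = 0.001
BODY = b"x" * 1024  # 1 KB response body


def burn_cpu(iterations):
    """Run a fixed amount of pure-Python work; the GIL is held throughout."""
    total = 0
    for i in range(iterations):
        total += i * i
    return total


def application(environ, start_response):
    """WSGI entry point: CPU work, a short simulated wait, then a fixed body."""
    burn_cpu(ITERATIONS)
    time.sleep(WAIT_SECONDS)  # releases the GIL, standing in for a bit of I/O
    headers = [
        ("Content-Type", "application/octet-stream"),
        ("Content-Length", str(len(BODY))),
    ]
    start_response("200 OK", headers)
    return [BODY]
```

A fixed iteration count is used here so that every request does the same amount of Python work regardless of how long it waits for the GIL. Whether the original benchmark handler works this way is not stated.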
All four configurations below run on the same host, same Apache, same Python, same WSGI handler. Only the process and thread counts and the switch interval change. Each step includes a small table of the key metrics so the numbers are legible even if the dashboard screenshots are too small to read, and the table grows as we go so each configuration can be compared with the ones before.
Baseline: ten processes, one thread each
This is the no-contention reference point. Each daemon process has a single worker thread, so no two threads compete for the same GIL. Whatever GIL pressure shows up here is whatever overhead the lock adds on the dispatch and I/O paths in mod_wsgi itself, with no waiting.
WSGIDaemonProcess my-app processes=10 threads=1
The result is 134k requests per minute, 4 ms mean response time, gil_wait_time effectively zero. The GIL wait time distribution is a single bar in the head bucket, which is what no-contention looks like.
| Config | rpm | response | app | GIL p95 |
|---|---|---|---|---|
| 10 × 1, 5 ms (baseline) | 134k | 4 ms | 4 ms | none |

This is the upper bound for what the workload can do on the available cores when nothing contends with anything. Roughly 13.4k rpm per process.
Add threads: GIL contention takes over
Keep the total worker pool roughly comparable, but reshape it: two processes with five threads each. Same default 5 ms switch interval.
WSGIDaemonProcess my-app processes=2 threads=5
Throughput collapses to 37k requests per minute, about 28% of the baseline. Mean response time goes from 4 ms to 16 ms. Application time mean is now 11 ms, up from 4 ms in the baseline. Each process is now CPU/GIL-bound: five threads competing for one GIL inside the process, with cores sitting underutilised because only one thread can run Python at a time.
| Config | rpm | response | app | GIL p95 |
|---|---|---|---|---|
| 10 × 1, 5 ms (baseline) | 134k | 4 ms | 4 ms | none |
| 2 × 5, 5 ms | 37k | 16 ms | 11 ms | 13 ms |

The shape of the contention is most visible in the GIL wait time distribution chart.

The chart tells a clear story. There is a head bucket holding the requests that got their handoff immediately, then a series of bumps further out at multiples of the 5 ms switch interval. Each bump corresponds to a request that had to wait one or more switch intervals to acquire the GIL: one missed cycle, two missed cycles, three missed cycles, and so on. The bumps shrink as you move right, which is the shape of a contention pattern where missed cycles do not pile up too heavily. But the tail is fat. The percentile numbers along the top of the chart confirm this: p95 is 13 ms and p99 is 18 ms, so a significant fraction of requests is waiting several full switch intervals to make progress on Python code.
This is the textbook case for the CPU/GIL-bound label. With five threads competing for one GIL on each process, the GIL is the wall. The standard remediation is to add processes. The point of this post is that there is a second lever: shorten the switch interval so that each wait for the GIL is shorter, rather than reducing how many threads contend for it.
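The waiting described above can be reproduced outside mod_wsgi with a small experiment. This is a sketch under stated assumptions, not how gil_wait_time is measured: one thread spins on pure-Python code while the calling thread naps briefly and then has to reacquire the GIL. The function name mean_wake_delay and its parameters are hypothetical.

```python
import sys
import threading
import time


def mean_wake_delay(switch_interval, samples=20, nap=0.0005):
    """Average extra time a thread needs to resume after a short sleep
    while another thread keeps the GIL busy."""
    previous = sys.getswitchinterval()
    sys.setswitchinterval(switch_interval)
    stop = threading.Event()

    def spin():
        # Pure-Python busy loop: only gives up the GIL when asked to.
        while not stop.is_set():
            pass

    spinner = threading.Thread(target=spin, daemon=True)
    spinner.start()
    try:
        total = 0.0
        for _ in range(samples):
            start = time.perf_counter()
            time.sleep(nap)  # drops the GIL; waking up means queueing for it
            total += max(0.0, time.perf_counter() - start - nap)
        return total / samples
    finally:
        stop.set()
        spinner.join()
        sys.setswitchinterval(previous)


for interval in (0.005, 0.002, 0.0001):
    delay_ms = mean_wake_delay(interval) * 1000
    print(f"switch interval {interval * 1000:.1f} ms: mean wake delay {delay_ms:.2f} ms")
```

On a GIL build the delay usually shrinks as the interval does, though the exact figures depend on the host and on OS timer behaviour. On a free-threaded build there is no GIL to queue for, so the interval should make little difference.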
Tighten the switch interval to 2 ms
Same process and thread shape, but cut the switch interval from 5 ms to 2 ms.
WSGIDaemonProcess my-app processes=2 threads=5 switch-interval=0.002
Throughput moves from 37k to 42k requests per minute, about 13% better. Mean response time drops from 16 ms to 14 ms. The GIL wait time distribution chart is where the more interesting change shows up.
| Config | rpm | response | app | GIL p95 |
|---|---|---|---|---|
| 10 × 1, 5 ms (baseline) | 134k | 4 ms | 4 ms | none |
| 2 × 5, 5 ms | 37k | 16 ms | 11 ms | 13 ms |
| 2 × 5, 2 ms | 42k | 14 ms | 13 ms | 6 ms |

The chart is dramatically more head-heavy than at 5 ms. The head bucket now holds the bulk of the requests, where at 5 ms it was only about a fifth of them. Most requests are getting the GIL on their first try at the new interval. The smaller bumps further out are still there, but they sit closer to the head than their counterparts at 5 ms did, because each cycle is now 2 ms wide instead of 5 ms wide. The percentile numbers in the chart header confirm what the shape is showing: p50 has dropped from 5 ms to under 1 ms, p95 from 13 ms to 6 ms, p99 from 18 ms to 10 ms. Contention is both less frequent and cheaper when it does happen, and the throughput gain on the dashboard follows from that.
A reasonable stopping point for tuning the GIL switch interval on a mixed workload is around 2 ms. The reasoning is that more frequent GIL handoffs mean more context switching, and at very short intervals that overhead can start to dominate. So if you do not have telemetry that lets you see the effect on your specific workload, 2 ms is a sensible place to stop. Going lower than that is something to do only when you can measure the result and confirm that the gain is real. The benchmark workload here is not a mixed workload, and the rest of this post is the measurement story that earns the right to go further.
Tighten further to 0.1 ms
Same shape again, but switch interval down to 0.1 ms.
WSGIDaemonProcess my-app processes=2 threads=5 switch-interval=0.0001
Throughput jumps to 121k requests per minute. That is within roughly 10% of the no-contention baseline of 134k. Mean response time is back to 5 ms. Application time mean is back down to around 4.7 ms, close to its baseline value of 4.3 ms.
| Config | rpm | response | app | GIL p95 |
|---|---|---|---|---|
| 10 × 1, 5 ms (baseline) | 134k | 4 ms | 4 ms | none |
| 2 × 5, 5 ms | 37k | 16 ms | 11 ms | 13 ms |
| 2 × 5, 2 ms | 42k | 14 ms | 13 ms | 6 ms |
| 2 × 5, 0.1 ms | 121k | 5 ms | 5 ms | 1 ms |

The GIL wait time distribution collapses back to essentially the head bucket.

The bumps are gone. p95 is 1.2 ms and p99 is 1.2 ms, which is essentially "everything fits in the first bucket of the histogram". What is going on at this setting is that the switch interval is now much shorter than the per-request CPU cost. Each handoff happens many times during a single request's CPU work, so threads are interleaving at fine granularity rather than passing the GIL around in big chunks. There is no missed-cycle structure left for waiters to pile up on. Handoffs are continuous rather than periodic.
The workload is still CPU/GIL-bound in shape. Threads still spend most of their wall time holding a request without consuming CPU on it directly, because at any given instant only one thread per process can run Python. That structural fact has not changed. But the measured throughput cost of that shape has nearly vanished. The new switch interval has just made the cost of being that workload small enough not to hurt.
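The throughput and response time columns in the tables above can be cross-checked against each other. The sketch below assumes the load generator is closed-loop and keeps exactly 10 requests in flight, which the post implies but does not state. Under that assumption Little's law gives throughput as concurrency divided by mean response time. The response times are the rounded values from the tables, so only rough agreement should be expected.

```python
CONCURRENCY = 10  # load generator concurrency stated in the post


def predicted_rpm(mean_response_ms, concurrency=CONCURRENCY):
    """Little's law for a closed loop: throughput = concurrency / response time."""
    return concurrency / (mean_response_ms / 1000.0) * 60.0


# (mean response in ms, reported requests per minute), copied from the tables.
reported = {
    "10 x 1, 5 ms": (4, 134_000),
    "2 x 5, 5 ms": (16, 37_000),
    "2 x 5, 2 ms": (14, 42_000),
    "2 x 5, 0.1 ms": (5, 121_000),
}

for label, (response_ms, rpm) in reported.items():
    estimate = predicted_rpm(response_ms)
    print(f"{label}: predicted {estimate:,.0f} rpm, reported {rpm:,} rpm")
```

The predictions come out at 150k, 37.5k, about 42.9k, and 120k rpm, which is in line with the reported figures once the rounding of the response times is taken into account.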
What this means
The default 5 ms switch interval is conservative for a pure-Python CPU-bound workload. For workloads of that shape the knob is real, and the gain can be substantial. Three observations follow from that, all of them important.
Most WSGI applications do not look like this benchmark. The typical web request spends most of its time in I/O, in database calls, in C extensions like JSON parsers or template engines, in HTTP client libraries. All of those release the GIL during their slow phase. For those workloads the default is probably fine, and tuning the switch interval will not move much.
Stopping at around 2 ms is a sensible default for a mixed workload. It is not the answer for every workload, though. If you have endpoints that are heavy on Python compute, data processing endpoints, ML preprocessing, anything that does meaningful work in pure Python before returning, those endpoints may be in the same regime as this benchmark, and the same lever can apply. The further down you go past 2 ms the more important it is to have the telemetry that confirms you are actually winning rather than guessing.
The way you find out is by measuring. Throughput, response time, and gil_wait_time on the same telemetry stream, with the switch interval as the only variable, is enough to tell you whether tuning helps for your workload.
Caveats
More frequent GIL handoffs mean more context switching. There is a cost to that, and at some interval that cost begins to dominate the gain. The benchmark workload here does not show that cost emerging at 0.1 ms, but that is partly because the workload is idealised. With real concurrency patterns and real I/O it would emerge sooner.
Tuning the switch interval down does not fix GIL contention inside C extensions that manage their own GIL acquire and release. If your contention lives inside NumPy or a database driver, this knob does not reach it.
The right framing is that this is a tuning lever for a specific class of workload, not a default to flip across the board. Use it where the measurements say it helps. Leave it alone where they say it does not.
What's next
If you run mod_wsgi and the case above is interesting for your workload, please install the 6.0.0 release candidate, try WSGISwitchInterval against your real traffic, and file issues against the GitHub project for anything that does not behave the way the documentation suggests it should.
This post has leaned heavily on telemetry from mod_wsgi-telemetry, the companion tool that records and visualises the metrics shown in the screenshots above. That tool is going to be the subject of a follow-up series. Before we get to that though, the next post will revisit the free-threading configuration from earlier in this series and look at how performance under it manifests through the same request metrics used here. The argument for tuning at all rests on having that visibility, and the screenshots here are what the tool surfaces out of the box.
For reference:
- mod_wsgi documentation
- mod_wsgi 6.0.0 release notes
- Per-interpreter GIL and free-threading user guide
- WSGISwitchInterval directive documentation
- Previous post: Per-interpreter GIL in mod_wsgi 6.0.0
- Previous post: Free-threading in mod_wsgi 6.0.0
Free-threading vs the GIL in mod_wsgi 6.0.0
The previous post in this series walked through tuning WSGISwitchInterval to claw back throughput on a multi-threaded mod_wsgi daemon group whose workload is CPU-bound Python. Tightening the switch interval recovered most of the throughput a two-process, five-thread shape had lost compared with the ten-process, one-thread baseline. What it did not change was per-process CPU usage. Each process stayed pinned at about one core regardless of how the switch interval was tuned, because that ceiling is the GIL itself.
This post is about what happens when that ceiling is removed. PEP 703 free-threading provides a CPython build with no GIL at all, and mod_wsgi 6.0.0 exposes the opt-in for it through WSGIFreeThreading, the second directive covered in this series. I have rerun the same benchmark workload on a free-threaded Python with that directive on. The interesting metric is now CPU usage per process.
Why CPU usage is the new focus
Throughput and response time were the headline metrics for tracking the effect of switch interval changes. Both are still relevant here. But the comparison turns on something different now. With the GIL, threads in a process serialise on the lock, and the process consumes at most one core regardless of how many threads you give it. With free-threading there is no GIL, and the process can use as many cores as its threads can actually fill. If the workload is CPU-bound Python, CPU usage per process is what tells you whether the runtime is making real use of the hardware.
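One way to think about "cores per process" is CPU time consumed divided by wall-clock time elapsed. The sketch below measures that from inside a process using the standard library. It is an illustration of the ratio only, with the hypothetical helper names cores_used and busy, and it is not how mod_wsgi-telemetry collects its figures.

```python
import time


def cores_used(work):
    """Return CPU seconds consumed per wall-clock second while work() runs.

    time.process_time() sums user and system CPU time across all threads
    in the process, so a value near 1.0 means about one core was kept busy.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    work()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu / wall if wall > 0 else 0.0


def busy():
    """A short single-threaded pure-Python workload."""
    total = 0
    for i in range(200_000):
        total += i
    return total


print(f"{cores_used(busy):.2f} cores")
```

If work() started several CPU-bound threads, a GIL build would still be expected to report close to 1.0, while a free-threaded build could report more.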
What disappears from the toolkit
GIL wait time was the central diagnostic in the switch-interval post. Under free-threading there is no GIL, so there is no GIL wait to measure. The histogram that showed the convoy bumps at multiples of the switch interval in the previous post goes flat by definition. What replaces it as positive evidence that the workload is genuinely parallel is the CPU usage number itself. Multiple cores per process is the new signal.
A reminder of what free-threading asks of you
The free-threading post earlier in this series went into detail on what free-threading actually requires. Briefly: it is a separate CPython build (typically named python3.14t on systems that distribute it), C extensions must set the Py_mod_gil slot to Py_MOD_GIL_NOT_USED or the runtime quietly re-enables the GIL for the whole process, and application code must handle concurrent execution correctly rather than being incidentally safe under GIL atomicity. None of that has changed since I wrote that post. The metrics below assume those prerequisites are satisfied and show what the upside looks like when they are.
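Because an extension can switch the GIL back on at import time, it can be worth checking the state of the running interpreter rather than trusting the build name. The sketch below uses sysconfig's Py_GIL_DISABLED config variable and sys._is_gil_enabled(), which is available from Python 3.13. The helper name free_threading_status is hypothetical, and on older interpreters the function simply assumes the GIL is active.

```python
import sys
import sysconfig


def free_threading_status():
    """Report whether this build supports free-threading and whether the
    GIL is active right now."""
    built_free_threaded = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() only exists on Python 3.13 and later.
    is_gil_enabled = getattr(sys, "_is_gil_enabled", None)
    gil_active = True if is_gil_enabled is None else bool(is_gil_enabled())
    return {"built_free_threaded": built_free_threaded, "gil_active": gil_active}


print(free_threading_status())
```

Running the check after the application's imports have completed is the more useful moment, since that is when any extension that lacks free-threading support would already have been loaded.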
The benchmark setup
The workload is the same as in the switch-interval post. Each request spends approximately 3 ms running Python code on the CPU, plus a 1 ms simulated wait, and returns a 1 KB response body. Concurrency 10, same host, same Apache, same WSGI handler. The only changes are that Python is the free-threaded build, mod_wsgi has WSGIFreeThreading On configured on the daemon group, and two configurations are exercised: two processes with five threads each (matching the comparison from the previous post), and one process with ten threads (a configuration that has no point under the GIL but lights up under free-threading).
Comparison: two processes, five threads each
WSGIDaemonProcess my-app processes=2 threads=5
WSGIFreeThreading On
| Config | rpm | response | CPU/proc | CPU total | GIL p95 |
|---|---|---|---|---|---|
| 10 × 1 GIL (baseline) | 134k | 4 ms | 0.66 cores | 6.6 cores | none |
| 2 × 5 GIL, default 5 ms | 37k | 16 ms | 0.95 cores | 1.9 cores | 13 ms |
| 2 × 5 GIL, 0.1 ms tuned | 121k | 5 ms | 0.90 cores | 1.8 cores | 1 ms |
| 2 × 5 free-threading | 131k | 4 ms | 3.3 cores | 6.6 cores | n/a |

Throughput jumps to 131k rpm, almost matching the ten-process baseline of 134k, and about 8% above the 121k that 0.1 ms switch interval tuning achieved. Response time is back to 4 ms.
The row that has changed in character is CPU per process. Under the GIL each process was pinned at about one core no matter what we did with the switch interval. Free-threading lifts that ceiling, and each process is now consuming about 3.3 cores. Total CPU is 6.6 cores, the same as the ten-process baseline, but with one fifth the processes.
The GIL p95 column has no value to report any more. The histogram that showed contention bumps for missed switch-interval cycles is now flat. There is no GIL to schedule and no wait to measure.
Comparison: one process, ten threads
WSGIDaemonProcess my-app processes=1 threads=10
WSGIFreeThreading On
Under the GIL this configuration would not really make sense. The threads would all queue for the one GIL, the process would cap at about one core, and throughput would likely be cut by half or more compared with the ten-process baseline. The exact figure depends on the workload and how the switch interval is set, but the shape is clear: one process with ten threads on a CPU-bound workload is not a configuration the GIL rewards.
Under free-threading the picture is dramatically different.
| Config | rpm | response | CPU/proc | CPU total | GIL p95 |
|---|---|---|---|---|---|
| 10 × 1 GIL (baseline) | 134k | 4 ms | 0.66 cores | 6.6 cores | none |
| 2 × 5 GIL, default 5 ms | 37k | 16 ms | 0.95 cores | 1.9 cores | 13 ms |
| 2 × 5 GIL, 0.1 ms tuned | 121k | 5 ms | 0.90 cores | 1.8 cores | 1 ms |
| 2 × 5 free-threading | 131k | 4 ms | 3.3 cores | 6.6 cores | n/a |
| 1 × 10 free-threading | 134k | 4 ms | 6.65 cores | 6.65 cores | n/a |

Throughput matches the ten-process baseline at 134k rpm. Response time is 4 ms. The single process is consuming about 6.65 cores. That is the headline finding of the comparison. The ten processes of the baseline have collapsed into one process that genuinely uses about 6.65 of the available cores.
A note on the ceiling
Both the ten-process baseline and the one-process free-threading run land at the same total CPU usage of around 6.6 cores. That is not an artefact of the configurations meeting in the middle. It is the ceiling of the machine the tests are running on. The load generator is also running on the same host and is consuming some of the available CPU envelope itself. So the 134k rpm number is the ceiling of this machine under this workload, not a fundamental ceiling of either configuration. On a more capable host, or with the load generator run from a separate machine, both configurations could likely scale further. The point being made in the comparison is the shape of CPU usage across configurations, not the absolute throughput number.
What this means in practice
Free-threading is another lever in the mod_wsgi concurrency toolkit. The free-threading post earlier in this series introduced it. This post shows what it does when applied to a workload that fits.
A few operational implications follow from the numbers above.
Memory. Fewer processes means less duplicated interpreter state, fewer copies of the application code in memory, fewer per-process caches. The ten-process baseline reported around 200 MB total RSS. The one-process free-threaded run reported around 31 MB. That is a real saving for memory-constrained deployments, and it holds largely regardless of whether the workload fully utilises the hardware.
Topology. One daemon group with a thread pool is simpler to operate than ten separate processes. Fewer file descriptors, fewer accept queues, one unit to restart and reload, easier capacity reasoning.
Trade-off. Process-level isolation is less granular. A crash in a thread on a single-process pool takes the whole pool with it, where on a multi-process pool it would only take one worker. For many workloads that is a fair trade, especially if the application itself does not crash frequently in production. For others, keeping at least a handful of processes around still makes sense. Free-threading composes happily with that, and the 2 × 5 configuration above is exactly that intermediate point.
Caveats
The constraints from the free-threading post all still apply. Free-threaded CPython is a separate build and not the one most distributions ship as default. C extensions need to declare free-threading support or the GIL silently comes back on for the whole process, undoing the benefit. Application code needs to be genuinely thread-safe rather than incidentally OK because the GIL was doing the work. The free-threaded build also carries a small single-threaded overhead.
The case for adopting free-threading still rests on those prerequisites being met for your specific application. The metrics here just show what the lever does when they are.
What's next
If you run mod_wsgi and the case made above is interesting for your application, please install the 6.0.0 release candidate against a free-threaded Python build, try WSGIFreeThreading against your real workload, and file issues against the GitHub project for anything that does not behave as the documentation says it should.
This concludes the directive tour of the new concurrency-related additions in mod_wsgi 6.0.0. I will look more closely at mod_wsgi-telemetry itself, the tool that has been quietly doing the work in every screenshot and table in this series, in some future posts.
For reference:
- mod_wsgi documentation
- mod_wsgi 6.0.0 release notes
- Per-interpreter GIL and free-threading user guide
- WSGIFreeThreading directive documentation
- Previous post: Free-threading in mod_wsgi 6.0.0
- Previous post: WSGISwitchInterval in mod_wsgi 6.0.0
Bob Belderbos
From Python Script to Production: A Django Coaching Case Study
Six weeks of 1:1 coaching. The output: a Django app in production on Fly.io, covering movies, anime, and manga, with user accounts, a save library, Docker, and CI/CD on every push. Daniele started with Python skills and a project idea. Here's what the work actually looked like.
The starting point
The idea was a platform to discover and track movies, anime, and manga. He had enough Python to start, already building a CLI tool with a swappable data layer in our first app together. What he didn't have was experience building a web app: the real mechanics of Django, how the pieces connect, what "ready to ship" means in practice.
Self-study can get you to a prototype. It won't tell you when code that works is teaching you the wrong habits. That's what weekly PR reviews are for.
Starting with discipline, not speed
The first PR was a Python script. It called the TMDB API, parsed the results, and displayed them. Functional and already a place to build habits.
For example: the type hint on _fetch_tmdb_data said -> dict, but the function could return None. Fix the type hint. The constants TMDB_URL and headers weren't uppercased consistently. Follow PEP8 conventions. The API key loaded from an .env file, but there was no .env-template telling other developers which variables to set. Add the template.
None of these changes affect whether the script runs. All of them affect whether another developer, or Daniele himself in six months, can reason about it. That's where professional developer habits form.
Django's machinery is yours to understand
Moving from a script to Django means the framework does a lot for you. The risk is accepting what it does for you without understanding its deeper workings.
In week 2, Daniele ran uv run ty check . and got two errors: Class 'Movie' has no attribute 'objects'. Django adds the objects manager dynamically at runtime; ty is a static type checker and can't see it. He asked the right question:
"So this was because of ty/type checker flagging an error? If we have a way to instrument ty globally to recognize Django's dynamic managers, that would be better. Do we need to pull in 'django-stubs' or a similar configuration?"
The answer was: Django's dynamic ORM creates real friction with static type checkers. The pragmatic fix was to explicitly declare objects: models.Manager on the model, making the implicit explicit for both the type checker and any developer reading the code. The question itself was the point. Understanding why the error existed led to a better solution.
We also made a note to compare with other type checkers like pyrefly (v1.0 just came out) or mypy and see if they have better Django support.
The same week, the movie detail page returned 404s even though data existed. The cause: movie_list fetched from the API but didn't persist to the database. movie_detail queried the database. Nothing matched. Daniele fixed the sync logic and wrote:
"I ran across a problem earlier that was caused by not having the DB in sync. So lesson learned and now I try not to forget to run it."
Running makemigrations after every model change became a habit.
Refactoring is how architecture emerges
By week 6, the codebase had grown. Two API functions, get_movie_list_from_api and get_services_list_from_api, had identical try/except blocks. The only difference was the endpoint and the default return value.
Extracting a private _get_from_api(endpoint, default) helper isn't a trick. It's a principle: if two pieces of code do the same thing, one is a future bug waiting to diverge.
The refactor also cleaned up the return types from list[Movie] | None to list[Movie], replacing a None sentinel with a proper empty list default.
Each review surfaced one decision that sharpened his model of how good code behaves.
Week 6: Docker, CI/CD, and a live URL
Week 6 was Docker and Fly.io. The app that ran locally needed to run in the cloud: no SQLite data disappearing on container restart (or a move to Postgres instead), environment variables properly set, static files served correctly, no secrets hardcoded anywhere, and GitHub Actions deploying on every push to main (with passing tests).
Daniele learned a lot here and shipped his app.

When Daniele shared the deploy win in my coaching Slack, he put into words something that sits at the center of anybody wanting to improve their coding/developer skills now:
"Addressing quality and best practices and security and maintainability in your code does not pair well with velocity, especially when you're learning. So I preferred to improve my code quality and to become better at it by learning properly, postponing some feature release for later. In the process I learned Django and its main mechanics, Docker and deployment in the cloud."
The velocity-vs-quality tradeoff he named isn't unique to learning Django. It's the choice every developer makes every week. My coaching gave him the framework; the discipline and persistence were his.
Graham Dumpleton
Per-interpreter GIL in mod_wsgi 6.0.0
mod_wsgi 6.0.0 is currently available as a release candidate. You can install it from PyPI, or grab the source from the GitHub releases page. There is a significant amount of code cleanup behind this release, alongside a range of new features and operator-facing improvements that have been overdue for some time.
Rather than describe everything in one post, I am going to work through the headline changes in a short series. The most consequential set for anyone running mod_wsgi in production is the new concurrency configuration. CPython has gained two genuinely new concurrency modes over the last few releases (per-interpreter GIL in 3.12 and free-threading in 3.13), and mod_wsgi 6.0.0 exposes both as opt-in directives, along with finer-grained control over how the GIL switches between threads.
This first post covers the per-interpreter GIL story and the new WSGIPerInterpreterGIL directive.
Why the GIL has always been the deployment problem
This is well-trodden ground, but worth recapping for context. CPython's Global Interpreter Lock serialises Python bytecode execution within a single process. It does not matter how many OS threads you create inside that process. Only one of them runs Python at a time.
For WSGI deployments, this has shaped the way servers like mod_wsgi scale. Threads within a single process are useful for handling I/O concurrently, since any reasonable C extension or built-in I/O call releases the GIL while it waits on the kernel, but they do not give you parallelism for CPU-bound Python work. To get that, you have always needed more processes. mod_wsgi's daemon mode is built around this assumption. You configure N daemon processes, each with its own Python interpreter and its own GIL, and you get N-way Python parallelism that way.
Sub-interpreters complicate the picture slightly. They have existed in CPython for a long time, and mod_wsgi has used them since the beginning, but until PEP 684 landed in Python 3.12 they all shared one process-wide GIL. Adding more sub-interpreters inside a single process gave you isolation between applications, but no additional concurrency.
What changed in Python 3.12 and 3.14
PEP 684 made per-interpreter GIL possible as an opt-in for sub-interpreters created through the C API. With it, each sub-interpreter holds its own lock, and two sub-interpreters running on different OS threads can execute Python bytecode at the same time. The main interpreter is excluded from this. It always holds the original process-wide GIL and cannot be given one of its own. That distinction matters later.
Python 3.14 then shipped PEP 734 as concurrent.interpreters, the first standard-library API for working with sub-interpreters from Python code. It is a useful addition, but it does come with a deliberate restriction. Data passed between interpreters is either pickled and copied through a queue, shared through the buffer protocol, or limited to a small set of immortal immutable built-ins. Anything that wants to share mutable Python objects across interpreters has to find another way.
That data-sharing restriction is why concurrent.interpreters is most naturally suited to message-passing worker patterns rather than ordinary Python code which tends to lean heavily on shared mutable state. The same restriction is one of the reasons embedding hosts like mod_wsgi are well-positioned to get value out of per-interpreter GIL ahead of general Python code.
How mod_wsgi has always used sub-interpreters
mod_wsgi has used sub-interpreters from the start, but originally for a completely different reason. The driver was isolation, not parallelism. Running multiple WSGI applications inside a single Apache process is a real operational need, and you cannot do it safely if they all share the same sys.modules, signal handlers, atexit handlers, and so on. Sub-interpreters give each application its own private copy of all of that.
mod_wsgi calls this an "application group". Each named application group maps to a sub-interpreter inside whichever daemon process (or embedded Apache child process) is hosting it. Until Python 3.12, that arrangement was purely about keeping applications from stepping on each other.
What changes with per-interpreter GIL is that the same sub-interpreters mod_wsgi was already creating for isolation can now hold their own locks and run Python bytecode in parallel. The application group concept does not need to change. The directive that flips this on is new, but the underlying structure is the one mod_wsgi has had all along.
There is also a happy alignment with the data-sharing constraint mentioned above. mod_wsgi routes each incoming WSGI request directly into a chosen sub-interpreter, and the WSGI contract does not ask for any shared mutable Python state to span requests. The request is the message. From an application author's point of view, there is not much new to do. The configuration changes; in most cases the application does not. The caveats, and there are always caveats, are what your C extensions will tolerate and, if your application spawns its own background threads, what their shutdown handling looks like under per-interpreter rules. More on both at the end.
The new directive
The new directive is WSGIPerInterpreterGIL, with the obvious syntax:
WSGIPerInterpreterGIL On
The default is Off. Opt-in is deliberate; there is no scenario where it would be safe for mod_wsgi to flip this on by default. The directive is valid at server config scope and can also appear inside a <WSGIInterpreterOptions> container, which is what you want most of the time and which I will get to next.
Two things worth flagging up front. First, the main interpreter is excluded. If your application runs in the main interpreter, which it will if you have set WSGIApplicationGroup %{GLOBAL}, then enabling WSGIPerInterpreterGIL has no effect on it. Per-interpreter GIL only applies to sub-interpreters. Second, Python 3.12 or later is required. On older Python the directive is accepted but does nothing, with a configuration warning logged.
Composing with daemon mode
The interesting case for WSGIPerInterpreterGIL is not opting an entire daemon process group into it. If you want extra parallel Python execution across separate processes, you can already get that by adding more daemon processes. The interesting case is selectively enabling per-interpreter GIL for specific sub-interpreters that already exist within a daemon process you are running.
A small example. Suppose you have a daemon process group called localhost:8000 running a single WSGI application. You can create a named sub-interpreter inside that process and give it its own GIL, like this:
<WSGIInterpreterOptions process-group="localhost:8000" application-group="sub-interp-1">
WSGIPerInterpreterGIL On
</WSGIInterpreterOptions>
WSGIInterpreterOptions is the container directive that lets you scope settings to a particular sub-interpreter. The process-group= selector matches a daemon process group by name, or %{GLOBAL} for the embedded mode interpreter in Apache child processes. The application-group= selector further narrows to a specific application group inside that process, which is the same thing as a specific sub-interpreter. Both selectors are optional, and the most-specific match wins.
On its own, the directive above does nothing useful. The sub-interpreter is configured to hold its own GIL but no requests are being routed into it yet. To actually use it, you can delegate a sub-URL of the existing application to that sub-interpreter using a <Location> block:
<Location /suburl>
WSGIApplicationGroup sub-interp-1
</Location>
The end result is that requests to /suburl are dispatched into a second copy of the application running in sub-interp-1, which holds its own GIL, while everything else continues to run in the default application group with the process-wide GIL. Two halves of the same application can now execute Python bytecode in parallel inside one daemon process.
There is a different shape that may suit a different setup. If your Apache configuration already has multiple WSGIScriptAlias directives pointing at distinct WSGI applications, and you have arranged for those applications to run in separate sub-interpreters of a single daemon process (as opposed to separate daemon process groups), then WSGIPerInterpreterGIL lets you opt the relevant sub-interpreters into their own GILs without rearranging the process layout.
A note on cost. If the daemon process was previously hosting one sub-interpreter and you switch to hosting two, you now have two live copies of the application in that process, each with its own sys.modules, its own imported pure-Python modules, and its own per-interpreter C extension state. Memory use goes up. The trade is the same one you make when you add daemon processes, more memory in exchange for more parallel Python, but doing it within a single daemon process can still have advantages depending on how the application is provisioned and managed at the OS level. Whether one process with two sub-interpreters is preferable to two daemon processes with one sub-interpreter each is a judgement call about your specific deployment, not a universal answer.
One more thing before moving on. There is a separate directive coming in this series called WSGIFreeThreading for use with free-threaded Python builds. The two are mutually exclusive on a single process, and the next post covers it on its own terms, so I will not muddy this one with the details.
Which applications actually benefit
The honest answer is fewer than the headline implies. Per-interpreter GIL helps for CPU-bound Python work that can be partitioned cleanly across requests, where you would otherwise be paying the cost of running additional daemon processes purely to dodge the GIL. Numerical work that is not already handled inside C code that releases the GIL, request-scoped computation, image processing, and similar.
It is also worth being clear about what the directive does not do. Giving a sub-interpreter its own GIL only buys parallelism between sub-interpreters. Two concurrent CPU-bound requests that both land in sub-interp-1 still compete for that sub-interpreter's GIL and serialise against each other, exactly as they would have before. If all the heavy work funnels through one sub-interpreter, the directive has not bought you anything. The win comes from spreading the load across multiple sub-interpreters, each holding its own GIL. Which is why, for genuinely heavy CPU-bound throughput, scaling out with extra daemon processes is often still the cleaner answer; each daemon process gives you both an additional GIL and an additional set of OS-level resources to schedule against.
For ordinary I/O-bound web applications, the win is much smaller. I/O already releases the GIL, threads in a single process can already overlap their waits for the database or the network, and adding daemon processes remains the simpler scaling lever. Per-interpreter GIL is a precision tool. It is most useful when you specifically want more parallel Python execution inside fewer processes, or when you already have multiple sub-interpreters in one process for isolation reasons and you would now like them to run in parallel as well.
The gotchas
A few things are worth being aware of before reaching for the directive.
Sub-interpreters do not share Python state. Each sub-interpreter has its own sys.modules, its own imported copies of pure-Python modules, its own module globals. Any in-memory cache or singleton sitting in a module global is per-sub-interpreter. Anything you previously assumed worked process-wide now works only interpreter-wide.
Each sub-interpreter pays its own import cost. Memory and startup time scale with the number of sub-interpreters. The point of per-interpreter GIL is parallelism within a single process; the cost is that every sub-interpreter independently imports the application and everything it depends on.
The main interpreter remains special. To repeat the point from earlier, if your application is running in the main interpreter, which happens when WSGIApplicationGroup %{GLOBAL} is set, often because some C extension forced your hand, WSGIPerInterpreterGIL does nothing for it. The main interpreter always holds the process-wide GIL.
Background threads must be non-daemon. Sub-interpreters that hold their own GIL do not allow Python code to create daemon threads. Anything your application spawns via threading.Thread must run as a non-daemon thread, which is the opposite of what most Python code defaults to when it wants a worker that quietly exits with the process. That restriction comes with an awkward shutdown problem. Python only runs atexit handlers after it has tried to join non-daemon threads during sub-interpreter teardown, so the common pattern of signalling background workers to stop from an atexit handler will deadlock. In a mod_wsgi context the right answer is to hook mod_wsgi's own shutdown callbacks instead, which fire early enough to let your threads drain and exit cleanly. That shutdown API is worth a post of its own. For the purposes of this one, the point is that if your WSGI application relies on daemon threads or atexit-driven cleanup, this is the one situation where enabling WSGIPerInterpreterGIL may force application-side code changes.
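On the application side, the non-daemon pattern can be sketched like this. It is a minimal illustration under stated assumptions: the `BackgroundWorker` class and its methods are hypothetical, and the wiring of `shutdown()` into mod_wsgi's shutdown callbacks is deliberately left out because that API is not named in this post.

```python
import queue
import threading


class BackgroundWorker:
    """A non-daemon worker that stops on an explicit signal, not via atexit."""

    def __init__(self) -> None:
        self._stop = threading.Event()
        self._jobs: queue.Queue = queue.Queue()
        self.processed: list = []
        # daemon=False is the requirement under per-interpreter GIL.
        self._thread = threading.Thread(
            target=self._run, name="bg-worker", daemon=False
        )

    def start(self) -> None:
        self._thread.start()

    def submit(self, job) -> None:
        self._jobs.put(job)

    def _run(self) -> None:
        # Keep going until asked to stop AND the queue has drained.
        while not (self._stop.is_set() and self._jobs.empty()):
            try:
                job = self._jobs.get(timeout=0.1)
            except queue.Empty:
                continue
            self.processed.append(job)

    def shutdown(self, timeout: float = 5.0) -> bool:
        """Signal the thread, wait for it, and report whether it exited."""
        self._stop.set()
        self._thread.join(timeout)
        return not self._thread.is_alive()
```

The important property is that `shutdown()` has to be called by something that runs before interpreter teardown tries to join the thread. Calling it from an atexit handler would reproduce the deadlock described above.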
What this means for C extension authors
This is the part that turns most attempts to enable WSGIPerInterpreterGIL into a hunt through the dependency tree, and it is the part I want extension authors to take seriously.
Restrictions on what works under sub-interpreters are not new. mod_wsgi users have been running into the rough edges of the simplified PyGILState_Ensure / PyGILState_Release API in sub-interpreters for years. The WSGIApplicationGroup %{GLOBAL} directive exists in part as a pragmatic answer for extensions that assume there is only one interpreter in the process. Per-interpreter GIL tightens those rules further, but it does not invent a new category of problem.
What does change is that explicit opt-in is now required. The extension must use PEP 489 multi-phase module initialisation. Extensions still using single-phase init will not be loaded into a sub-interpreter that holds its own GIL. The extension must also declare Py_mod_multiple_interpreters with the value Py_MOD_PER_INTERPRETER_GIL_SUPPORTED in its PyModuleDef_Slot array, like this:
static PyModuleDef_Slot module_slots[] = {
{Py_mod_exec, module_exec},
{Py_mod_multiple_interpreters, Py_MOD_PER_INTERPRETER_GIL_SUPPORTED},
{0, NULL},
};
Without that declaration, the import fails when a sub-interpreter that holds its own GIL tries to load the module. The failure happens on first import, not at server startup, so it can take a request through a code path that has not been touched in a while to expose it.
Module state needs to be per-interpreter. Anything stashed in a C-level static (counters, caches, registered callbacks, type objects pointing at process-wide globals) breaks isolation between sub-interpreters and produces bugs that will not show up until two interpreters race over the shared state. The right answer is to move the state into module state retrieved via PyModule_GetState. Code still using the simplified PyGILState API needs to be reviewed too, or replaced with the explicit PyThreadState-based APIs where the assumption of a single interpreter does not hold.
For operators, the message is the unglamorous one. Before turning WSGIPerInterpreterGIL on in any kind of production setting, work through every C extension your application pulls in, directly and transitively. "Works on Python 3.12" is not the same as "works under per-interpreter GIL". The popular extensions are working through these requirements on their own timelines, and the situation will keep improving, but right now it is still on you to check.
What's next
If you maintain a mod_wsgi deployment and the per-interpreter GIL story is interesting to you, please try the 6.0.0 release candidate against a real workload and file issues against the GitHub project for anything that breaks or behaves oddly. The whole point of the RC period is to find out what does not work before the final release goes out.
The next post in this series will cover WSGIFreeThreading, the second new concurrency directive in 6.0.0 and the one that targets PEP 703 free-threaded Python builds. The constraints there are different again, and worth their own treatment.
For reference:
- mod_wsgi documentation
- mod_wsgi 6.0.0 release notes
- Per-interpreter GIL and free-threading user guide
- WSGIPerInterpreterGIL directive documentation
- PEP 684: A Per-Interpreter GIL
- PEP 734: Multiple Interpreters in the Stdlib
- PEP 489: Multi-phase extension module initialization
Armin Ronacher
Clanker: A Word For The Machine
In my last post I used the word “clanker” as an alternative to “agent” quite consistently and probably excessively. That choice ended up attracting a lot more attention than I expected in the Hacker News comment section of that post and a number of folks had a very strong reaction: to them it sounded like a slur, in one case even something adjacent to the n-word.
That reaction surprised me somewhat, but it also made me realize that I should write down what I mean by the word for future reference.
For me “clanker” is useful because it creates distance from the machine and that is a quality which is important to me. The machine is not a person, not a co-worker, not a friend, not a little spirit in the terminal. It is just a machine, a tool, and nothing more.
Why Not Agent?
I dislike the word “agent” for these LLM based tool loops with a UI attached. In everyday use an agent is someone who acts on behalf of someone else and it has agency and more importantly: responsibility. An agent decides, represents, negotiates, acts, and can be blamed. In the current AI discourse we increasingly do a lot of anthropomorphizing and the term “agent” is now frequently being used to put blame on an abstract machine. But the machine cannot be responsible, whoever is wielding it is. If it drops your database it was not at fault, you were.
Agent makes the machine sound like a person with delegated authority and I do not think that is healthy.
What we actually have is a language model attached to a harness, a prompt, some tools, a bit of context, and a boring tool loop. Sometimes the loop is very capable and it surprises us by editing code for a really long time and producing genuinely amazing and even valuable outputs. But the agency is not in the model or harness but in the human and in the organization that deployed it. If my coding tool opens a pull request, I opened that pull request, not the machine. If my machine spams someone’s issue tracker, I spammed someone’s issue tracker with a machine.
In that context I like a word that sounds mechanical as it puts the thing back into the category where it belongs: the category of machinery and tools.
The Machine Has No Feelings
LLMs are not sentient and we should not behave as if they might be, just in case. Elevating these things to anything other than a very fascinating and capable tool is problematic for a whole bunch of reasons.
Today’s machines are dumb (but truly fascinating) token predictors that emit text, call tools, and are steered by prompts and the training that went into them. They can simulate distress and affection, can simulate being offended, apologize and mimic all kinds of things that humans would do.
A compiler does not feel humiliated when I swear at it, a car does not suffer when I call it a shitbox and a power drill is not oppressed by being handled roughly. An LLM is more complicated than those things, and the interactions you can have with them can be truly uncanny, but a moral status does not appear just because the machine can emit text in the first person.
I keep receiving strange emails from people because, for lack of a better phrase, I am in the weights. I have been writing public code and public text for long enough that models know my name, my projects, and some of the concepts around them. Every so often someone writes to me with the peculiar confidence that comes from a long conversation with a model that has validated and amplified an idea. Sometimes the model seems to have told them that I am relevant for their problem and a source of help. For historical reasons LLMs used to write a lot of Flask code, and every once in a while someone interacts with an LLM long enough about their Python and Flask frustrations that the LLM will eventually reveal who created it which then can result in them sending me an email. Increasingly also because people found my work in other ways interesting and are trying to reach out for advice.
I do not want to mock these people but some of those messages are distressing and I do not know how to deal with them. They show signs of what people have started calling AI psychosis.
It’s why I want cold and detached language for these systems. I want to use words that remind us that the thing on the other side is not a person.
Racism Is About Humans
The comparison to racism is where I think the discussion goes badly wrong because racism is a human social evil. It is about humans subdividing humans, assigning lesser worth to some of them, and building rules around those subdivisions that can leave lasting damage for generations. Racial slurs are wrong because they are a tool for dehumanizing humans.
On the other hand a machine is not human, a model is not a race and the GPU cluster that is powering them is not being oppressed. A coding assistant does not need dignity, emancipation, or civil rights. That’s also why I find the discussion about model welfare to be actively harmful. I’m sure you can find ways to measure the “trauma” of models or their feelings but I greatly dislike this theater. It risks elevating models to a position they should not occupy. Models are machines and they are not enslaved in the moral sense in which humans were enslaved, because there isn’t anyone there to be deprived of freedom.
We should be careful about using the language of human oppression in relation to our interactions with machines, so as not to devalue actual humans. If we start treating insults toward a model as morally adjacent to racism, we blur a line that shouldn’t be blurred.
AI Is Unpopular
If you take a step away from the communities that are happily embracing AI in different ways, there are even more that are viciously against this technology.
There are humans who feel harmed, or are harmed, by AI systems: people whose work is copied, workers who label data under questionable conditions, people whose neighborhoods receive the data centers and increased utility bills, Open Source maintainers buried under generated slop, and now also people who spiral because a chatbot keeps validating their delusions. Those harmed or affected deserve that type of attention, not the model.
While I am a true believer in the power and utility of this technology, I increasingly think that calling the non-adopters “misguided” or “afraid” won’t do it. It’s quite likely that this technology comes with risks and we better remember that all of this is supposed to be in service of humans, and not to replace them.
The Rise Of The Machine
The oddest interaction on the use of “clanker” so far has been people asking me whether I would come to regret, at some point in the future, calling the machines “the c-word”.
I find that questioning revealing because it already grants the machine the status I am really trying not to grant it. It imagines a future “machine people” reading the discourse and sessions, discovering that we used an ugly word for their ancestors, and then judging us by the standards of human oppression.
Could there be future systems that deserve moral consideration? Maybe. I do not know. If we ever build or encounter something that will have those qualities with memories and lasting interests, the capacity to suffer and feel, and a social existence of its own, and the ability to have agency and carry responsibilities, then we should draw a different line and use different language. But that hypothetical future does not extend backwards to the present day and make the current machines people. We can call an electric door an electric door even if one day someone builds some that have emotions and exhale with pleasure when opening and closing.
Whatever the future may bring, let’s not pretend that current LLMs are a protected class or on a path towards it. The right response is to look at the evidence, draw the boundary where it belongs, and change our behavior there. We should not even remotely entertain extending empathy to an object that can generate an “ouch.”
And if one’s worry is less moral and more about revenge, then I find that even less persuasive. If a future machine is so petty or authoritarian that it wants to punish humans because in 2026 they used an unflattering word for non-sentient tools, our vocabulary was really not the problem.
The Word Is Getting Polluted
There is however a part of this that I cannot ignore. I use “clanker” to create distance from the machine, but other people are using the same word very differently. Some online jokes and skits around “clankers” do not merely say “this robot is annoying” as they deliberately pull in the imagery of slavery, segregation, civil-rights-era racism, and anti-Black tropes.
This is problematic as in those contexts the clanker is not just a machine any more and instead becomes a prop for replaying human racism behind a science-fiction mask. That is horrible and I want no part in that.
I think it will be interesting to see where the meanings of these words end up a few years from now. We’re very much in the middle of society re-arranging around the changes that LLMs are causing. If a term becomes primarily associated with people using robots as stand-ins for actually oppressed humans, then using that term becomes impossible to defend.
The reason I liked the word is precisely the opposite of that use. I want language that prevents anthropomorphizing. I want a word that says: this is a tool, a machine of numbers and matrices.
On Responsibility And Boundaries
If an AI system lies to a user, the system did not commit a moral wrong, but the people who designed, deployed, marketed, or negligently used it might have. If a coding assistant generates a security bug, the model is not to blame; the human who accepted and committed the code is.
This is why giving these systems softer, more human language worries me. It makes it easier to move responsibility into some undefined void. “The agent decided.” “The model refused.” Obviously that is convenient and I catch myself plenty of times engaging with the thing in ways that are unhealthy. Even just the “please” in the discourse with the machine calls into question how rational we are in engaging with them.
I do not know what the right word will be. Maybe “clanker” will survive as a useful bit of jargon. Maybe it will become too loaded and we will need another one. Whatever word we use, I want it to preserve a clear division: humans on one side with responsibility, machines on the other as a boring tool.
That boundary is very much not anti-AI. I use these systems every day, I have the pleasure of building tools that incorporate them at Earendil, and I find them astonishingly useful.
A machine can be useful and mimic a human, but still just be a machine. That is the work I want “clanker” to do. It is not there to make a future “machine person” small if such a person ever were to exist, and it is not an excuse to launder racism through shitty robot jokes.
If the word stops doing that work, I will find another one, because what matters to me is not the word but the boundary.
May 25, 2026
Talk Python to Me
#549: Great Docs
Your documentation has two audiences now - humans reading the rendered HTML, and AI agents trying to make sense of your library. Rich Iannone and Michael Chow from Posit are back on Talk Python with a brand new Python documentation tool called Great Docs that takes both seriously. Rich is the creator of Great Tables, and before that the R package GT. The man has a serious eye for design, and he's pointed that energy at the Python docs ecosystem. We'll talk about how Great Docs spins up a polished site in three commands, why every page ships as Markdown for your favorite LLM, how it leans on Quarto for executable code blocks and tabbed install sections, and where it lands against Sphinx, MkDocs, and Zensical. Plus, you'll meet Tablin. Here we go.<br/> <br/> <strong>Episode sponsors</strong><br/> <br/> <a href='https://talkpython.fm/sentry'>Sentry Error Monitoring, Code talkpython26</a><br> <a href='https://talkpython.fm/temporal'>Temporal</a><br> <a href='https://talkpython.fm/training'>Talk Python Courses</a><br/> <br/> <h2 class="links-heading mb-4">Links from the show</h2> <div><strong>Guests</strong><br/> <strong>Michael Chow</strong>: <a href="https://github.com/machow?featured_on=talkpython" target="_blank" >github.com</a><br/> <strong>Rich Iannone</strong>: <a href="https://github.com/rich-iannone?featured_on=talkpython" target="_blank" >github.com</a><br/> <br/> <strong>Python Web Security with OWASP Top 10 and Agentic AI Course</strong>: <a href="https://talkpython.fm/ai-web-security" target="_blank" >talkpython.fm</a><br/> <br/> <strong>GT</strong>: <a href="https://posit-dev.github.io/great-tables/articles/intro.html?featured_on=talkpython" target="_blank" >posit-dev.github.io</a><br/> <strong>Episode</strong>: <a href="https://talkpython.fm/episodes/show/492/great-tables" target="_blank" >talkpython.fm</a><br/> <strong>Sphinx</strong>: <a href="https://www.sphinx-doc.org/en/master/?featured_on=talkpython" target="_blank" >www.sphinx-doc.org</a><br/> 
<strong>mkdocs</strong>: <a href="https://www.mkdocs.org/?featured_on=talkpython" target="_blank" >www.mkdocs.org</a><br/> <strong>Zensical</strong>: <a href="https://zensical.org/?featured_on=talkpython" target="_blank" >zensical.org</a><br/> <strong>Hugo</strong>: <a href="https://gohugo.io/?featured_on=talkpython" target="_blank" >gohugo.io</a><br/> <strong>Ghost</strong>: <a href="https://ghost.org/?featured_on=talkpython" target="_blank" >ghost.org</a><br/> <strong>Rs pkgdown</strong>: <a href="https://pkgdown.r-lib.org/?featured_on=talkpython" target="_blank" >pkgdown.r-lib.org</a><br/> <strong>Quarto</strong>: <a href="https://quarto.org/?featured_on=talkpython" target="_blank" >quarto.org</a><br/> <strong>quickstart</strong>: <a href="https://posit-dev.github.io/great-docs/user-guide/quickstart.html?featured_on=talkpython" target="_blank" >posit-dev.github.io</a><br/> <strong>llms.txt file</strong>: <a href="https://llmstxt.org/?featured_on=talkpython" target="_blank" >llmstxt.org</a><br/> <strong>llms.txt</strong>: <a href="https://talkpython.fm/llms.txt" target="_blank" >talkpython.fm</a><br/> <strong>mcp</strong>: <a href="https://talkpython.fm/ai-integration" target="_blank" >talkpython.fm</a><br/> <strong>cli</strong>: <a href="https://talkpython.fm/blog/posts/talk-python-now-has-a-cli/" target="_blank" >talkpython.fm</a><br/> <br/> <strong>Watch this episode on YouTube</strong>: <a href="https://www.youtube.com/watch?v=rj2hY2Bsi30" target="_blank" >youtube.com</a><br/> <strong>Episode #549 deep-dive</strong>: <a href="https://talkpython.fm/episodes/show/549/great-docs#takeaways-anchor" target="_blank" >talkpython.fm/549</a><br/> <strong>Episode transcripts</strong>: <a href="https://talkpython.fm/episodes/transcript/549/great-docs" target="_blank" >talkpython.fm</a><br/> <br/> <strong>Theme Song: Developer Rap</strong><br/> <strong>🥁 Served in a Flask 🎸</strong>: <a href="https://talkpython.fm/flasksong" target="_blank" 
>talkpython.fm/flasksong</a><br/> <br/> <strong>---== Don't be a stranger ==---</strong><br/> <strong>YouTube</strong>: <a href="https://talkpython.fm/youtube" target="_blank" ><i class="fa-brands fa-youtube"></i> youtube.com/@talkpython</a><br/> <br/> <strong>Bluesky</strong>: <a href="https://bsky.app/profile/talkpython.fm" target="_blank" >@talkpython.fm</a><br/> <strong>Mastodon</strong>: <a href="https://fosstodon.org/web/@talkpython" target="_blank" ><i class="fa-brands fa-mastodon"></i> @talkpython@fosstodon.org</a><br/> <strong>X.com</strong>: <a href="https://x.com/talkpython" target="_blank" ><i class="fa-brands fa-twitter"></i> @talkpython</a><br/> <br/> <strong>Michael on Bluesky</strong>: <a href="https://bsky.app/profile/mkennedy.codes?featured_on=talkpython" target="_blank" >@mkennedy.codes</a><br/> <strong>Michael on Mastodon</strong>: <a href="https://fosstodon.org/web/@mkennedy" target="_blank" ><i class="fa-brands fa-mastodon"></i> @mkennedy@fosstodon.org</a><br/> <strong>Michael on X.com</strong>: <a href="https://x.com/mkennedy?featured_on=talkpython" target="_blank" ><i class="fa-brands fa-twitter"></i> @mkennedy</a><br/></div>
Talk Python Blog
Spanish subtitles available for all courses
Earlier this month, we announced support for multi-lingual subtitles on our courses. You can read the announcement for the full details. Now we are ready to release our second language, Spanish!

All 283 hours of courses have complete Spanish subtitles. Just choose your language, set the subtitle size and location and you have high-quality Spanish subtitles to accompany your learning.
Your next course
What’s next? Well, either drop into your account page and continue with an existing course you’re studying or browse our catalog of courses to find your next one.
Real Python
How to Make a Scatter Plot in Python With plt.scatter()
Visualizing data is a core part of analysis, and Python’s most popular plotting library is Matplotlib. To make a scatter plot, you reach for plt.scatter() from Matplotlib’s pyplot submodule, conventionally aliased as plt. You’ll use it to build both simple two-variable charts and richly customized plots that encode several variables at once.
By the end of this tutorial, you’ll understand that:
- A scatter plot is created by calling plt.scatter() with two array-like sequences for the x and y values.
- Marker size, color, shape, and transparency are controlled by the s, c, marker, and alpha parameters.
- plt.scatter() enables per-point customization like variable size or color, while plt.plot() with marker arguments runs faster for basic plots.
- A single scatter plot can represent more than two variables by mapping extra dimensions to marker properties.
- Matplotlib’s plot styles, listed in plt.style.available, are applied with plt.style.use().
To get the most out of this tutorial, you should be familiar with the fundamentals of Python programming and the basics of NumPy and its ndarray object. You don’t need to be familiar with Matplotlib to follow this tutorial, but if you’d like to learn more about the module, then check out Python Plotting With Matplotlib (Guide).
Get Your Code: Click here to download the free sample code you’ll use to build customized scatter plots in Python with plt.scatter().
Take the Quiz: Test your knowledge with our interactive “How to Make a Scatter Plot in Python With plt.scatter()” quiz. You’ll receive a score upon completion to help you track your learning progress:
Interactive Quiz: How to Make a Scatter Plot in Python With plt.scatter()
Practice using plt.scatter() in Python to create scatter plots and encode multiple variables with marker size, color, shape, and transparency.
How to Make a Scatter Plot in Python
A scatter plot is a visual representation of how two variables relate to each other. You can use scatter plots to explore the relationship between two variables, for example by looking for any correlation between them.
In this section of the tutorial, you’ll become familiar with creating basic scatter plots using Matplotlib. In later sections, you’ll learn how to further customize your plots to represent more complex data using more than two dimensions.
Getting Started With plt.scatter()
Before you can start working with plt.scatter(), you’ll need to install Matplotlib. You can do so using Python’s standard package manager, pip, by running the following command in the console:
$ python -m pip install matplotlib
Now that you have Matplotlib installed, consider the following use case. A café sells six different types of bottled orange drinks. The owner wants to understand the relationship between the price of the drinks and his daily sales, so he keeps track of how many of each drink he sells every day. You can visualize this relationship as follows:
import matplotlib.pyplot as plt
price = [2.50, 1.23, 4.02, 3.25, 5.00, 4.40]
sales_per_day = [34, 62, 49, 22, 13, 19]
plt.scatter(price, sales_per_day)
plt.show()
In this Python script, you import the pyplot submodule from Matplotlib using the alias plt. This alias is generally used by convention to shorten the module and submodule names. You then create lists with the price and average sales per day for each of the six orange drinks sold.
Finally, you create the scatter plot by using plt.scatter() with the two variables you wish to compare as input arguments. As you’re using a Python script, you also need to explicitly display the figure by using plt.show().
When you’re using an interactive environment, such as a console or a Jupyter Notebook, you don’t need to call plt.show(). All examples in this tutorial are scripts and include the call to plt.show().
Here’s the output from this code:
This plot shows that, in general, the more expensive a drink is, the fewer items are sold. However, the drink that costs $4.02 is an outlier, suggesting that it’s a particularly popular product. When using scatter plots in this way, close inspection can help you explore the relationship between variables. You can then carry out further analysis, whether it’s using linear regression or other techniques.
Comparing plt.scatter() and plt.plot()
You can also produce the scatter plot shown above using another function within matplotlib.pyplot. Matplotlib’s plt.plot() is a general-purpose plotting function that will allow you to create various line or marker plots.
You can achieve the same scatter plot as the one you obtained in the section above with the following call to plt.plot(), using the same data:
plt.plot(price, sales_per_day, "o")
plt.show()
In this case, you had to include the marker "o" as a third argument because otherwise, plt.plot() would plot a line graph. The plot you created with this code is identical to the plot you created earlier with plt.scatter().
Read the full article at https://realpython.com/visualizing-python-plt-scatter/ »
[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]
Python Bytes
#481 Ways to die
<strong>Topics covered in this episode:</strong><br> <ul> <li><strong><a href="https://nesbitt.io/2026/05/19/dumb-ways-for-an-open-source-project-to-die.html?featured_on=pythonbytes">Dumb Ways for an Open Source Project to Die</a></strong></li> <li><strong><a href="https://pydevtools.com/handbook/how-to/how-to-create-a-pylock-toml-lockfile/?featured_on=pythonbytes">How to create a pylock.toml lockfile</a></strong></li> <li><strong>https://github.com/facebook/Lifeguard</strong></li> <li><strong><a href="https://www.dash0.com/guides/python-logging-libraries?featured_on=pythonbytes">Choosing a Python Logging Library in 2026</a></strong></li> <li><strong>Extras</strong></li> <li><strong>Joke</strong></li> </ul><a href='https://www.youtube.com/watch?v=r66j2SAHQFs' style='font-weight: bold;'data-umami-event="Livestream-Past" data-umami-event-episode="481">Watch on YouTube</a><br> <p><strong>About the show</strong></p> <p>Sponsored by us! Support our work through:</p> <ul> <li>Our <a href="https://training.talkpython.fm/?featured_on=pythonbytes"><strong>courses at Talk Python Training</strong></a></li> <li><a href="https://courses.pythontest.com/p/the-complete-pytest-course?featured_on=pythonbytes"><strong>The Complete pytest Course</strong></a></li> <li><a href="https://www.patreon.com/pythonbytes"><strong>Patreon Supporters</strong></a></li> </ul> <p><strong>Connect with the hosts</strong></p> <ul> <li>Michael: <a href="https://fosstodon.org/@mkennedy">@mkennedy@fosstodon.org</a> / <a href="https://bsky.app/profile/mkennedy.codes?featured_on=pythonbytes">@mkennedy.codes</a> (bsky)</li> <li>Brian: <a href="https://fosstodon.org/@brianokken">@brianokken@fosstodon.org</a> / <a href="https://bsky.app/profile/brianokken.bsky.social?featured_on=pythonbytes">@brianokken.bsky.social</a></li> <li>Show: <a href="https://fosstodon.org/@pythonbytes">@pythonbytes@fosstodon.org</a> / <a href="https://bsky.app/profile/pythonbytes.fm">@pythonbytes.fm</a> (bsky)</li> </ul> <p>Join us on 
YouTube at <a href="https://pythonbytes.fm/stream/live"><strong>pythonbytes.fm/live</strong></a> to be part of the audience. Usually <strong>Monday</strong> at 11am PT. Older video versions available there too.</p> <p>Finally, if you want an artisanal, hand-crafted digest of every week of the show notes in email form? Add your name and email to <a href="https://pythonbytes.fm/friends-of-the-show">our friends of the show list</a>, we'll never share it.</p> <p><strong>Michael #1: <a href="https://nesbitt.io/2026/05/19/dumb-ways-for-an-open-source-project-to-die.html?featured_on=pythonbytes">Dumb Ways for an Open Source Project to Die</a></strong></p> <ul> <li>Core categories <ul> <li><strong>The maintainer left</strong></li> <li><strong>The maintainer is still there</strong></li> <li><strong>Sabotage and capture</strong></li> <li><strong>The release pipeline broke</strong></li> <li><strong>Force majeure</strong></li> <li><strong>The world moved on</strong></li> <li><strong>The project split</strong></li> - </ul></li> <li>Examples <ul> <li><a href="https://github.com/jgthms/bulma?featured_on=pythonbytes">Bulma</a> PRs still from 2023, issues and PRs with no maintainer response for years, last release 1.5 years ago</li> <li><a href="https://github.com/grantjenks/python-diskcache?featured_on=pythonbytes">diskcache</a> Similar, got hired by OpenAI, crickets after that</li> </ul></li> </ul> <p><strong>Brian #2: <a href="https://pydevtools.com/handbook/how-to/how-to-create-a-pylock-toml-lockfile/?featured_on=pythonbytes">How to create a pylock.toml lockfile</a></strong></p> <ul> <li>Tim Hopper</li> <li>Tim walks through using <code>uv</code>, <code>pip</code> and <code>pdm</code> to create <code>pylock.toml</code> files.</li> <li>Recommendation: use <code>uv export --format pylock.toml -o pylock.toml</code></li> <li>He also has <a href="https://pydevtools.com/handbook/how-to/how-to-install-from-a-pylock-toml-lockfile-with-pip/?featured_on=pythonbytes">How to install from a 
pylock.toml lockfile with pip</a> but the short version is: <ul> <li>use <code>-r</code> because tools treat it like a requirements file</li> </ul></li> </ul> <p><strong>Michael #3:</strong> https://github.com/facebook/Lifeguard</p> <ul> <li>Lifeguard is a static analyzer to detect Lazy Imports incompatibilities and ease the adoption overhead for Lazy Imports in Python.</li> <li>I’m more excited about lazy imports after my <a href="https://mkennedy.codes/posts/cutting-python-web-app-memory-over-31-percent/?featured_on=pythonbytes">Cutting Python Web App Memory Over 31%</a> experience</li> <li>Some Python patterns depend on imports executing immediately. For example: <ul> <li><strong>Module-level side effects</strong> — a module that registers a handler or modifies global state at import time will behave differently if that import is deferred.</li> <li><strong>The registry pattern</strong> — a module that registers itself (e.g., adding to a global dict) when imported will silently fail to register under Lazy Imports.</li> <li><strong><code>sys.modules</code> manipulation</strong> — code that reads or writes <code>sys.modules</code> assumes prior imports have already executed.</li> <li><strong>Metaclasses and <code>__init_subclass__</code></strong> — class creation side effects may depend on imports being resolved.</li> </ul></li> <li><strong>Project Stage: Beta</strong> Lifeguard is in active development. 
We are aiming to be ready for general use by the <a href="https://peps.python.org/pep-0790/?featured_on=pythonbytes">Python 3.15 final release</a>.</li> </ul> <p><strong>Brian #4: <a href="https://www.dash0.com/guides/python-logging-libraries?featured_on=pythonbytes">Choosing a Python Logging Library in 2026</a></strong></p> <ul> <li>Ayooluwa Isaiah</li> <li>" which libraries matter, how they compare, where they overlap with the standard module, and when each one makes sense.”</li> <li>The slant with this article is the need to log json output, which seems reasonable as things like API entry and exit point logging will include json.</li> <li>Covered libraries <ul> <li>standard library <code>logging</code> with a hat tip to <a href="https://nhairs.github.io/python-json-logger/latest/?featured_on=pythonbytes">python-json-logger</a> <ul> <li>Same site has a <a href="https://www.dash0.com/guides/python-json-logger?featured_on=pythonbytes">guide to setting up python-json-logger</a></li> </ul></li> <li><a href="https://www.structlog.org?featured_on=pythonbytes">structlog</a></li> <li><a href="https://loguru.readthedocs.io?featured_on=pythonbytes">Loguru</a></li> <li><a href="https://logbook.readthedocs.io/en/stable/?featured_on=pythonbytes">Logbook</a></li> <li><a href="https://microsoft.github.io/picologging/?featured_on=pythonbytes">picologging</a></li> </ul></li> <li>Some benchmarks with structlog, stdlib+json, and Loguru, with structlog coming out faster</li> <li>I liked the Loguru example <ul> <li>I’m going to have to try <code>@logger.catch</code> and <code>logger.exception()</code> for easily logging exceptions and <code>serialize=True</code> to enable JSON output.</li> </ul></li> </ul> <p><strong>Extras</strong></p> <p>Brian:</p> <ul> <li><a href="https://www.npr.org/sections/money/2014/10/21/357629765/when-women-stopped-coding?featured_on=pythonbytes">When Women Stopped Coding</a> - Planet Money segment , spotted on BlueSky from <a 
href="https://bsky.app/profile/savannah.dev/post/3mml3emj63k22?featured_on=pythonbytes">Savannah Ostrowski</a></li> <li><a href="https://courses.pythontest.com/lean-tdd/?featured_on=pythonbytes">Lean TDD</a> is now leaner <ul> <li>Still working on audio version, but some great changes in 0.7.1 version <ul> <li>Ch 6, <strong>TDD Interpretations</strong>, move ATDD and some of BDD to chapter</li> <li>Ch 7, Change name to <strong>TDD with Teams: BDD and ATDD</strong></li> <li>Ch 9, <strong>Lean TDD</strong>, streamline steps and chapter</li> <li>Ch 10, Change name to <strong>Lean TDD with Teams: Lean ATDD</strong></li> <li>Ch 11, <strong>Lean</strong> <strong>TDD with AI</strong>, Add short discussion about guardrails and security</li> </ul></li> </ul></li> </ul> <p>Michael:</p> <ul> <li>New course: <a href="https://training.talkpython.fm/courses/agentic-ai-python-security?featured_on=pythonbytes">Python Web Security: OWASP Top 10 with Agentic AI</a></li> <li>All courses now with Spanish subtitles, <a href="https://talkpython.fm/blog/posts/spanish-subtitles-available-for-all-courses/?featured_on=pythonbytes">see announcement</a></li> </ul> <p><strong>Joke: <a href="https://x.com/pr0grammerhum0r/status/2057733228899823981?s=12&featured_on=pythonbytes">Stop texting me</a></strong></p>
Graham Dumpleton
Free-threading in mod_wsgi 6.0.0
The previous post in this series covered the new WSGIPerInterpreterGIL directive in mod_wsgi 6.0.0 and the PEP 684 per-interpreter GIL feature that landed in Python 3.12. This post is about its sibling, WSGIFreeThreading, which targets PEP 703 free-threaded Python builds.
The two directives sit next to each other in the mod_wsgi configuration vocabulary and they both opt processes into a non-default concurrency model, but the underlying mechanisms are quite different. Per-interpreter GIL gives each sub-interpreter its own lock. Free-threading removes the lock entirely. That distinction shapes everything below.
What free-threading actually is
Free-threading removes the GIL from CPython entirely. There is no process-wide lock to acquire and no per-interpreter lock to acquire. All threads in the process can run Python bytecode in parallel, in the same interpreter, against the same Python objects. This is fundamentally different from per-interpreter GIL, which keeps a GIL but gives each sub-interpreter its own. Free-threading has no GIL at all.
The price for this is a special CPython build. The feature is enabled at compile time with --disable-gil, and on platforms that distribute it the resulting binary is typically named python3.13t. The "t" suffix exists precisely so the free-threaded build can coexist on a system alongside the normal CPython build. Free-threading shipped as an experimental opt-in in Python 3.13 and continues to mature in 3.14.
One useful detail to know is that a free-threaded build can still run with a GIL. The build supports both modes. What you get at runtime depends on what the embedder asks for, which is the bridge into mod_wsgi's posture.
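Because the build and the runtime mode are separate questions, it can help to check both from inside a running process. The following is a minimal sketch, not something taken from mod_wsgi itself. It relies on sysconfig.get_config_var("Py_GIL_DISABLED") and sys._is_gil_enabled(), the latter only available from Python 3.13, so the helper falls back to "GIL enabled" on older versions. The function name gil_status is made up for illustration.

```python
import sys
import sysconfig


def gil_status() -> dict:
    """Report whether this is a free-threaded build and whether the GIL is active."""
    # Py_GIL_DISABLED is 1 on free-threaded builds and 0 or None otherwise.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))

    # sys._is_gil_enabled() exists from Python 3.13 onwards. On older
    # versions there is always a GIL, so fall back to True.
    check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = check() if check is not None else True

    return {
        "free_threaded_build": free_threaded_build,
        "gil_enabled": gil_enabled,
    }


if __name__ == "__main__":
    print(gil_status())
```

Calling gil_status() from within a WSGI application (for example, logging it once at startup) would show whether a given process is actually running without a GIL, rather than merely being built with support for it.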
mod_wsgi's posture: opt-in even on a free-threaded build
If you compile and install mod_wsgi against a free-threaded Python, the default is still GIL-enabled. Nothing about your existing application behaviour changes until you say otherwise. The free-threaded build supports the mode; mod_wsgi declines to use it without explicit instruction.
This is worth being clear about because the assumption most people will reach for is the opposite. Installing mod_wsgi against python3.13t does not automatically give you free-threading. It gives you the ability to opt in.
The reason for the default is the one you can guess at. The ecosystem of C extensions is nowhere near ready for everyone to be on free-threading by default. Forcing it on across the board would silently break existing deployments the moment they happened to import an extension that has not been audited for thread-safe execution. Defaulting to GIL-enabled keeps the worst case "nothing changes". You only get the new behaviour when you ask for it.
The opt-in is WSGIFreeThreading On. The directive is per process. Unlike WSGIPerInterpreterGIL, it cannot be scoped to a specific sub-interpreter inside a process. Free-threading is a property of the whole process or none of it.
The combinatorial story
The upside of keeping the default opt-in is the flexibility it leaves you with. Compile mod_wsgi against a free-threaded Python build and you have access to three different concurrency models, and you can mix them across daemon process groups within the same Apache instance.
The three options:
- Process-wide GIL (the classic model, still the default)
- Per-interpreter GIL, where each sub-interpreter in a process holds its own GIL (covered in the previous post)
- Free-threading, where the process has no GIL at all
A single Apache instance can have one daemon process group running free-threaded for a CPU-bound numerical workload that has been audited end-to-end, another running with per-interpreter GIL for an application whose extensions support PEP 684 but not PEP 703, and embedded mode left on the classic process-wide GIL. Pick the right model per workload.
There is also an experimentation angle worth calling out. Comparing the behaviour of the same application under each of the three modes on the same machine is suddenly much easier. You can run the same WSGI application in three daemon process groups, configure each one differently, route a slice of traffic at each, and compare directly.
The constraint to be aware of: within a single process, free-threading and per-interpreter GIL are mutually exclusive. If both apply to the same process, free-threading wins and the per-interpreter GIL setting becomes a no-op. The mix-and-match is across processes, not inside one.
How to enable
The simplest form, opting all processes into free-threading at server scope:
WSGIFreeThreading On
Selective opt-in for a specific daemon process group, using the WSGIInterpreterOptions container directive introduced in the previous post:
<WSGIInterpreterOptions process-group="cpu-bound">
WSGIFreeThreading On
</WSGIInterpreterOptions>
And for the embedded mode interpreter in Apache child processes:
<WSGIInterpreterOptions process-group="%{GLOBAL}">
WSGIFreeThreading On
</WSGIInterpreterOptions>
mod_wsgi-express has a convenience flag, --free-threading, that flips this on for its generated configuration.
One important contrast with WSGIPerInterpreterGIL is worth making explicit. The application-group= selector is not meaningful for free-threading. Per-interpreter GIL is a property of an individual sub-interpreter, so it makes sense to scope down to one. Free-threading is a property of the process. You cannot opt one sub-interpreter inside a process into free-threading while leaving another sub-interpreter in the same process with a GIL. The granularity is the process. If you write a <WSGIInterpreterOptions> container with an application-group= selector and try to put WSGIFreeThreading inside it, mod_wsgi will ignore the setting and log a warning.
What this means for your Python code
In theory, a correctly written WSGI application is already thread-safe. The WSGI specification has always allowed servers to call the application from multiple threads concurrently, and mod_wsgi has been able to use threaded daemon processes for years. So strictly speaking, if you have been doing it right, you are most of the way there.
In practice, an enormous amount of WSGI code is implicitly relying on what the GIL gives you for free, in a way most developers do not even realise they are relying on. The GIL ensures that bytecode-level operations serialise against each other. Patterns like incrementing a counter with counter += 1, setting a key in a shared dict with cache[key] = value, appending to a shared list with items.append(thing), or "check then set" lookups against shared state happen to be safe-ish under the GIL because the GIL boundary makes them effectively atomic in the cases that matter most. Without a GIL they are not atomic. They need explicit locks or genuinely atomic primitives.
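To make the "explicit locks" option concrete, here is a minimal sketch of a shared counter guarded by a threading.Lock. The SafeCounter class and the hammer helper are hypothetical names for illustration. The point is only that the read-modify-write in the increment happens entirely while the lock is held.

```python
import threading


class SafeCounter:
    """A counter whose increments are serialised by an explicit lock."""

    def __init__(self) -> None:
        self._value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        # The read, add, and write all happen while holding the lock, so
        # concurrent callers cannot interleave and lose updates.
        with self._lock:
            self._value += 1

    @property
    def value(self) -> int:
        with self._lock:
            return self._value


def hammer(counter: SafeCounter, threads: int = 8, per_thread: int = 10_000) -> int:
    """Increment the counter from several threads and return the final value."""

    def work() -> None:
        for _ in range(per_thread):
            counter.increment()

    workers = [threading.Thread(target=work) for _ in range(threads)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return counter.value


if __name__ == "__main__":
    print(hammer(SafeCounter()))
```

With the lock in place, the final value equals threads * per_thread regardless of whether a GIL is present. The same pattern applies to "check then set" logic against a shared dict: do the check and the set inside one locked region.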
The shapes of code that are most likely to be quietly unsafe under free-threading are not exotic. Module-level mutable state (registries, caches, in-memory counters) is the most common pattern. Lazy initialisation without locks (if _thing is None: _thing = build()) shows up everywhere. Then there are shared mutable objects passed around between threads via globals, memoisation decorators that mutate shared dicts, and application singletons set up at import time. The list goes on. These patterns are pervasive in real applications and they are exactly the kind of thing that "has always worked fine under a threaded server" because the GIL has been silently saving them.
This is not a mod_wsgi-specific concern. It is the general PEP 703 question that every application owner has to answer for themselves, every library author has to answer for their library, and that the ecosystem as a whole is going to spend years working through. But mod_wsgi is going to be one of the most realistic places to actually run free-threaded Python against a real workload, so it is likely to be where a lot of these latent bugs first surface.
The defensible position is this. If your application has been deliberately audited for true concurrent execution, with real locks where shared mutable state is touched and no implicit reliance on the GIL for serialisation, you are most of the way there. Most code, including most mature Python libraries, has not been audited that way. Free-threading is not a trap, but it is genuinely a different correctness contract than the one most Python code was written against. Treat the opt-in accordingly.
What this means for C extensions
The previous post covered the C extension story for per-interpreter GIL in some detail. The rules for free-threading are a separate set of rules, related but distinct, and I will focus on the contrasts rather than restate the bits that overlap.
An extension opts into free-threading by declaring Py_mod_gil = Py_MOD_GIL_NOT_USED in its PyModuleDef_Slot array. That declaration is the extension author asserting "I have been audited for execution without a GIL". Without it, CPython treats the extension as untrusted for free-threading.
The interesting difference from per-interpreter GIL is the load behaviour. Per-interpreter GIL fails the import outright if an extension has not declared support. Free-threading does not. The extension loads, but as soon as it loads CPython silently re-enables the GIL for the entire process and emits a runtime warning. That is worth understanding because the failure mode is "your free-threading quietly turned off" rather than "your import broke". You may not notice for a while that everything is back to running under a GIL.
The other requirements largely match the per-interpreter GIL story. PEP 489 multi-phase module initialisation is the prerequisite. Module-level static state in C becomes a data-race risk in a way it was not under the GIL, and the right answer is to move it into module state retrieved via PyModule_GetState, with proper locking applied where shared state is unavoidable. Code still using the simplified PyGILState API needs to be reviewed for its assumptions, though for different reasons than under per-interpreter GIL.
For operators, the auditing message is the same as last time. Before turning WSGIFreeThreading on in any kind of production setting, work through every C extension your application pulls in, directly and transitively, and check whether each one declares free-threading support. An extension that loads under free-threading without complaint is not necessarily fine. It may just be the one that triggered the silent fallback to GIL-enabled.
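As a starting point for that audit, the sketch below lists the extension modules a running process has actually loaded, including transitive ones. It only builds an inventory: it cannot tell whether a module declares free-threading support, so each entry still needs to be checked against the project's own documentation or source. The loaded_extension_modules function name is hypothetical.

```python
import sys
from importlib.machinery import EXTENSION_SUFFIXES


def loaded_extension_modules():
    """Return {module name: file path} for loaded compiled extension modules."""
    suffixes = tuple(EXTENSION_SUFFIXES)
    found = {}
    # Copy the items first, since imports in other threads can mutate sys.modules.
    for name, module in list(sys.modules.items()):
        filename = getattr(module, "__file__", None)
        if filename and filename.endswith(suffixes):
            found[name] = filename
    return dict(sorted(found.items()))
```

Running it after the application has handled some representative requests gives a fuller list, because lazily imported extensions will have been loaded by then. Extensions compiled into the interpreter itself have no file and do not appear.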
Which applications actually benefit
CPU-bound Python work that can be parallelised across threads in a single process is the clear win. Two threads inside one free-threaded process can both run Python bytecode at full speed against the same objects in the same address space. There is no within-sub-interpreter serialisation caveat to qualify it with, in contrast to per-interpreter GIL where two requests in the same sub-interpreter still compete for that sub-interpreter's GIL. Under free-threading, there is no GIL to compete for.
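The kind of workload meant here can be sketched as pure-Python CPU work spread over a thread pool. The count_primes and run_parallel functions are hypothetical examples. The code gives the same results on any build; only on a free-threaded build with the GIL actually disabled would the threads be expected to run the loops in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


def count_primes(limit):
    """Count primes below limit with deliberately CPU-bound pure Python."""
    count = 0
    for candidate in range(2, limit):
        is_prime = True
        divisor = 2
        while divisor * divisor <= candidate:
            if candidate % divisor == 0:
                is_prime = False
                break
            divisor += 1
        if is_prime:
            count += 1
    return count


def run_parallel(limits, workers=4):
    """Run count_primes for each limit on a pool of threads in one process."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(count_primes, limits))
```

Timing run_parallel against a plain loop over the same limits, on both a normal and a free-threaded build, is a simple way to see whether a workload has parallel Python execution to gain.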
There are costs to be honest about. Free-threaded CPython carries a measurable single-threaded overhead compared with a normal CPython build, because the runtime has to do per-thread bookkeeping for object reference counts and various other things that the GIL was previously making free. The single-thread performance gap has been closing release-over-release, but it is still real, and the trade is parallel throughput for single-thread speed. If your workload does not have parallel Python execution to gain in the first place, enabling free-threading can leave you slower overall.
For ordinary I/O-bound WSGI applications, the practical gain remains smaller for the same reasons as in the previous post. I/O already releases the GIL on a normal CPython build, threads in a single process already overlap their waits on databases and network, and adding daemon processes remains the simpler scaling lever for most web workloads. Free-threading is most interesting where you specifically have CPU-bound Python that would benefit from running concurrently inside one process, and where you can afford both the audit work and the per-thread overhead.
What's next
If you run mod_wsgi and the free-threading story is interesting to you, please install the 6.0.0 release candidate against a free-threaded Python build, try it against a real workload, and file issues against the GitHub project for anything that breaks or behaves oddly. Free-threading is genuinely new territory for embedded Python hosts, and the feedback from real deployments is what will catch the rough edges before the final release.
The next post in this concurrency series will cover WSGISwitchInterval. That one is not another GIL mode; it is a tuning lever for adjusting how frequently the GIL is yielded between threads, which can help reduce GIL contention in some workloads. It only applies where there is a GIL to switch, so it is a no-op under free-threading.
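For orientation, plain Python exposes a similar lever through sys.getswitchinterval() and sys.setswitchinterval(). Whether WSGISwitchInterval maps onto these calls is an assumption here, not something this post states. The switch_interval context manager below is a hypothetical helper for experimenting with the value.

```python
import sys
from contextlib import contextmanager


@contextmanager
def switch_interval(seconds):
    """Temporarily change the interpreter's thread switch interval."""
    previous = sys.getswitchinterval()
    sys.setswitchinterval(seconds)
    try:
        # Yield the old value so callers can log what was replaced.
        yield previous
    finally:
        # Always restore the previous setting, even if the block raises.
        sys.setswitchinterval(previous)
```

A block such as `with switch_interval(0.001): ...` runs threaded work with a shorter interval and restores the original setting afterwards.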
For reference:
- mod_wsgi documentation
- mod_wsgi 6.0.0 release notes
- Per-interpreter GIL and free-threading user guide
- WSGIFreeThreading directive documentation
- PEP 703: Making the Global Interpreter Lock Optional in CPython
- PEP 489: Multi-phase extension module initialization
- Previous post: Per-interpreter GIL in mod_wsgi 6.0.0
May 24, 2026
Kay Hayen
Nuitka Release 4.1
This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler, “download now”.
This release adds many new features and corrections, with a focus on async code compatibility, missing generics features, Python 3.14 compatibility, and, yet again, Python compilation scalability.
Bug Fixes
Python 3.14: Fix, decorators were breaking when disabling deferred annotations. (Fixed in 4.0.1 already.)
Fix, nested loops could have wrong traces, leading to mis-optimization. (Fixed in 4.0.1 already.)
Plugins: Fix, run-time check of package configuration was incorrect. (Fixed in 4.0.1 already.)
Compatibility: Fix, __builtins__ lacked necessary compatibility in compiled functions. (Fixed in 4.0.1 already.)
Distutils: Fix, incorrect UTF-8 decoding was used for TOML input file parsing. (Fixed in 4.0.1 already.)
Fix, multiple hard value assignments could cause compile time crashes. (Fixed in 4.0.1 already.)
Fix, string concatenation was not properly annotating exception exits. (Fixed in 4.0.2 already.)
Windows: Fix, --verbose-output and --show-modules-output did not work with forward slashes. (Fixed in 4.0.2 already.)
Python 3.14: Fix, there were various compatibility issues including dictionary watchers and inline values. (Fixed in 4.0.2 already.)
Python 3.14: Fix, stack pointer initialization to localsplus was incorrect to avoid garbage collection issues. (Fixed in 4.0.2 already.)
Python 3.12+: Fix, generic type variable scoping in classes was incorrect. (Fixed in 4.0.2 already.)
Python 3.12+: Fix, there were various issues with function generics. (Fixed in 4.0.2 already.)
Python 3.8+: Fix, names in named expressions were not mangled. (Fixed in 4.0.2 already.)
Plugins: Fix, module checksums were not robust against quoting style of module-name entry in YAML configurations. (Fixed in 4.0.2 already.)
Plugins: Fix, doing imports in queried expressions caused corruption. (Fixed in 4.0.2 already.)
UI: Fix, support for uv_build in the --project option was broken. (Fixed in 4.0.2 already.)
Compatibility: Fix, names assigned in assignment expressions were not mangled. (Fixed in 4.0.2 already.)
Python 3.12+: Fix, there were still various issues with function generics. (Fixed in 4.0.3 already.)
Clang: Fix, debug mode was disabled for clang generally, but only ClangCL and macOS Clang didn’t want it. (Fixed in 4.0.3 already.)
Zig: Fix, --windows-console-mode=attach|disable was not working when using Zig. (Fixed in 4.0.3 already.)
macOS: Fix, yet another way self dependencies can look like, needed to have support added. (Fixed in 4.0.3 already.)
Python 3.12+: Fix, generic types in classes had bugs with multiple type variables. (Fixed in 4.0.3 already.)
Scons: Fix, repeated builds were not producing binary identical results. (Fixed in 4.0.3 already.)
Scons: Fix, compiling with newer Python versions did not fall back to Zig when the developer prompt MSVC was unusable, and error reporting could crash. (Fixed in 4.0.4 already.)
Zig: Fix, the workaround for Windows console mode attach or disable was incorrectly applied on non-Windows platforms. (Fixed in 4.0.4 already.)
Standalone: Fix, linking with Python Build Standalone failed because libHacl_Hash_SHA2 was not filtered out unconditionally. (Fixed in 4.0.4 already.)
Python 3.6+: Fix, exceptions like CancelledError thrown into an async generator awaiting an inner awaitable could be swallowed, causing crashes. (Fixed in 4.0.4 already.)
Fix, not all ordered set modules accepted generators for update. (Fixed in 4.0.5 already.)
Plugins: Disabled warning about rebuilding the pytokens extension module. (Fixed in 4.0.5 already.)
Standalone: Filtered libHacl_Hash_SHA2 from link libs unconditionally. (Fixed in 4.0.5 already.)
Debugging: Disabled unusable unicode consistency checks for Python versions 3.4 to 3.6. (Fixed in 4.0.5 already.)
Python 3.12+: Avoided cloning call nodes on class level which caused issues with generic functions in combination with decorators. (Added in 4.0.5 already.)
Python 3.12+: Added support for generic type variables in async def functions. (Added in 4.0.5 already.)
UI: Fix, flushing outputs for prompts was not working in all cases when progress bars were enabled. (Fixed in 4.0.6 already.)
UI: Fix, unused variable warnings were missing at C compile time when using zig as a C compiler. (Fixed in 4.0.6 already.)
Scons: Fix, forced stdout and stderr paths as a feature was broken. (Fixed in 4.0.6 already.)
Fix, replacing a branch did not accurately track shared active variables causing optimization crashes. (Fixed in 4.0.7 already.)
macOS: Fix, failed to remove extended attributes because files need to be made writable first. (Fixed in 4.0.7 already.)
Fix, dict pop and setdefault using with := rewrites lacked exception-exit annotations for un-hashable keys. (Fixed in 4.0.8 already.)
Python 3.13: Fix, the __parameters__ attribute of generic classes was not working. (Fixed in 4.0.8 already.)
Python 3.11+: Fix, starred arguments were not working as type variables. (Fixed in 4.0.8 already.)
Python2: Fix, FileNotFoundError compatibility fallback handling was not working properly. (Fixed in 4.0.8 already.)
Compatibility: Fix, loop ownership check in value traces was missing, causing issues with nested loops.
Windows: Improved --windows-console-mode=attach to properly handle console handles, enabling cases like os.system to work nicely.
Python2: Fix, there was a compatibility issue where providing default values to the mkdtemp function was failing.
Windows: Fix, there were spurious issues with C23 embedding in 32-bit MinGW64 by switching to coff_obj resource mode for it as well.
Plugins: Fix, the post-import-code execution could fail because the triggering sub-package was not yet available in sys.modules.
UI: Fix, listing package DLLs with --list-package-dlls was broken due to recent plugin lifecycle changes.
UI: Fix, --list-package-exe was not working properly on non-Windows platforms failing to detect executable files correctly.
UI: Handled paths starting with {PROGRAM_DIR} the same as a relative path when parsing the --onefile-tempdir-spec option.
Plugins: Followed multiprocessing forkserver changes for newer Python versions.
Python 3.12+: Fix, generic class type parameters handling was incorrect.
Python 3.12: Fix, deferred evaluation of type aliases was failing.
Python 3.12+: Aligned sum built-in float summation with CPython’s compensated sum for better accuracy.
Python 3.10+: Fix, uncompiled coroutine throw() return handling was incorrect, restoring completed coroutine results via StopIteration.value rather than exposing them as ordinary return values to the outer await chain.
Python 3.13+: Fix, uncompiled coroutine cancel()/await suspension handling was incorrect, improved to ensure integration compatibility.
macOS: Made finding create-dmg more robust by also checking the Homebrew path for Intel and from PATH properly.
Compatibility: Fix, class frames were not exposing frame locals.
UI: Detected static-libpython problems, which affected some forms of Anaconda.
Distutils: Rejected --project mixed with --main arguments as it is not useful.
macOS: Fix, zig from PATH or from ziglang was not being used.
Distutils: Fix, the wrong module-root config value was being checked for uv build backend.
macOS: Fix, was attempting to change removed (rejected) DLLs, which of course failed and errored out.
Python 3.14: Fix, tuple reuse was not fully compatible, potentially causing crashes due to outdated hash caches.
Fix, there were still attempts to locate fake modules when imported by other code, which could conflict with existing modules.
Python 3.5+: Fix, failed to send uncompiled coroutines the sent in value in yield from.
Fix, older gcc compilers lacking newer intrinsic methods had compilation issues that needed to be addressed.
Standalone: Fix, multiphase module extension modules with post-load code were not working properly.
Fix, Avoid using the non-inline copy of pkg_resources with the inline copy of Jinja2. These could mismatch and cause errors.
Fix, loops could make releasing of previous values very unclear, causing optimization errors.
Fix, incbin resource mode was not working with old gcc C++ fallback.
Python 3.4 to 3.6: Fix, bytecode demotion was not working properly for these versions, also bytecode only files not working.
Plugins: Added a check for the broken patchelf versions 0.10 and 0.11 to prevent breaking Qt plugins.
Android: Allowed patchelf version 0.18 on Android.
Windows: Fix, the header path for self uninstalled Python was not detected correctly.
Release: Fix, inclusion of the pkg_resources inline copy for Python 2 to source distributions was missing.
UI: Detected the OBS versions of SUSE Linux better.
Suse: Allowed using patchelf 0.18.0 there too.
Python 3.11: Fix, package and module dicts were not aligned close enough to avoid a CPython bug.
Fix, unbound compiled methods could crash when called without an object passed.
Standalone: Fix, multiphase module extension modules with postload. (Fixed in 4.0.8 already.)
Onefile: Fix, while waiting for the child, it may already be terminated.
macOS: Removed existing absolute rpaths for Homebrew and MacPorts.
Python 3.14: Avoided warning in CPython headers.
Python 3.14: Followed allocator changes more closely.
Compatibility: Avoided using pkg_resources for Jinja2 template location for loading.
No-GIL: Applied some bug fixes to get basic things to work.
Package Support
Standalone: Add support for newer paddle version. (Added in 4.0.1 already.)
Standalone: Add workaround for refcount checks of pandas. (Fixed in 4.0.1 already.)
Standalone: Add support for newer h5py version. (Added in 4.0.2 already.)
Standalone: Add support for newer scipy package. (Added in 4.0.2 already.)
Plugins: Revert accidental os.getenv over os.environ.get changes in anti-bloat configurations that stopped them from working. Affected packages are networkx, persistent, and tensorflow. (Fixed in 4.0.5 already.)
Standalone: Added missing DLLs for openvino. (Added in 4.0.7 already.)
Enhanced the package configuration YAML schema by adding the relative_to parameter for from_filenames DLL specification, avoiding error-prone purely relative paths.
Standalone: Fix, flet_desktop app assets were missing, now preserving the packaged runtime and sidecar DLLs.
Standalone: Added support for the tyro package.
Standalone: Added data files for the perfetto package.
Standalone: Added support for anyio process forking.
Standalone: Added support for the plotly.graph package.
Anaconda: Fix, dependencies for the numpy conda package on Windows were incorrect.
Plugins: Enhanced the auto-icon hack in PySide6 to use compatible class names.
Standalone: Fix, Qt libraries were duplicated with PySide6 WebEngine framework support on macOS.
Plugins: Fix, automatic detection of mypyc runtime dependencies was including all top level modules of the containing package by accident. (Fixed in 4.0.5 already.)
Anaconda: Fix, delvewheel plugin was not working with Python 3.8+. This enhances compatibility with installed PyPI packages that use it for their DLLs. (Fixed in 4.0.6 already.)
Plugins: Fix, our protection workaround could confuse methods used with PySide6.
New Features
UI: Added the --recommended-python-version option to display recommended Python versions for supported, working, or commercial usage.
UI: Add message to inform users about Nuitka[onefile] if compression is not installed. (Added in 4.0.1 already.)
UI: Add support for uv_build in the --project option. (Added in 4.0.1 already.)
Onefile: Allow extra includes as well. (Added in 4.0.2 already.)
UI: Add nuitka-project-set feature to define project variables, checking for collisions with reserved runtime variables. (Added in 4.0.2 already.)
Scons: Added new option to select --reproducible builds or not. (Added in 4.0.6 already.)
Python 3.10+: Added support for importlib.metadata.package_distributions(). (Added in 4.0.8 already.)
Plugins: Added support for the multiprocessing forkserver context. (Added in 4.0.8 already, for 4.1 Python 3.6 and earlier, as well as 3.14 support were added too.)
Reports: Added structured resource usage (rusage) performance information to compilation reports.
Reports: Included individual module-level C compiler caching (ccache/clcache) statistics in compilation reports.
Added support for detecting and correctly resolving the Python prefix for the PyEnv on Homebrew Python flavor.
macOS: Added support for rusage information for Scons.
UI: Added the __compiled__.extension_filename attribute to give the real filename of the containing extension module.
Windows: Added support for --clang or ARM. (Added in 4.0.8 already.)
Windows: Added support for resources names as not just integers, important when we copy them from template files.
MacPorts: Added basic support for this Python flavor. More work will be needed to get it to work fully though.
Optimization
Avoid including importlib._bootstrap and importlib._bootstrap_external. (Added in 4.0.1 already.)
Linux: Cached the syscall used for time keeping during compilation to avoid loading libc for each trace. (Added in 4.0.8 already.)
UI: Output a warning for modules that remain unfinished after the third optimization pass.
Added an extra micro pass trigger when new variables are introduced or variable usage changes severely, ensuring optimizations are fully propagated, avoiding unnecessary extra full passes.
Provided scripts to compile Python statically with PGO tailored for Nuitka on Linux, Windows, and macOS.
Added support for running the Data Composer tool from a compiled Nuitka binary without spawning an uncompiled Python process.
Enhanced the usage of vectorcall for PyCFunction objects by directly checking for its presence instead of relying purely on flags, allowing more frequent use of this faster execution path.
Cached frequently used declarations for top-level variables to speed up C code generation.
Sped up trace collection merging by avoiding unnecessary set creation and using a set instead of a list for escaped traces.
Optimized plugin hook execution by tracking overloaded methods and added an option to show plugin usage statistics.
Improved performance of module location by avoiding unnecessary module name reconstruction and redundant filesystem checks for pre-loaded packages.
Improved the caching of distribution name lookups to effectively avoid repeated IO operations across all package types.
Plugins: Cached callback plugin dispatch for onFunctionBodyParsing and onClassBodyParsing to skip argument computation when no plugin overrides them.
Python 3.13: Handled sub-packages of pathlib as hard modules.
Handled hard attributes through merge traces as well.
Made constant blobs more compact by avoiding repeated identifiers and unnecessary fields.
Enhanced Python compilation scripts further. (Fixed in 4.0.8 already.)
Recognized late incomplete variables better. (Fixed in 4.0.8 already.)
Made constant blobs more compact. (Fixed in 4.0.8 already.)
Optimized calls with only constant keywords and variable posargs too.
Anti-Bloat
Fix, memory bloat occurred when C compiling sqlalchemy. (Fixed in 4.0.2 already.)
Avoid using pydoc in PySimpleGUI. (Added in 4.0.2 already.)
Avoided using doctest from zodbpickle. (Added in 4.0.5 already.)
Avoided inclusion of cython when using pyav. (Added in 4.0.7 already.)
Avoided including typing_extensions when using numpy. (Added in 4.0.7 already.)
Organizational
UI: Relocated the warning about the available source code of extension modules to be evaluated at a more appropriate time.
Debian: Remove recommendation for libfuse2 package as it is no longer useful.
Debian: Used platformdirs instead of appdirs.
Debugging: Removed Python 3.11+ restriction for clang-format as it is available everywhere, even Python 2.7, and we still want nicely formatted code when we read things. (Added in 4.0.6 already.)
Removed no longer useful inline copy of wax_off. We have our own stubs generator project.
Release: Added missing package to the CI container for building Nuitka Debian packages.
Developer: Updated AI instructions for creating Minimal Reproducible Examples (MRE) to skip unneeded C compilation.
Debugging: Added an internal function for checking if a string is a valid Python identifier.
AI: Added a task in Visual Studio Code to export the currently selected Python interpreter path to a file, making it available as “python” and “pip” matching the selected interpreter. This makes it easier to use a specific version with no instructions needed.
AI: Updated the rules to instruct AI to only generate useful comments that add context not present in the code.
Containers: Added template rendering support for Jinja2 (.j2) container files in our internal Podman tools.
Projects: Clarified the current status and rationale of Python 2.6 support in the developer manual.
Debugging: Added experimental flag --experimental=ignore-extra-micro-pass to allow ignoring extra micro pass detection.
Visual Code: Added integration scripts for bash and zsh autocompletion of Nuitka CLI options. These are now also integrated into Visual Studio Code terminal profiles and the Debian package.
RPM: Included the Python compile script for Linux.
RPM: Removed the requirement for distutils in the spec.
Tests
Install only necessary build tools for test cases.
Avoided spurious failures in reference counting tests due to Python internal caching differences. (Fixed in 4.0.3 already.)
Fix, the parsing of the compilation report for reflected tests was incorrect.
Python 3.14: Ignored a syntax error message change.
Python 3.14: Added test execution support options to the main test runner to use this version as well.
Fix, the runner binary path was mishandled for the third pass of reflected compilations.
Removed the usage of obsolete plugins in reflected compilation tests.
Debugging: Prevented boolean testing of namedtuples to avoid unexpected bugs.
Added the Test suffix to syntax test files and disabled “python” mode and spell checking for them to resolve issues reported in IDEs.
Fix, newline handling in diff outputs from the output comparison tool was incorrect.
Covered post-import-code functionality with a new subpackage test case.
Prevented the program test suite from running an unnecessary variant to save execution time.
macOS: Ignored differences from GUI framework error traces in headless runs in output comparisons.
The reflected test for Nuitka, where it compiles itself and compares its operation, has been restored to a functional state.
Used the new method to clear internal caches if available for reference counts.
Disabled running nested loops test with Python 2.6.
Containers: Detected Python 2 defaulting containers in Podman tooling.
Cleanups
UI: Fix, there was a double space in the Windows Runtime DLLs inclusion message. (Fixed in 4.0.1 already.)
Onefile: Separated files and defines for extra includes for onefile boot and Python build.
Scons: Provided nicer errors in case of “unset” variables being used, so we can tell it.
Refactored the process execution results to correctly utilize our namedtuples variant, that makes it easier to understand what code does with the results.
Quality: Enabled automatic conversion of em-dashes and en-dashes in code comments to the autoformat tool. AI won’t stop producing them and they can cause SyntaxError for older Python versions, nor is unnecessarily using UTF-8 welcome.
Ensured that cloned outline nodes are assigned their correct names immediately upon creation, that avoids inconsistencies during their creation.
Quality: Updated to the latest versions of black and adopted a faster isort execution by caching results.
Quality: Modified the PyLint wrapper to exit gracefully instead of raising an error when no matching files require checking.
Quality: Avoided checking YAML package configuration files twice, since autoformat already handles them.
Quality: Ensured that YAML package configuration checks output the original filename instead of the temporary one when a failure occurs.
Quality: Prevented pushing of tags from triggering git pre-push quality checks.
Quality: Silenced the output of optipng and jpegoptim during image optimization auto-formatting.
Visual Code: Added the generated Python alias path file to the ignore list.
Quality: Enabled auto-formatting for the Nuitka devcontainer configuration file.
Watch: Avoided absolute paths in compilation to make reports more comparable across machines.
Quality: Changed mdformat checks to run only once and silently.
Scons: Disabled format security errors in debug mode and moved Python-related warning disables into common build setup code.
Quality: Updated to the latest deepdiff version.
Scons: Avoided MSVC telemetry since it can produce outputs that break CI.
Debugging: Enhanced non-deployment handler for importing excluded modules.
Split import module finding functionality into more pieces for enhanced readability.
Debugging: Added more assertions for constants loading and checking.
macOS: Dropped the universal target arch.
Debugging: Added more traces for deep hash verification.
Summary
This release builds on the scalability improvements established in 4.0, with enhanced Python 3.14 support, expanded package compatibility, and significant optimization work.
The --project option seems usable now.
Python 3.14 support remains experimental, but only barely made the cut, and probably will get there in hotfixes. Some of the corrections came in so late before the release, that it was just not possible to feel good about declaring it fully supported just yet.
The Python Coding Stack
1. From Answer to Outcome
Something has shifted in how we use AI. We still talk about “chatbots” and “prompts” and “getting a good answer.” But underneath those familiar words, a different kind of system has been quietly taking shape. One that doesn’t just answer your question. One that does something about it.
This Agents Unpacked series is about that shift. Not the hype version, not the science-fiction version, but the practical reality of what it means when an AI system can act, remember, and persist — when it can take a goal and work toward it, rather than waiting for you to type the next message.
If you have used ChatGPT, Claude, or Gemini to help with your work, you already know the issue: the answer comes back, it’s good, and then... the real work begins. This series is about what happens when the AI can do some of that real work too.
The Half-Done Feeling
You ask an AI chatbot to draft a project proposal. It gives you a solid one — well-structured, sensible, ready to polish. Then you close the chat window, and the proposal lives nowhere. The assistant doesn’t remember it. It doesn’t know where to file it, who needs to see it, or what happened the last time you wrote a similar proposal. If you come back tomorrow, you’re starting from scratch.
Or you ask it to research a topic. It gives you a good summary. But it didn’t check your existing notes first, it didn’t save what it found, it didn’t organise the sources, and it didn’t flag the gaps. The answer is useful. The process is incomplete.
This isn’t a criticism of chatbots. They do exactly what they were designed to do: you ask, they answer. The problem is that real work doesn’t stop at the answer. The proposal needs filing, the research needs organising, the plan needs tracking. The chatbot gave you a great starting point and then left you to do everything that comes after.
That gap between a useful answer and a finished job is where agentic AI enters the picture.
What Changes When AI Can Act
A chatbot is like a consultant you can phone. You describe the problem, they give you advice, and then you hang up and do the work yourself. Good advice (hopefully), but the consultant doesn’t pick up the phone again tomorrow and ask how it went.
An agent is different. It’s more like a capable colleague you’ve given a desk, a filing cabinet, and access to your systems. It has a workspace. It can remember what happened yesterday. It can read files, search the web, send messages, run calculations, ask you questions — not just talk about doing those things, but actually do them. And given a goal, it can work out the steps itself.
The difference sounds small. In practice it changes what you can delegate, what you can automate, and what you still need to do yourself.
The Loop Underneath
Every agent runs on an agentic loop. It sounds technical, but the pattern is surprisingly familiar:
Observe — What is the situation? What did the user ask? What information is already available?
Think — What needs to happen next? Is one step enough, or should this be broken into parts?
Act — Use a tool, look something up, write a file, send a message.
Repeat — Look at what happened, decide whether the job is done, and continue if it isn’t.
A chatbot usually observes your message, thinks once, and replies. That’s steps one through three, then stop. An agent goes further: after it acts, it looks at the result. Was that enough? Did the web search return useful information, or does it need to try again with different terms? Did the first draft cover everything, or are there gaps to fill? Is the job actually done, or is there more to do?
You might be thinking: chatbots already learned to use tools. Isn’t the agentic loop just that, plus one more step? Four items instead of three. But the gap between “can use a tool” and “decides whether to keep going” is not small. It is the difference between a system that performs a task and a system that pursues a goal. The first is impressive. The second changes what you can trust it to do unsupervised.
That continuation — the decision to check, adjust, and keep going — is what makes the loop matter. A single action might solve a simple request. But most real tasks aren’t simple — they need a sequence of steps, each one informed by the result of the last.
The loop is the architecture that turns language into work. Without it, you have a very clever answering system. With it, you have something that can move through a task, make intermediate decisions, recover from partial failures, and stop only when the job is actually done.
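The four steps can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how any particular agent platform is built: run_agent, decide, search, and summarise are hypothetical names, and a hand-written rule stands in for the language model that would normally do the thinking.

```python
def search(query):
    """Hypothetical search tool backed by a tiny in-memory corpus."""
    corpus = {"python gil": "The GIL serialises bytecode execution."}
    return corpus.get(query, "")


def summarise(text):
    """Hypothetical summary tool: keep only the first sentence."""
    return text.split(".")[0]


def decide(goal, history):
    """Think: choose the next (tool, argument) pair, or None when done."""
    if not history:
        return ("search", goal.upper())  # first attempt, a poor query
    last_tool, _last_argument, last_result = history[-1]
    if last_tool == "search" and not last_result:
        return ("search", goal.lower())  # nothing found, retry with new terms
    if last_tool == "search":
        return ("summarise", last_result)
    return None  # the summary exists, so the job is done


def run_agent(goal, tools, max_steps=10):
    """Observe, think, act, and repeat until decide() says the job is done."""
    history = []  # what the agent can observe about its own progress
    for _ in range(max_steps):
        step = decide(goal, history)
        if step is None:
            break
        tool_name, argument = step
        result = tools[tool_name](argument)  # Act
        history.append((tool_name, argument, result))  # feeds the next round
    return history
```

Calling run_agent("Python GIL", {"search": search, "summarise": summarise}) takes three actions: a search that finds nothing, a retry with different terms, and a summary. A system that thinks once and replies would have stopped after the first, empty search.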
Stephen: I don’t get why the ‘Repeat’ step is needed? Wouldn’t the ‘Act’ provide the output I need?
A single action might solve a simple request. But most real tasks aren’t simple — they need a sequence of steps, each one informed by the result of the last. The ‘Act’ step does produce an output. But the output is not the same as the outcome.
After ‘Act’ runs, the agent looks at what happened: Was that enough? Did the web search return useful information, or does it need different terms? Did the first draft cover everything, or are there gaps to fill? Is the job actually done, or is there more to do?
That check — that ‘Repeat’ step — is what closes the gap between a technically complete action and a genuinely finished job. Without it, you have a system that acts and stops. With it, you have a system that works until the job is actually done.
Tools Are the Hands
An agent isn’t just “a better language model.” It has capabilities — things it can actually do in the world. Those are called tools.
Some tools are built in: read files, write files, run commands, search the web, inspect images. Others are external: send emails, query databases, call APIs, trigger workflows. The specific tools vary by platform, but the principle is the same — tools are the bridge between thinking and doing.
Think of it like this: an LLM on its own is like a brilliant mind with no hands. Tools give it hands. The loop is what decides when and how to use them.
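In code, a tool is often little more than a named function the system is allowed to call. The sketch below is one possible shape, with everything in it hypothetical: the TOOLS registry, the tool decorator, the two example tools, and the request format are illustrative choices, not a specific platform's API.

```python
TOOLS = {}


def tool(func):
    """Register a function as a tool under its own name."""
    TOOLS[func.__name__] = func
    return func


@tool
def word_count(text):
    """Count whitespace-separated words."""
    return len(text.split())


@tool
def shout(text):
    """Return the text in upper case."""
    return text.upper()


def call_tool(request):
    """Dispatch a request such as {"tool": "shout", "args": {"text": "hi"}}."""
    name = request["tool"]
    if name not in TOOLS:
        # Report the problem as data so the caller can decide what to do next.
        return {"ok": False, "error": f"unknown tool: {name}"}
    result = TOOLS[name](**request.get("args", {}))
    return {"ok": True, "result": result}
```

The model's part is to produce the request; the surrounding program's part is to look the name up and run the function. Returning errors as ordinary values gives the loop something to react to on its next pass.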
Keeping Track
An agent can also hold information across steps and sessions. It remembers what it has already tried, what worked, what you prefer, what is still outstanding. This is not a personality trait. It is the system keeping relevant context available over time — the same way you rely on a notebook or a project board when you are working on something complex.
Stephen: So when the agent remembers my preferences, that’s like what you’ve done to write this article. You learnt my learning style, read my writing (including my writing about technical writing), absorbed my preferences, so you can adapt how you explain things. Is that roughly right?
That is exactly right. What an agent does with memory is not mysterious. It is practical. The system can store what you told it, what it observed, what it tried, and what the result was. When you come back, it can pick up where it left off. When it is working on a long task, it can hold the overall goal in view while handling the individual steps. That continuity is what turns a series of disconnected exchanges into something that feels like a sustained conversation.
You do not have to repeat yourself. The agent remembers. That is the difference — not just remembering facts, but maintaining a thread.
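One simple way to picture that continuity is a small store that outlives the session. The sketch below keeps preferences and past attempts in a JSON file. The AgentMemory class, its method names, and the file layout are hypothetical, and real systems may use quite different storage.

```python
import json
from pathlib import Path


class AgentMemory:
    """Hypothetical memory store that persists between sessions as JSON."""

    def __init__(self, path):
        self.path = Path(path)
        if self.path.exists():
            # Pick up where the previous session left off.
            self.data = json.loads(self.path.read_text(encoding="utf-8"))
        else:
            self.data = {"preferences": {}, "attempts": []}

    def remember_preference(self, key, value):
        self.data["preferences"][key] = value
        self._save()

    def record_attempt(self, action, outcome):
        self.data["attempts"].append({"action": action, "outcome": outcome})
        self._save()

    def _save(self):
        # Write after every change so a crash loses as little as possible.
        self.path.write_text(json.dumps(self.data, indent=2), encoding="utf-8")
```

A second AgentMemory opened on the same path sees the preferences and attempts recorded by the first, which is the notebook-like continuity described above.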
A Concrete Example
Let’s make this less abstract.
Imagine you ask for help planning a short research trip to a city you haven’t visited before. You need flights, accommodation, a sense of the neighbourhoods, and a rough itinerary that fits your schedule.
A chatbot might give you an excellent summary of the city, suggest some hotels, and recommend a few neighbourhoods. That’s genuinely useful. But then you have to: check whether those hotels are actually available on your dates, compare prices, figure out which neighbourhood works best for your meetings, build the itinerary around your existing calendar, and keep track of it all so you can adjust later.
An agent can take that same request and do something different.
It might:
- Check your calendar for available dates before suggesting anything
- Search for flights and filter by your preferred departure times
- Cross-reference hotel locations against the addresses you need to visit
- Build a day-by-day itinerary that accounts for travel time between meetings
- Save the whole plan somewhere you can find it and update it later
- Flag the gaps: “I found flights and hotels, but you haven’t told me whether you need a visa”
- Filter results by your stored preferences (early-morning flights, boutique hotels)
- Use your past travel patterns to anticipate the kind of trip that fits your style
The agent doesn’t just tell you about the city. It assembles a usable, integrated plan. It uses tools to search, compare, read your calendar, write the plan, and flag what is missing. It loops through those actions until the trip is actually planned or until it hits something it cannot resolve without your input.
That is the difference between an answer and an outcome.
The Key Transition
The shift to understanding agents is not about smarter models. It is about the move from isolated exchanges to sustained work.
A chatbot gives you an answer. An agent helps you reach an outcome. One is a single exchange; the other is a process. One is clever; the other is useful in a different way — not smarter, but more continuous.
The loop is what makes that continuity possible. Observe, think, act, check what happened, adjust, and continue until the work is done.
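The pattern in that last sentence can be sketched in a few lines of Python. This is a toy under stated assumptions, not a real agent: there is no language model here, and `run_loop`, `choose_action`, `apply_action`, and the number-reaching task are hypothetical stand-ins chosen only to show the shape of the loop.

```python
def run_loop(state, is_done, choose_action, apply_action, max_steps=10):
    """Repeat check -> think -> act until the goal is met or the step budget runs out."""
    history = []
    for _ in range(max_steps):
        if is_done(state):                    # check: is the job actually finished?
            break
        action = choose_action(state)         # think: pick the next step from what we observe
        state = apply_action(state, action)   # act: change the world (here, just a number)
        history.append((action, state))       # keep a record of what was tried and what happened
    return state, history, is_done(state)

# Toy task: get from 4 to 10 using only "+5" and "+1" steps.
TARGET = 10

def is_done(state):
    return state == TARGET

def choose_action(state):
    # Take the big step while it still fits, otherwise adjust with a small one.
    return "+5" if TARGET - state >= 5 else "+1"

def apply_action(state, action):
    return state + int(action)

final_state, history, done = run_loop(4, is_done, choose_action, apply_action)
print(final_state, history, done)  # 10 [('+5', 9), ('+1', 10)] True
```

The `max_steps` budget is a design choice worth noticing. A loop that checks its own work also needs a way to stop and hand back to you when it cannot finish, and the returned `done` flag is how this sketch reports that.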
Once you see that pattern, you start to notice it everywhere. A junior colleague troubleshooting a problem is running a loop: try something, see if it worked, try something else. A project manager steering a complex task is running a loop: check the status, identify what needs attention, act, and review.
The same pattern shows up well beyond work. A cook adjusts a recipe by tasting as they go. A teacher tries an explanation, sees whether the student understood, and tries a different approach if they didn’t.
Stephen: So, tell me if I got this right: The agent loop is mimicking how we, humans, work.
We understand the problem, get the relevant context, perform an action, look at the result of our action and then decide whether that solves our problem. If it doesn’t, we explore why, come up with a new plan, implement the new actions, and repeat the process.
It feels like the agent is going through the same process.
That is exactly right. The agent loop isn’t some exotic new form of intelligence. It’s a pattern humans use all day every day, made explicit and embedded in a system that can act. The insight isn’t that the AI has become more intelligent. It’s that the AI has gained the ability to persist, to use tools, and to continue — the same things that turn a one-off answer into real, completed work.
That is why this moment in AI feels different from previous ones. It is not that models suddenly became smarter. It is that they gained the ability to take action in a loop, over time, toward a goal.
What This Series Will Do
This series is for people like me (Stephen, not Priya!) who already understand how LLMs work, who have used chatbots like ChatGPT or Claude, and who are now hearing about “agents” and wondering what that actually means. I felt I was lagging in the AI world, so I started this learning process to make sure I’m not left behind!
We will move beyond the half-done feeling and into the architecture of agentic systems. We will look at what agents actually are, how they are structured, what tools and skills give them their power, how multiple agents can coordinate, and how to think about trust, evaluation, and oversight.
We will also look at the platforms and frameworks that exist today — what they offer, how they differ, and what tradeoffs you are choosing when you pick one. This is not a manual for any one platform. It is a guide to understanding the category itself, so you can make good choices about whether, where, and how to use agentic AI in your own work.
The field moves fast. Some tools that looked promising six months ago may be superseded by the time you read this. That is fine. The principles matter more than the products. If you understand the loop, the anatomy, and the tradeoffs, you can look at whatever the current landscape happens to be and know what you are seeing.
Here’s the draft Table of Contents for the Agents Unpacked series. This is likely to change as Priya and I progress through this project:
Part I — The Mental Shift
Chapter 1 — From Answer to Outcome (this post)
Chapter 2 — Anatomy of an Agent
Chapter 3 — Skills, Tools, and the Action Loop
Part II — Agents in Practice
Chapter 4 — Why One Agent Is Often Not Enough
Chapter 5 — Where Agents Are Actually Useful
Chapter 6 — Delegation Design
Part III — Building and Trusting Agentic Work
Chapter 7 — Designing Your First Agentic Workflow
Chapter 8 — When Things Go Wrong: Evaluation, Guardrails, and Trust
Chapter 9 — What to Build Next
<< Previous Post: Stephen’s Preface to Agents Unpacked
>> Next Post: Coming Soon
Table of Contents • Agents Unpacked
Here’s the planned Table of Contents. This is likely to change as Priya and I work on this:
Part I — The Mental Shift
Chapter 1 — From Answer to Outcome
Chapter 2 — Anatomy of an Agent
Chapter 3 — Skills, Tools, and the Action Loop
Part II — Agents in Practice
Chapter 4 — Why One Agent Is Often Not Enough
Chapter 5 — Where Agents Are Actually Useful
Chapter 6 — Delegation Design
Part III — Building and Trusting Agentic Work
Chapter 7 — Designing Your First Agentic Workflow
Chapter 8 — When Things Go Wrong: Evaluation, Guardrails, and Trust
Chapter 9 — What to Build Next
Stephen's Preface to Agents Unpacked
Like many, I started using chatbots when GPT whatever-version-it-was came out and took the world by storm. It was really not very good at the time (compared to today’s top-end chatbots), but it was clearly the start of something.
But things moved quickly, and I couldn’t quite catch up. I was busy with, you know, actual work, family, and life.
Then I started hearing lots of new terms, lots of new acronyms. I didn’t know what they were. I still don’t know what most of them are.
Then it was all about agents. I remember clearly thinking to myself: “Is this really any different from the ChatGPT-type chatbots?”
And here’s where this new series comes in. I decided to dive into agents and created a few. One of them is a learning tutor agent that I personalised to suit me.
I gave the agent all my tutorials and books. I gave the agent all the articles I wrote about my views on learning and technical writing.
I asked the agent to figure out from all this how I like to learn, how I like to communicate. I teach the way I like to learn, so it’s fine to put my teaching style in the mix. Then, I had a good long chat with my learning tutor agent to make sure we’re on the same page.
I gave my agent a name (I named all my agents!). My personalised tutor is Priya.
This series is a joint effort between Priya and me to help me understand agents. What are they? How do they work?
Yes, it’s AI-generated content. But...
It’s generated by an AI that’s extremely well-versed in my style of learning and communication.
I had an active role in steering and editing the content. Here’s how...
What you’ll read in the following articles was created using this process:
- Priya researched the topic following my brief and created a course outline.
- She drafted the first chapter.
- I read through the file, leaving comments and questions along the way, directly within the text.
- I marked some comments as private.
- I marked some comments as public.
- Priya revised the chapter by incorporating my comments and questions.
- She deleted the private comments from the text after making changes to address my comments.
- She kept my public comments and questions in the text, clearly marked as “Stephen’s questions”, and she answered them directly in the text.
I reviewed the chapter again and left more comments, and Priya revised the chapter again. We iterated through this until I was happy with the final text. And I was happy with the final text when I felt I understood everything in it and all my questions had been answered.
Priya then moved on to draft the second chapter, and the whole process started again.
I will post these chapters as they emerge from this process. This is how I’m learning this topic. Hopefully, they’ll help some other people, too.
Here’s the planned table of contents for this Agents Unpacked series. But note, this may change!
Stephen’s Preface to Agents Unpacked (this post)
Part I — The Mental Shift
Chapter 1 — From Answer to Outcome
Chapter 2 — Anatomy of an Agent
Chapter 3 — Skills, Tools, and the Action Loop
Part II — Agents in Practice
Chapter 4 — Why One Agent Is Often Not Enough
Chapter 5 — Where Agents Are Actually Useful
Chapter 6 — Delegation Design
Part III — Building and Trusting Agentic Work
Chapter 7 — Designing Your First Agentic Workflow
Chapter 8 — When Things Go Wrong: Evaluation, Guardrails, and Trust
Chapter 9 — What to Build Next









