Planet Python
Last update: May 20, 2026 09:44 PM UTC
May 20, 2026
Django Weblog
Django 6.1 alpha 1 released
Django 6.1 alpha 1 is now available. It represents the first stage in the 6.1 release cycle and is an opportunity to try out the changes coming in Django 6.1.
Django 6.1 offers a harmonious mélange of new features and usability improvements, which you can read about in the in-development 6.1 release notes.
This alpha milestone marks the feature freeze. The current release schedule calls for a beta release in about a month and a release candidate roughly a month after that. We'll only be able to keep this schedule with early and frequent testing from the community. Updates on the release schedule are available on the Django forum.
As with all alpha and beta packages, this release is not for production use. However, if you'd like to take some of the new features for a spin, or help find and fix bugs (which should be reported to the issue tracker), you can grab a copy of the alpha package from our downloads page or on PyPI.
The PGP key ID used for this release is Jacob Walls: 131403F4D16D8DC7
death and gravity
reader 3.24 released – help, multi-user updates
Hi there!
I'm happy to announce version 3.24 of reader, a Python feed reader library.
What's new? #
Here are the highlights since reader 3.23.
Context-sensitive help #
In lieu of a tutorial mode, the web app now offers guidance to new users, and has a basic context-sensitive help system. Here's some screenshots:
new user / empty state
context-sensitive help
also help
Structured logging #
reader now uses structured logging internally, through structlog.
By default, output goes to stdlib logging, but you can opt into structlog-native logging:
import reader, structlog
reader.enable_structlog()
structlog.configure(...)
This was relatively challenging to do, since as a library, you cannot configure logging, nor change any global state. I hope I can contribute a variant of the solution upstream, but meanwhile here's a recipe you can use in your library (warning: brittle code).
Make update_feeds() parallel again #
It turns out the "extensive rework of the parser internal API" from 3.15 caused update_feeds() to retrieve feeds in the main thread regardless of the worker count.
Protip
If you have a parallel map() that returns @contextmanagers,
make sure the work you need to do in parallel
doesn't happen in __enter__. 😅
New contributors #
Thank you to the new contributors that submitted pull requests to this release!
Want to contribute? Check out the docs and the roadmap.
Hosted reader status update #
As I said last time, I'm working on a hosted version of reader. Background: Why another feed reader web app?, Why not just self-host it?.
Multi-user feed updates #
One of the bigger changes for hosted reader was handling multi-user feed updates.
For intentional but questionable reasons, users have their own dedicated databases, with the web app routing to the appropriate one based on session information.
However, updating feeds should happen in a single, shared database; this allows:
- retrieving feeds once, not once per user
- per-host rate limiting
- preserving a longer history for public feeds
This is now done, complete with a design document (to be published). As a teaser, here's a neat architecture / data flow diagram:
... user@2.sqlite user nginx Flask auth app auth.sqlite user@1.sqlite public shared.sqlite feeds public private email yes, it's web scale ಠ_ಠOK, so what now? #
Since I'm rapidly running out of technical things to do, a launch is imminent.
This is what is finished so far:
- multi-user version of the web app
- authentication via email
- infrastructure deployments using pyinfra
- (new) multi-user feed updates
- (new) tutorial mode – context-sensitive help should do
Remaining work to an MVP:
- public demo
- landing page
- give it a good name
- launch announcement + roadmap
Meanwhile, if this sounds like something you'd like to use, get in touch.
That's it for now. For more details, see the full changelog.
Learned something new today? Share it with others, it really helps!
What is reader? #
reader takes care of the core functionality required by a feed reader, so you can focus on what makes yours different.
reader allows you to:
- retrieve, store, and manage Atom, RSS, and JSON feeds
- mark articles as read or important
- add arbitrary tags/metadata to feeds and articles
- filter feeds and articles
- full-text search articles
- get statistics on feed and user activity
- import / export feeds as OPML
- write plugins to extend its functionality
...all these with:
- a stable, clearly documented API
- excellent test coverage
- fully typed Python
To find out more, check out the GitHub repo and the docs, or give the tutorial a try.
Why use a feed reader library? #
Have you been unhappy with existing feed readers and wanted to make your own, but:
- never knew where to start?
- it seemed like too much work?
- you don't like writing backend code?
Are you already working with feedparser, but:
- want an easier way to store, filter, sort and search feeds and entries?
- want to get back type-annotated objects instead of dicts?
- want to restrict or deny file-system access?
- want to change the way feeds are retrieved by using Requests?
- want to also support JSON Feed?
- want to support custom information sources?
... while still supporting all the feed types feedparser does?
If you answered yes to any of the above, reader can help.
The reader philosophy #
- reader is a library
- reader is for the long term
- reader is extensible
- reader is stable (within reason)
- reader is simple to use; API matters
- reader features work well together
- reader is tested
- reader is documented
- reader has minimal dependencies
Real Python
How to Use the Claude API in Python
The fastest way to use the Claude API in Python is to install anthropic, set your API key, and call client.messages.create(). You’ll have a working response in under a minute:
Example of Using the Claude API in Python
Claude is Anthropic’s large language model, accessible via a clean REST API with an official Python SDK. Unlike heavier AI frameworks that require you to wire up multiple components before you see any output, the anthropic package gets you to a working response in a handful of lines.
In the following steps, you’ll install the anthropic SDK, call Claude from Python, shape Claude’s behavior with a system prompt, and then return structured JSON output using a schema or Pydantic.
Note: Claude’s responses are non-deterministic, so the same prompt produces different output each time, which is expected for a large language model. Also, API calls cost money based on the number of tokens processed. Keep an eye on your usage in the Claude Console as you follow along.
Each step builds on the last, and the final script is short enough to read in one sitting but complete enough to extend into a real application of your own.
Get Your Code: Click here to download the free sample code that shows you how to use the Claude API in Python.
Take the Quiz: Test your knowledge with our interactive “How to Use the Claude API in Python” quiz. You’ll receive a score upon completion to help you track your learning progress:
Interactive Quiz
How to Use the Claude API in PythonTest your understanding of using the Claude API in Python. Send prompts, set system instructions, and return structured JSON with a schema.
Prerequisites
Before diving in, make sure you have the following in place:
-
Python knowledge: You should be comfortable with Python basics, like defining functions, running scripts from the terminal, and working with virtual environments. If virtual environments are new to you, Python Virtual Environments: A Primer has you covered before you continue.
-
Python 3.9 or higher: The
anthropicSDK requires Python 3.9 as a minimum. If you’re not sure which version you have, runpython --versionin your terminal. If you need to install or upgrade, follow the steps in the guide on installing Python. -
An Anthropic account: You’ll need an Anthropic account to generate an API key in the Claude Console. Step 1 will show you how to find and secure your key once you’re in.
Don’t worry if you’ve never worked with an API before. This tutorial will walk you through authentication and help you make your first request from scratch.
Step 1: Set Up the Claude API in Python
Before you can call Claude from Python, you need an API key and the anthropic package installed. By the end of this step, you’ll have both, and Claude will be responding to your first prompt.
Get Your API Key and Install anthropic
Log in to the Claude Console or create a new account. If you’re starting fresh, you can begin using the API after adding $5 of credits.
Then navigate to the API Keys section. Click Create Key, give it a descriptive name like real-python-tutorial, and copy it immediately. You won’t see it again after you close the dialog.
Note: Never paste your API key directly into your code. Instead, store it as an environment variable. The anthropic SDK automatically reads it from ANTHROPIC_API_KEY at runtime, so you never need to reference it explicitly in your scripts.
Storing your key as an environment variable means it never touches your source code or version control history. The exact command depends on your operating system:
With your API key stored safely, you’re ready to install the SDK. Create a fresh virtual environment and activate it before installing anything. This isolation prevents the anthropic package from conflicting with your system-level tools.
Send Your First Prompt
Read the full article at https://realpython.com/claude-api-python/ »
[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]
Quiz: How to Use the Claude API in Python
In this quiz, you’ll test your knowledge of How to Use the Claude API in Python.
By working through this quiz, you’ll revisit how to install the anthropic SDK, send prompts to Claude with client.messages.create(), shape responses with a system parameter, and return structured JSON output using a schema or Pydantic.
[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]
Python GUIs
Adding QComboBox to a QTableView and getting/setting values after creation — Use QItemDelegate to embed combo boxes in your table views, with per-row data and value tracking
I'm using a QTableView to display data, and would like to limit the choices in some of the fields using a drop-down. I can use
QComboBoxto provide a list of choices in a normal UI, but how can I do that in a table view?
When you're working with QTableView in PyQt6, you'll sometimes want cells that offer a dropdown selection instead of plain text. A QComboBox is the natural fit here — but embedding one inside a table view takes a bit of wiring up.
In this tutorial, we'll walk through how to use a QItemDelegate to place a QComboBox into specific cells of a QTableView. We'll also cover how to populate each combo box with different items per row, and how to retrieve the selected value so you can use it elsewhere in your application.
How delegates work in Qt's Model/View framework
Qt's Model/View architecture separates your data (the model) from how it's displayed (the view). Between these two sits the delegate, which controls how individual cells are rendered and edited. When you want a cell to use a widget like a combo box instead of a plain text editor, you create a custom delegate.
The delegate has a few methods you'll override:
createEditor()— creates the widget (in our case, aQComboBox) when the user starts editing a cell.setEditorData()— populates the editor widget with the current data from the model.setModelData()— writes the user's selection back into the model.updateEditorGeometry()— makes sure the widget is sized and positioned correctly inside the cell.
Let's build this up step by step.
Setting up the model and view
First, let's create a simple application with a QTableView and a QStandardItemModel. Each row will represent a software package, and one of the columns will hold a list of available versions. We'll store those version lists directly in the model data, so each row can have its own set of options.
import sys
from PyQt6.QtWidgets import (
QApplication, QMainWindow, QTableView, QComboBox, QItemDelegate,
)
from PyQt6.QtGui import QStandardItemModel, QStandardItem
from PyQt6.QtCore import Qt, QItemDataRole
class MainWindow(QMainWindow):
def __init__(self):
super().__init__()
self.setWindowTitle("QComboBox in QTableView")
self.table = QTableView()
self.setCentralWidget(self.table)
# Create a model with 3 rows and 2 columns.
self.model = QStandardItemModel(3, 2)
self.model.setHorizontalHeaderLabels(["Package", "Version"])
# Each row has a package name and a list of available versions.
packages = [
("Widget Library", ["1.0", "1.1", "2.0", "2.1"]),
("Data Toolkit", ["0.9", "1.0"]),
("Render Engine", ["3.0", "3.1", "3.2", "4.0"]),
]
for row, (name, versions) in enumerate(packages):
# Column 0: package name (plain text).
self.model.setItem(row, 0, QStandardItem(name))
# Column 1: store the version list in the item's data.
# We use Qt.ItemDataRole.UserRole to keep the full list alongside the display text.
item = QStandardItem(versions[-1]) # Display the latest version by default.
item.setData(versions, Qt.ItemDataRole.UserRole)
self.model.setItem(row, 1, item)
self.table.setModel(self.model)
# Apply our custom delegate to column 1.
delegate = ComboDelegate(self.table)
self.table.setItemDelegateForColumn(1, delegate)
self.resize(400, 200)
Notice how we store the list of versions using Qt.ItemDataRole.UserRole. This is a custom data role — it lets us attach extra information to a model item without interfering with the text that's displayed (which uses Qt.ItemDataRole.DisplayRole). Each row gets its own version list, so when the combo box opens, it will show only the versions relevant to that row.
Creating the combo box delegate
Now let's write the ComboDelegate class. This is where the combo box gets created and connected to the model.
class ComboDelegate(QItemDelegate):
"""
A delegate that places a QComboBox in cells of the assigned column.
"""
def createEditor(self, parent, option, index):
# Create the combo box and populate it with the version list for this row.
combo = QComboBox(parent)
versions = index.data(Qt.ItemDataRole.UserRole)
if versions:
combo.addItems(versions)
return combo
def setEditorData(self, editor, index):
# Set the combo box to show the currently selected value.
current_text = index.data(Qt.ItemDataRole.DisplayRole)
idx = editor.findText(current_text)
if idx >= 0:
editor.setCurrentIndex(idx)
def setModelData(self, editor, model, index):
# Write the selected value back into the model.
model.setData(index, editor.currentText(), Qt.ItemDataRole.DisplayRole)
def updateEditorGeometry(self, editor, option, index):
editor.setGeometry(option.rect)
Let's walk through each method:
createEditor() is called when the user double-clicks (or otherwise activates) a cell in column 1. We create a fresh QComboBox, pull the version list from Qt.ItemDataRole.UserRole for that specific row, and add those items to the combo box. Because each row stores its own list, different rows will show different options.
setEditorData() makes sure the combo box starts with the right item selected. We read the current display text from the model and find the matching entry in the combo box.
setModelData() fires when the user finishes editing (for example, by clicking away from the cell). It takes whatever the user selected in the combo box and writes it back into the model's DisplayRole.
updateEditorGeometry() simply ensures the combo box fills the cell neatly.
Running the application
Add the standard entry point at the bottom of your script:
app = QApplication(sys.argv)
window = MainWindow()
window.show()
sys.exit(app.exec())
Run the script and double-click any cell in the "Version" column. You'll see a combo box appear with the version options for that specific row. Select a value, click away, and the cell updates.

Getting the selected value
After the user makes a selection, the value is stored in the model. You can read it at any time:
# Read the selected version for row 0.
selected = self.model.item(0, 1).text()
print(f"Row 0 selected version: {selected}")
If you want to react immediately when a selection changes, you can connect to the model's dataChanged signal. If you're new to how signals work in Qt, see our guide on signals, slots and events:
self.model.dataChanged.connect(self.on_data_changed)
def on_data_changed(self, top_left, bottom_right, roles):
if top_left.column() == 1:
row = top_left.row()
value = top_left.data(Qt.ItemDataRole.DisplayRole)
print(f"Row {row} version changed to: {value}")
This approach keeps things nicely separate — you're working through the model rather than trying to hold references to individual combo box widgets. The combo boxes are created and destroyed as the user interacts with cells.
Setting a value programmatically
To change a cell's value from code, update the model directly:
# Set row 2's version to "3.1".
self.model.item(2, 1).setText("3.1")
The next time the user opens the combo box on that row, the delegate's setEditorData() will position the combo box on "3.1".
You can also update the list of available versions for a row:
# Add a new version to row 1's options.
item = self.model.item(1, 1)
versions = item.data(Qt.ItemDataRole.UserRole)
versions.append("1.1")
item.setData(versions, Qt.ItemDataRole.UserRole)
Why each row gets its own combo box items
A common stumbling block is ending up with the same items in every combo box across the column. This happens when you store the item list on the delegate itself (as a single shared list) rather than on the model. Since the delegate is shared across all rows, any list stored on it will be the same everywhere.
The solution, as we've done here, is to store per-row data in the model using Qt.ItemDataRole.UserRole. Each call to createEditor() reads from the specific index it's given, so each row naturally gets its own set of options. This is a pattern you'll use often when different rows need different editor configurations.
Complete code
Here's the full working example in one block:
import sys
from PyQt6.QtWidgets import (
QApplication, QMainWindow, QTableView, QComboBox, QItemDelegate,
)
from PyQt6.QtGui import QStandardItemModel, QStandardItem
from PyQt6.QtCore import Qt
class ComboDelegate(QItemDelegate):
"""
A delegate that places a QComboBox in cells of the assigned column.
"""
def createEditor(self, parent, option, index):
combo = QComboBox(parent)
versions = index.data(Qt.ItemDataRole.UserRole)
if versions:
combo.addItems(versions)
return combo
def setEditorData(self, editor, index):
current_text = index.data(Qt.ItemDataRole.DisplayRole)
idx = editor.findText(current_text)
if idx >= 0:
editor.setCurrentIndex(idx)
def setModelData(self, editor, model, index):
model.setData(index, editor.currentText(), Qt.ItemDataRole.DisplayRole)
def updateEditorGeometry(self, editor, option, index):
editor.setGeometry(option.rect)
class MainWindow(QMainWindow):
def __init__(self):
super().__init__()
self.setWindowTitle("QComboBox in QTableView")
self.table = QTableView()
self.setCentralWidget(self.table)
self.model = QStandardItemModel(3, 2)
self.model.setHorizontalHeaderLabels(["Package", "Version"])
packages = [
("Widget Library", ["1.0", "1.1", "2.0", "2.1"]),
("Data Toolkit", ["0.9", "1.0"]),
("Render Engine", ["3.0", "3.1", "3.2", "4.0"]),
]
for row, (name, versions) in enumerate(packages):
self.model.setItem(row, 0, QStandardItem(name))
item = QStandardItem(versions[-1])
item.setData(versions, Qt.ItemDataRole.UserRole)
self.model.setItem(row, 1, item)
self.table.setModel(self.model)
delegate = ComboDelegate(self.table)
self.table.setItemDelegateForColumn(1, delegate)
# React to changes.
self.model.dataChanged.connect(self.on_data_changed)
self.resize(400, 200)
def on_data_changed(self, top_left, bottom_right, roles):
if top_left.column() == 1:
row = top_left.row()
value = top_left.data(Qt.ItemDataRole.DisplayRole)
print(f"Row {row} version changed to: {value}")
app = QApplication(sys.argv)
window = MainWindow()
window.show()
sys.exit(app.exec())
Wrapping up
Using a custom QItemDelegate gives you full control over how cells in a QTableView are edited. By storing per-row data in the model with Qt.ItemDataRole.UserRole, you can give each combo box its own set of items — solving the common problem of all combo boxes showing the same options.
The pattern here — store data in the model, read it in the delegate, write changes back to the model — works well beyond combo boxes. You can use the same approach to embed spin boxes, date pickers, or any other widget into your table cells. Once you're comfortable with this flow, you'll find Qt's Model/View framework surprisingly flexible. For a deeper dive into using QTableView with real-world data sources like NumPy and Pandas, see our QTableView with numpy and pandas tutorial. You can also explore how to make table cells editable for other common editing patterns.
For an in-depth guide to building Python GUIs with PyQt6 see my book, Create GUI Applications with Python & Qt6.
May 19, 2026
Kay Hayen
Nuitka Release 4.1
This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler, “download now”.
This release adds many new features and corrections with a focus on async code compatibility, missing generics features, and Python 3.14 compatibility and Python compilation scalability yet again.
Bug Fixes
Python 3.14: Fix, decorators were breaking when disabling deferred annotations. (Fixed in 4.0.1 already.)
Fix, nested loops could have wrong traces lead to mis-optimization. (Fixed in 4.0.1 already.)
Plugins: Fix, run-time check of package configuration was incorrect. (Fixed in 4.0.1 already.)
Compatibility: Fix,
__builtins__lacked necessary compatibility in compiled functions. (Fixed in 4.0.1 already.)Distutils: Fix, incorrect UTF-8 decoding was used for TOML input file parsing. (Fixed in 4.0.1 already.)
Fix, multiple hard value assignments could cause compile time crashes. (Fixed in 4.0.1 already.)
Fix, string concatenation was not properly annotating exception exits. (Fixed in 4.0.2 already.)
Windows: Fix,
--verbose-outputand--show-modules-outputdid not work with forward slashes. (Fixed in 4.0.2 already.)Python 3.14: Fix, there were various compatibility issues including dictionary watchers and inline values. (Fixed in 4.0.2 already.)
Python 3.14: Fix, stack pointer initialization to
localspluswas incorrect to avoid garbage collection issues. (Fixed in 4.0.2 already.)Python 3.12+: Fix, generic type variable scoping in classes was incorrect. (Fixed in 4.0.2 already.)
Python 3.12+: Fix, there were various issues with function generics. (Fixed in 4.0.2 already.)
Python 3.8+: Fix, names in named expressions were not mangled. (Fixed in 4.0.2 already.)
Plugins: Fix, module checksums were not robust against quoting style of module-name entry in YAML configurations. (Fixed in 4.0.2 already.)
Plugins: Fix, doing imports in queried expressions caused corruption. (Fixed in 4.0.2 already.)
UI: Fix, support for
uv_buildin the--projectoption was broken. (Fixed in 4.0.2 already.)Compatibility: Fix, names assigned in assignment expressions were not mangled. (Fixed in 4.0.2 already.)
Python 3.12+: Fix, there were still various issues with function generics. (Fixed in 4.0.3 already.)
Clang: Fix, debug mode was disabled for clang generally, but only ClangCL and macOS Clang didn’t want it. (Fixed in 4.0.3 already.)
Zig: Fix,
--windows-console-mode=attach|disablewas not working when using Zig. (Fixed in 4.0.3 already.)macOS: Fix, yet another way self dependencies can look like, needed to have support added. (Fixed in 4.0.3 already.)
Python 3.12+: Fix, generic types in classes had bugs with multiple type variables. (Fixed in 4.0.3 already.)
Scons: Fix, repeated builds were not producing binary identical results. (Fixed in 4.0.3 already.)
Scons: Fix, compiling with newer Python versions did not fall back to Zig when the developer prompt MSVC was unusable, and error reporting could crash. (Fixed in 4.0.4 already.)
Zig: Fix, the workaround for Windows console mode
attachordisablewas incorrectly applied on non-Windows platforms. (Fixed in 4.0.4 already.)Standalone: Fix, linking with Python Build Standalone failed because
libHacl_Hash_SHA2was not filtered out unconditionally. (Fixed in 4.0.4 already.)Python 3.6+: Fix, exceptions like
CancelledErrorthrown into an async generator awaiting an inner awaitable could be swallowed, causing crashes. (Fixed in 4.0.4 already.)Fix, not all ordered set modules accepted generators for update. (Fixed in 4.0.5 already.)
Plugins: Disabled warning about rebuilding the
pytokensextension module. (Fixed in 4.0.5 already.)Standalone: Filtered
libHacl_Hash_SHA2from link libs unconditionally. (Fixed in 4.0.5 already.)Debugging: Disabled unusable unicode consistency checks for Python versions 3.4 to 3.6. (Fixed in 4.0.5 already.)
Python3.12+ Avoided cloning call nodes on class level which caused issues with generic functions in combination with decorators. (Added in 4.0.5 already.)
Python 3.12+: Added support for generic type variables in
async deffunctions. (Added in 4.0.5 already.)UI: Fix, flushing outputs for prompts was not working in all cases when progress bars were enabled. (Fixed in 4.0.6 already.)
UI: Fix, unused variable warnings were missing at C compile time when using
zigas a C compiler. (Fixed in 4.0.6 already.)Scons: Fix, forced stdout and stderr paths as a feature was broken. (Fixed in 4.0.6 already.)
Fix, replacing a branch did not accurately track shared active variables causing optimization crashes. (Fixed in 4.0.7 already.)
macOS: Fix, failed to remove extended attributes because files need to be made writable first. (Fixed in 4.0.7 already.)
Fix, dict
popandsetdefaultusing with:=rewrites lacked exception-exit annotations for un-hashable keys. (Fixed in 4.0.8 already.)Python 3.13: Fix, the
__parameters__attribute of generic classes was not working. (Fixed in 4.0.8 already.)Python 3.11+: Fix, starred arguments were not working as type variables. (Fixed in 4.0.8 already.)
Python2: Fix,
FileNotFoundErrorcompatibility fallback handling was not working properly. (Fixed in 4.0.8 already.)Compatibility: Fix, loop ownership check in value traces was missing, causing issues with nested loops.
Windows: Improved
--windows-console-mode=attachto properly handle console handles, enabling cases likeos.systemto work nicely.Python2: Fix, there was a compatibility issue where providing default values to the
mkdtempfunction was failing.Windows: Fix, there were spurious issues with C23 embedding in 32-bit MinGW64 by switching to
coff_objresource mode for it as well.Plugins: Fix, the
post-import-codeexecution could fail because the triggering sub-package was not yet available insys.modules.UI: Fix, listing package DLLs with
--list-package-dllswas broken due to recent plugin lifecycle changes.UI: Fix,
--list-package-exewas not working properly on non-Windows platforms failing to detect executable files correctly.UI: Handled paths starting with
{PROGRAM_DIR}the same as a relative path when parsing the--onefile-tempdir-specoption.Plugins: Followed multiprocessing
forkserverchanges for newer Python versions.Python 3.12+: Fix, generic class type parameters handling was incorrect.
Python 3.12: Fix, deferred evaluation of type aliases was failing.
Python 3.12+: Aligned
sumbuilt-in float summation with CPython’s compensated sum for better accuracy.Python 3.10+: Fix, uncompiled coroutine
throw()return handling was incorrect, restoring completed coroutine results viaStopIteration.valuerather than exposing them as ordinary return values to the outer await chain.Python 3.13+: Fix, uncompiled coroutine
cancel()/awaitsuspension handling was incorrect, improved to ensure integration compatibility.macOS: Made finding
create-dmgmore robustly by also checking the Homebrew path for Intel and fromPATHproperly.Compatibility: Fix, class frames were not exposing frame locals.
UI: Detected
static-libpythonproblems, which affected some forms of Anaconda.Distutils: Rejected
--projectmixed with--mainarguments as it is not useful.macOS: Fix,
zigfromPATHor fromziglangwas not being used.Distutils: Fix, the wrong
module-rootconfig value was being checked foruvbuild backend.macOS: Fix, was attempting to change removed (rejected) DLLs, which of course failed and errored out.
Python 3.14: Fix, tuple reuse was not fully compatible, potentially causing crashes due to outdated hash caches.
Fix, fake modules were still being attempted to located when imported by other code, which could conflict with existing modules.
Python 3.5+: Fix, failed to send uncompiled coroutines the sent in value in
yield from.Fix, older
gcccompilers lacking newer intrinsic methods had compilation issues that needed to be addressed.Standalone: Fix, multiphase module extension modules with post-load code were not working properly.
Fix, Avoid using the non-inline copy of
pkg_resourceswith the inline copy of Jinja2. These could mismatch and cause errors.Fix, loops could make releasing of previous values very unclear, causing optimization errors.
Fix,
incbinresource mode was not working with oldgccC++ fallback.Python 3.4 to 3.6: Fix, bytecode demotion was not working properly for these versions, also bytecode only files not working.
Plugins: Added a check for the broken
patchelfversions 0.10 and 0.11 to prevent breaking Qt plugins.Android: Allowed
patchelfversion 0.18 on Android.Windows: Fix, the header path for self uninstalled Python was not detected correctly.
Release: Fix, inclusion of the
pkg_resourcesinline copy for Python 2 to source distributions was missing.UI: Detected the OBS versions of SUSE Linux better.
Suse: Allowed using
patchelf0.18.0 there too.Python 3.11: Fix, package and module dicts were not aligned close enough to avoid a CPython bug.
Fix, unbound compiled methods could crash when called without an object passed.
Standalone: Fix, multiphase module extension modules with postload. (Fixed in 4.0.8 already.)
Onefile: Fix, while waiting for the child, it may already be terminated.
macOS: Removed existing absolute rpaths for Homebrew and MacPorts.
Python 3.14: Avoided warning in CPython headers.
Python 3.14: Followed allocator changes more closely.
Compatibility: Avoided using
pkg_resourcesfor Jinja2 template location for loading.No-GIL: Applied some bug fixes to get basic things to work.
Package Support
Standalone: Add support for newer
paddleversion. (Added in 4.0.1 already.)Standalone: Add workaround for refcount checks of
pandas. (Fixed in 4.0.1 already.)Standalone: Add support for newer
h5pyversion. (Added in 4.0.2 already.)Standalone: Add support for newer
scipypackage. (Added in 4.0.2 already.)Plugins: Revert accidental
os.getenvoveros.environ.getchanges in anti-bloat configurations that stopped them from working. Affected packages arenetworkx,persistent, andtensorflow. (Fixed in 4.0.5 already.)Standalone: Added missing DLLs for
openvino. (Added in 4.0.7 already.)Enhanced the package configuration YAML schema by adding the
relative_toparameter forfrom_filenamesDLL specification, avoiding error-prone purely relative paths.Standalone: Fix,
flet_desktopapp assets were missing, now preserving the packaged runtime and sidecar DLLs.Standalone: Added support for the
tyropackage.Standalone: Added data files for the
perfettopackage.Standalone: Added support for
anyioprocess forking.Standalone: Added support for the
plotly.graphpackage.Anaconda: Fix, dependencies for the
numpyconda package on Windows were incorrect.Plugins: Enhanced the auto-icon hack in PySide6 to use compatible class names.
Standalone: Fix, Qt libraries were duplicated with
PySide6WebEngine framework support on macOS.Plugins: Fix, automatic detection of
mypycruntime dependencies was including all top level modules of the containing package by accident. (Fixed in 4.0.5 already.)Anaconda: Fix,
delvewheelplugin was not working with Python 3.8+. This enhances compatibility with installed PyPI packages that use it for their DLLs. (Fixed in 4.0.6 already.)Plugins: Fix, our protection workaround could confuse methods used with
PySide6.
New Features
UI: Added the
--recommended-python-versionoption to display recommended Python versions for supported, working, or commercial usage.UI: Add message to inform users about
Nuitka[onefile]if compression is not installed. (Added in 4.0.1 already.)UI: Add support for
uv_buildin the--projectoption. (Added in 4.0.1 already.)Onefile: Allow extra includes as well. (Added in 4.0.2 already.)
UI: Add
nuitka-project-setfeature to define project variables, checking for collisions with reserved runtime variables. (Added in 4.0.2 already.)Scons: Added new option to select
--reproduciblebuilds or not. (Added in 4.0.6 already.)Python 3.10+: Added support for
importlib.metadata.package_distributions(). (Added in 4.0.8 already.)Plugins: Added support for the multiprocessing
forkservercontext. (Added in 4.0.8 already, for 4.1 Python 3.6 and earlier, as well as 3.14 support were added too.)Reports: Added structured resource usage (
rusage) performance information to compilation reports.Reports: Included individual module-level C compiler caching (
ccache/clcache) statistics in compilation reports.Added support for detecting and correctly resolving the Python prefix for the
PyEnv on HomebrewPython flavor.macOS: Added support for
rusageinformation for Scons.UI: Added the
__compiled__.extension_filenameattribute to give the real filename of the containing extension module.Windows: Added support for
--clangor ARM. (Added in 4.0.8 already.)Windows: Added support for resources names as not just integers, important when we copy them from template files.
MacPorts: Added basic support for this Python flavor. More work will be needed to get it to work fully though.
Optimization
Avoid including
importlib._bootstrapandimportlib._bootstrap_external. (Added in 4.0.1 already.)Linux: Cached the
syscallused for time keeping during compilation to avoid loadinglibcfor each trace. (Added in 4.0.8 already.)UI: Output a warning for modules that remain unfinished after the third optimization pass.
Added an extra micro pass trigger when new variables are introduced or variable usage changes severely, ensuring optimizations are fully propagated, avoiding unnecessary extra full passes.
Provided scripts to compile Python statically with PGO tailored for Nuitka on Linux, Windows, and macOS.
Added support for running the Data Composer tool from a compiled Nuitka binary without spawning an uncompiled Python process.
Enhanced the usage of
vectorcallforPyCFunctionobjects by directly checking for its presence instead of relying purely on flags, allowing more frequent use of this faster execution path.Cached frequently used declarations for top-level variables to speed up C code generation.
Sped up trace collection merging by avoiding unnecessary set creation and using a set instead of a list for escaped traces.
Optimized plugin hook execution by tracking overloaded methods and added an option to show plugin usage statistics.
Improved performance of module location by avoiding unnecessary module name reconstruction and redundant filesystem checks for pre-loaded packages.
Improved the caching of distribution name lookups to effectively avoid repeated IO operations across all package types.
Plugins: Cached callback plugin dispatch for
onFunctionBodyParsingandonClassBodyParsingto skip argument computation when no plugin overrides them.Python 3.13: Handled sub-packages of
pathlibas hard modules.Handled hard attributes through merge traces as well.
Made constant blobs more compact by avoiding repeated identifiers and unnecessary fields.
Enhanced Python compilation scripts further. (Fixed in 4.0.8 already.)
Recognized late incomplete variables better. (Fixed in 4.0.8 already.)
Made constant blobs more compact. (Fixed in 4.0.8 already.)
Optimized calls with only constant keywords and variable posargs too.
Anti-Bloat
Fix, memory bloat occurred when C compiling
sqlalchemy. (Fixed in 4.0.2 already.)Avoid using
pydocinPySimpleGUI. (Added in 4.0.2 already.)Avoided using
doctestfromzodbpickle. (Added in 4.0.5 already.)Avoided inclusion of
cythonwhen usingpyav. (Added in 4.0.7 already.)Avoided including
typing_extensionswhen usingnumpy. (Added in 4.0.7 already.)
Organizational
UI: Relocated the warning about the available source code of extension modules to be evaluated at a more appropriate time.
Debian: Remove recommendation for
libfuse2package as it is no longer useful.Debian: Used
platformdirsinstead ofappdirs.Debugging: Removed Python 3.11+ restriction for
clang-formatas it is available everywhere, even Python 2.7, and we still want nicely formatted code when we read things. (Added in 4.0.6 already.)Removed no longer useful inline copy of
wax_off. We have our own stubs generator project.Release: Added missing package to the CI container for building Nuitka Debian packages.
Developer: Updated AI instructions for creating Minimal Reproducible Examples (MRE) to skip unneeded C compilation.
Debugging: Added an internal function for checking if a string is a valid Python identifier.
AI: Added a task in Visual Studio Code to export the currently selected Python interpreter path to a file, making it available as “python” and “pip” matching the selected interpreter. This makes it easier to use a specific version with no instructions needed.
AI: Updated the rules to instruct AI to only generate useful comments that add context not present in the code.
Containers: Added template rendering support for Jinja2 (
.j2) container files in our internal Podman tools.Projects: Clarified the current status and rationale of Python 2.6 support in the developer manual.
Debugging: Added experimental flag
--experimental=ignore-extra-micro-passto allow ignoring extra micro pass detection.Visual Code: Added integration scripts for
bashandzshautocompletion of Nuitka CLI options. These are now also integrated into Visual Studio Code terminal profiles and the Debian package.RPM: Included the Python compile script for Linux.
RPM: Removed the requirement for
distutilsin the spec.
Tests
Install only necessary build tools for test cases.
Avoided spurious failures in reference counting tests due to Python internal caching differences. (Fixed in 4.0.3 already.)
Fix, the parsing of the compilation report for reflected tests was incorrect.
Python 3.14: Ignored a syntax error message change.
Python 3.14: Added test execution support options to the main test runner to use this version as well.
Fix, the runner binary path was mishandled for the third pass of reflected compilations.
Removed the usage of obsolete plugins in reflected compilation tests.
Debugging: Prevented boolean testing of
namedtuplesto avoid unexpected bugs.Added the
Testsuffix to syntax test files and disabled “python” mode and spell checking for them to resolve issues reported in IDEs.Fix, newline handling in diff outputs from the output comparison tool was incorrect.
Covered
post-import-codefunctionality with a new subpackage test case.Prevented the program test suite from running an unnecessary variant to save execution time.
macOS: Ignored differences from GUI framework error traces in headless runs in output comparisons.
Reflected test for Nuitka, where it compiles itself and compares its operation has been restored to functional state.
Used the new method to clear internal caches if available for reference counts.
Disabled running nested loops test with Python 2.6.
Containers: Detected Python 2 defaulting containers in Podman tooling.
Cleanups
UI: Fix, there was a double space in the Windows Runtime DLLs inclusion message. (Fixed in 4.0.1 already.)
Onefile: Separated files and defines for extra includes for onefile boot and Python build.
Scons: Provided nicer errors in case of “unset” variables being used, so we can tell it.
Refactored the process execution results to correctly utilize our
namedtuplesvariant, that makes it easier to understand what code does with the results.Quality: Enabled automatic conversion of em-dashes and en-dashes in code comments to the autoformat tool. AI won’t stop producing them and they can cause
SyntaxErrorfor older Python versions, nor is unnecessarily using UTF-8 welcome.Ensured that cloned outline nodes are assigned their correct names immediately upon creation, that avoids inconsistencies during their creation.
Quality: Updated to the latest versions of
blackand adopted a fasterisortexecution by caching results.Quality: Modified the PyLint wrapper to exit gracefully instead of raising an error when no matching files require checking.
Quality: Avoided checking YAML package configuration files twice, since autoformat already handles them.
Quality: Ensured that YAML package configuration checks output the original filename instead of the temporary one when a failure occurs.
Quality: Prevented pushing of tags from triggering git pre-push quality checks.
Quality: Silenced the output of
optipngandjpegoptimduring image optimization auto-formatting.Visual Code: Added the generated Python alias path file to the ignore list.
Quality: Enabled auto-formatting for the Nuitka devcontainer configuration file.
Watch: Avoided absolute paths in compilation to make reports more comparable across machines.
Quality: Changed
mdformatchecks to run only once and silently.Scons: Disabled format security errors in debug mode and moved Python-related warning disables into common build setup code.
Quality: Updated to the latest
deepdiffversion.Scons: Avoided MSVC telemetry since it can produce outputs that break CI.
Debugging: Enhanced non-deployment handler for importing excluded modules.
Split import module finding functionality into more pieces for enhanced readability.
Debugging: Added more assertions for constants loading and checking.
macOS: Dropped the
universaltarget arch.Debugging: Added more traces for deep hash verification.
Summary
This release builds on the scalability improvements established in 4.0, with enhanced Python 3.14 support, expanded package compatibility, and significant optimization work.
The --project option seems usable now.
Python 3.14 support remains experimental, but only barely made the cut, and probably will get there in hotfixes. Some of the corrections came in so late before the release, that it was just not possible to feel good about declaring it fully supported just yet.
PyCoder’s Weekly
Issue #735: Agentic Architecture, Python is Weird, 3.15, and More (2026-05-19)
#735 – MAY 19, 2026
View in Browser »
Agentic Architecture: Why Files Aren’t Always Enough
What are the limitations of using a file-based agent workflow? Why do massive context windows tend to collapse? This week on the show, Mikiko Bazeley from MongoDB joins us to discuss agentic architecture and context engineering.
REAL PYTHON podcast
Python Is Weird
Here is a collection of things that surprised Maciej about Python. Some you might know and some that might surprise you too.
MACIEJ KOWALSKI
Harness Orchestration: The Next Primitive for AI Agents
A Python SDK that lets you compose Claude Code, Codex, and Gemini as one autonomous harness - agents become FastAPI-style routes you can wire, version, and deploy. Open source. Fork SWE-AF (a 100+ agent software factory) or our cloud-security harness as starter kits. Clone a Recipe →
AGENTFIELD sponsor
Python 3.15: Features That Didn’t Make the Headlines
Every release there are changes that don’t make the headlines, here are a few in the upcoming Python 3.15 release
CHANGS.CO.UK • Shared by Jamie Chang
DjangoCon US 2026 Tickets Available
DJANGOCON.US • Shared by Aayush Gauba
Articles & Tutorials
PyCon US 2026 Typing Summit Recap
Per-talk notes from the PyCon US 2026 Typing Summit. Includes info on: Pyrefly and AI agents, ty constraint sets, Lean formalization, tensor shape types, intersection types, PEP 827, Guido on the direction of typing, and the Typing Council Q&A.
BERNÁT GÁBOR
Event Sourcing Design Pattern
Talk Python interviews Chris May and they discuss the event sourcing design pattern: a mechanism for databases to work like git with immutable, replayable events. Learn what libraries help you do this in Python and when to use the pattern.
TALK PYTHON podcast
Strategic Planning at the PSF
The Python Software Foundation Board has been developing a strategic plan to guide the foundation’s direction over the next five years. This post describes the process and future goals.
PYTHON SOFTWARE FOUNDATION
How Python’s GIL Actually Works (And When It Bites You)
This post explains how Python’s GIL limits the amount of concurrency you can get through threading alone, why it is there, and how it is changing as Python evolves.
ATHREYA AKA MANESHWAR
Concurrency: A Deep Dive Into Multithreading With Python
“This article explains concurrency in Python including topics like multithreading, multiprocessing, race conditions, and synchronization mechanisms such as locks.”
NIKOS VAGGALIS
Shipping Django as a Desktop App
This is a summary of Jochen Wersdörfer’s talk at DjangoCon EU where he outlined how his team used Electron to turn a Django project into an installable app.
REINOUT VAN REES
Pydantic Forks httpx
The Pydantic team has forked httpx and named it httpx2. The folks who created httpxyz have decided to let the larger organization take the reins.
MICHIEL BEIJEN
How to Flatten a List of Lists in Python
Learn how to flatten a list of lists in Python using for loops, list comprehensions, itertools, functools, NumPy, and recursion.
REAL PYTHON
Building Type-Safe LLM Agents With Pydantic AI
Build type-safe LLM agents in Python with Pydantic AI using structured outputs, function calling, and dependency injection.
REAL PYTHON course
Pyrefly v1.0 Is Here!
Pyrefly has reached stable version 1.0 status, read about the new features and how to get started.
PYREFLY.ORG
Projects & Code
Events
PyData Bristol Meetup
May 21, 2026
MEETUP.COM
PyLadies Dublin
May 21, 2026
PYLADIES.COM
Python Sheffield
May 26, 2026
GOOGLE.COM
PyCon Italia 2026
May 27 to May 31, 2026
PYCON.IT
Python Southwest Florida (PySWFL)
May 27, 2026
MEETUP.COM
Happy Pythoning!
This was PyCoder’s Weekly Issue #735.
View in Browser »
[ Subscribe to 🐍 PyCoder’s Weekly 💌 – Get the best Python news, articles, and tutorials delivered to your inbox once a week >> Click here to learn more ]
Real Python
Tapping Into the Zen of Python
The Zen of Python is a collection of 19 aphorisms that capture the guiding principles behind Python’s design. You can display them anytime by running import this in a Python REPL. Tim Peters wrote them in 1999 as a joke, but they became an iconic part of Python culture that was even formalized as PEP 20.
By the end of this video course, you’ll understand:
- The Zen of Python is a humorous poem of 19 aphorisms describing Python’s design philosophy
- Running
import thisin a Python interpreter displays the complete text of the Zen of Python - Tim Peters wrote the Zen of Python in 1999 as a tongue-in-cheek comment on a mailing list
- The aphorisms are guidelines, not strict rules, and some intentionally contradict each other
- The principles promote readability, simplicity, and explicitness while acknowledging that practicality matters
Experienced Pythonistas often refer to the Zen of Python as a source of wisdom and guidance, especially when they want to settle an argument about certain design decisions in a piece of code. In this video course, you’ll explore the origins of the Zen of Python, learn how to interpret its mysterious aphorisms, and discover the Easter eggs hidden within it.
You don’t need to be a Python master to understand the Zen of Python! But you do need to answer an important question: What exactly is the Zen of Python?
[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]
Quiz: Absolute vs Relative Imports in Python
In this quiz, you’ll test your understanding of Absolute vs Relative Imports in Python.
By working through this quiz, you’ll revisit how Python’s import system resolves modules, the differences between absolute and relative imports, and the PEP 8 conventions for styling import statements.
[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]
Quiz: Tapping Into the Zen of Python
In this quiz, you’ll test your understanding of Tapping Into the Zen of Python.
By working through this quiz, you’ll revisit the origins of the poem, the meaning of several aphorisms, and the inside jokes hidden throughout.
The questions explore how the principles apply in practice and when it’s okay to bend the rules in the name of practicality.
[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]
PyCharm
LLM Evaluation and AI Observability for Agent Monitoring
This is a guest post from Naa Ashiorkor, a data scientist and tech community builder.
Artificial intelligence keeps evolving at a rapid pace. The latest major application of AI, specifically of LLMs, is AI agents. These are systems that use their perception of their environment, processes, and input to take action to achieve specific goals, and they are built on LLMs.
Increasingly, complex AI agents are being used in real-world applications. While simpler agentic applications that use only one agent to achieve a goal still exist, organizations are now shifting towards multi-agent systems that use multiple subagents coordinated by a main agent. These are more adaptable and can mimic human teams when it comes to performing specialized tasks such as data analysis, compliance, customer support, and more. The reasoning and autonomy of AI agents have improved; consequently, they can gather data, conduct cross-references, and generate analysis.
As we move towards these complex, real-world applications of agents, an ever-stronger spotlight is being shone both on how we observe AI agents and how we evaluate the LLMs they’re built upon. The complexity, interactions, and autonomous processes under the surface of AI agents make rigorous monitoring and assessment an essential part of building and maintaining these applications. LLM evaluation determines if the AI agent can work, while AI agent observability determines if it is working. LLM evaluation tests an agent’s basic capabilities before and during deployment, while agent observability provides deep, real-time visibility into an agent’s internal reasoning and operational health once it is live. It is pretty obvious that having just one of these is a loss and a formula for failure.
In this blog post, we’ll explore how to evaluate agents using advanced metrics and observability tools. It’s designed as a practical, end-to-end reference for teams that want to move beyond demos and actually run AI agents in live, real-world environments, avoiding the common pitfalls that cause failure in production.
Core LLM evaluation metrics for modern AI systems
As LLMs are now applied to a wide range of use cases, it is important that their evaluation covers both the tasks they may perform and their potential risks. Evaluation metrics give a better understanding of the strengths and weaknesses of LLMs, influence the guidance of human-LLM interactions, and highlight the importance of ensuring LLM safety and reliability. Hence, LLM evaluation metrics for assessing the performance of an LLM are indispensable in modern AI systems. Without well-defined evaluation metrics, assessing model quality becomes subjective.
There are several key evaluation metrics, each with a different purpose, and the table below provides a summary of some of them.
| Evaluation Metric | What the metric evaluates |
| Hallucination rate | Factual accuracy and truthfulness of generated content |
| Toxicity scores | Harmful, offensive, or inappropriate content |
| RAGAS (Retrieval Augmented Generation Assessment) | Measures whether the RAG system retrieves the right documents and generates answers that are faithful to those sources |
| DeepEval | Tests everything from basic accuracy and safety to complex agent behaviors and security vulnerabilities across the entire LLM application |
Hallucination rate
Hallucinations in LLMs produce outputs that seem convincing yet are factually unsupported and can be categorized as either intrinsic, where the output contradicts the source content, or extrinsic, where it simply cannot be verified. They can stem from a range of factors across data, training, and inference, from quality issues in the large datasets used for initial training and the data used to fine-tune model behavior to post-training techniques that make models overly eager to provide responses to imperfect decoding strategies at inference. Because hallucination is an unsolved challenge cutting across every stage of model development, measuring and assessing it remains a vital part of LLM evaluation.
There is a wide variety of techniques for detecting hallucinations. These include:
- Fact-checking: Extracting independent factual statements from the model’s outputs (fact extraction) and then verifying these against trusted knowledge sources (fact verification).
- Uncertainty estimation: Using the certainty provided in the model’s internal state to estimate how likely a piece of factual content is to be a hallucination.
- Faithfulness hallucination detection: Ensures the faithfulness of LLMs to provide context or user instructions.
There are several metrics for hallucination detection. Some of the most commonly used metrics include:
- Fact-based metrics: Assessing faithfulness by measuring the overlap of facts between the generated content and the source content.
- Classifier-based metrics: Utilizing trained classifiers to distinguish between the level of entailment between the generated content and the source content.
- QA-based metrics: Using question-answering systems to validate the consistency of information between the source content and the generated content.
- Uncertainty-based metrics: Assessing faithfulness by measuring the model’s confidence in its generated outputs.
- LLM-based metrics: Using LLMs as evaluators to assess the faithfulness of generated content through specific prompting strategies.
PyCharm’s Hugging Face integration lets you discover evaluation models and datasets without leaving the IDE. Use the Insert HF Model feature to search for hallucination or toxicity classifiers, and hover over any model or dataset name in your code to instantly preview its model card, including training data, intended use, and limitations. This means you can import a dataset, evaluate your LLM, and verify the tools you’re using, all from one place.
Opening the Hugging Face model browser in PyCharm from the Code menu, then selecting Insert HF Model.
Searching for a specific hallucination model and selecting one. Use Model inserts a ready-to-use code snippet into the editor.
A ready-to-use code snippet of the Vectara hallucination evaluation model is inserted into the editor.
Hovering over the Vectara hallucination evaluation model in the code to preview its model card within PyCharm.
Trust is imperative in the acceptance and adoption of technology. Trust in AI is especially important in areas such as healthcare, finance, personal assistance, autonomous vehicles, and others. Hallucinations have a huge impact on users’ trust in LLMs.
In 2023, a story went viral about a Manhattan lawyer who submitted a legal brief largely generated by ChatGPT. The judge quickly noticed how different it was from a human-written submission, revealing clear signs of hallucination. Incidents like this highlight the real-world risks of LLM errors and their impact on user trust. As people encounter more examples of hallucination, skepticism around LLM reliability continues to grow.
Toxicity scores
LLMs that have been pretrained on large datasets from the web have the tendency to generate harmful, offensive, and disrespectful content as well as toxic language, such as hate speech, harassment, threats, and biased language, which have a negative impact on their safe deployment. Toxicity detection is the process of identifying and flagging toxic content by integrating open-source tools or APIs into the LLM workflow to analyze both the user input and the LLM output. Some of the available toxicity tools include the OpenAI Moderation API, which is free, works with any text, and has a quick implementation. Perspective API by Google is also widely used with a transparent methodology, but will no longer be in service after 2026. Detoxify, which is open source, has no API costs, and is Python-friendly, and Azure AI Content Safety by Microsoft, which is customizable and best for enterprise deployments and existing Azure users. Hugging Face Toxicity Models have many model options and easy integration with Transformers.
Toxicity detection has become a guardrail; hence, it is important in public-facing applications. They prevent toxic content from reaching users, which protects both individuals and organizations. In public-facing applications, toxicity detection operates by input filtering, output monitoring, and real-time scoring. This prevents attacks where users intentionally train AI to produce toxic content through coordinated toxic inputs; toxic content will never reach the user, even if produced by the underlying AI, so systems can adjust their behavior dynamically based on conversation content and escalating risks. Unguarded AI can be exploited, which leads to reputational damage.
For toxicity evaluation, PyCharm’s Hugging Face Insert HF Model feature helps you discover classifiers like s-nlp/roberta_toxicity_classifier directly in the IDE. Hovering over the model name reveals its model card, where you can see it was trained on the Jigsaw toxic comment datasets, helping you understand what the model can and can’t detect before you write a single line of evaluation code.
Opening the Hugging Face model browser in PyCharm from the Code menu, then selecting the Insert HF Model.
Searching for a specific toxicity model and selecting one. Use Model inserts a ready-to-use code snippet into the editor.
A ready-to-use code snippet of the roberta_toxicity_classifier is inserted into the editor.
Hovering over the roberta_toxicity_classifier in the code to preview its model card within PyCharm.
Frameworks for LLM evaluation
Frameworks for LLM evaluation have changed the game; teams don’t have to rely on manual reviews, gut instinct, and subjective judgment to assess model quality. These frameworks automate the measurement of model quality using standardized, quantifiable metrics. They assign numerical scores to outputs that measure faithfulness, relevancy, toxicity, and other important dimensions. This automation results in reproducibility, speed, and objectivity.
Consequently, the same input always produces the same score; evaluation runs 10–100 times faster, so in minutes instead of days; and there are no more debates on the quality of the output. Some of these frameworks include DeepEval and Retrieval Augmented Generation Assessment (Ragas). DeepEval is an open-source evaluation framework built with seven principles in mind, such as the ability to easily “unit test” LLM outputs in a similar way to Pytest and plug in and use over 50 LLM-evaluated metrics, most of which are backed by research and all of which are multimodal.
It is extremely easy to build and iterate on LLM applications with two modes of evaluation, namely, end-to-end LLM evals and component-level LLM evals. It is used for comprehensive testing across RAG, agents, and chatbots. Ragas is a framework for reference-free evaluation of RAG pipelines. There are several dimensions to consider, such as the ability of the retrieval system to identify relevant and focused context passages, as well as the capability of the LLM to exploit such passages in a faithful way; hence, it is challenging to evaluate RAG systems. Ragas provides a suite of metrics for evaluating these dimensions without relying on ground-truth human annotations.
The limits of static prompt evaluation
Traditional LLM evaluation methods are useful for single prompt-response pairs, measuring output quality, RAG systems with straightforward retrieval, and static evaluation with fixed inputs. But they are limited for multi-step agents because LLM evaluation focuses on the final output quality, not the decision-making process that produced it. Multi-step agents exhibit a different kind of complexity, as they chain multiple decisions.
Why traditional LLM evaluation isn’t enough for agents
Agents operate independently within complex workflows, and this independence can introduce challenges such as deviation from expected behavior, errors in production, and more failure points than in traditional software applications. Hence, an agent can perform well in testing but fail in production. Traditional LLM evaluations don’t have the capacity to test such use cases. Testing is usually done in a controlled environment with limited scenarios, but production involves real users, edge cases, unpredictable inputs, and scale. This means that agents can make decisions that are not seen in testing, and in production, tasks could be completed, though incorrectly, without generating an error signal. This is where advanced evaluation and monitoring practices come to the rescue! They provide the visibility and systematic measurement needed to deploy agents confidently, rather than relying on trial and error.
The complexity of agent behavior
Traditional LLM evaluation measures single prompt-response pairs: provide an input prompt, receive an output response, and measure quality through metrics such as accuracy, relevance, and faithfulness. Due to the complexity and non-deterministic, multi-step reasoning of AI agents, they cannot be reliably evaluated using traditional evaluation metrics.
Agent behavior is complex, and this complexity introduces challenges. Agents operate in dynamic environments where APIs might be down, databases change between queries, and the “right” answer depends on current conditions. They can use external tools and APIs to complete tasks, and may either use the wrong tool or use the right tool with the wrong parameters or input type. Their internal reasoning traces remain hidden unless they are logged explicitly, so it might be challenging to determine whether an agent was successful through logic or chance. An agent’s output could be perfectly correct despite poor internal decisions, or the entire task could fail despite correct step execution.
This is where observability tooling becomes essential. PyCharm’s AI Agents Debugger breaks open the black box of agentic systems, letting you trace LangGraph workflows and inspect each agent node’s inputs, outputs, and reasoning directly in the IDE, with zero extra code. Just install the plugin, run your agent, and the debugger automatically captures execution traces. Click the Graph button to visualize the full workflow, making it easy to spot where an agent chose the wrong tool, passed bad parameters, or succeeded by luck rather than logic.
To see this in action, I built a simple travel-planning agent using LangGraph in two steps: a research node that suggests summer destinations based on my preferences, and a plan node that picks the best option and builds a three-day itinerary. With the AI Agents Debugger, you can trace exactly what information flowed between these two steps – what the research node suggested and how the planner used those suggestions to build the final itinerary.
The AI Agents Debugger shows how the agent moves from initialization to the research stage, displaying the data passed in and out, and the LLM call used to generate the research results.
The AI Agents Debugger shows how the planning step processes inputs and produces outputs, using an LLM call to construct the final travel itinerary.
The Graph viewprovides a high-level overview of the agent’s workflow, mapping how it progresses from the initial step through research and planning to the final result.
Advanced agent evaluation metrics
The complexity of AI agents demands evaluation that goes beyond considering the final output quality, that is, measuring whether it is accurate, relevant, and grounded. Specialized agent evaluation assesses the complete decision-making process, including the planning logic, tool selection, parameter construction, reasoning coherence, and resource efficiency that led to the final output. Hence, the advanced agent evaluation metrics are designed to make such a process visible and measurable. Some of them are task completion rate, tool usage, reasoning quality, efficiency, and error handling.
Task completion rate
Task completion rate measures the percentage of tasks where an agent successfully achieves the end goal. This is calculated as the number of completed tasks divided by the total number of tasks attempted. The context of “completed” differs by use case. There are real-world use cases for task completion rate. Let’s start with a basic use case. Consider a customer service agent handling a specific food delivery order: “Where is my order #0001? It has not been delivered to me.” Completion rate means successfully looking up the order ID, retrieving the tracking information, and providing an accurate delivery estimate, so all three steps must succeed. If the agent retrieves the wrong order or fails to assess the tracking system, that is a failed task, even if it produces the same output.
Next, let us look at a medium-complexity use case, sequential API calls. Consider an agent tasked with creating a Jira support ticket and notifying the relevant team in Slack. The agent calls the Jira API to create a ticket, parses the response to get the ticket ID, calls the Slack API with the ticket link, and finally verifies the success of both. If the agent successfully creates the Jira ticket, but the Slack notification fails, that is considered a failed task even if the ticket exists in Jira, since the team wasn’t notified.
Finally, let’s examine a high-complexity use case: An agent is given the task of completing an online purchase, which means it must handle everything from checkout to order confirmation. Six steps are involved: Verify the item is still in stock, process the payment with a credit or debit card, reserve or decrement inventory, create an order record, generate an order confirmation number, and send a confirmation email to the customer. If the agent successfully charges the customer’s card but the confirmation email fails to send, that’s a failed task, even if the payment was processed and the order was created. In such a situation, the customer has no proof of purchase, so they will likely contact support or attempt to purchase again.
Tool usage correctness
Tool usage correctness assesses whether an agent correctly identifies and invokes the relevant tools and APIs. It is a deterministic measure that is assessed using techniques such as LLM as a judge, like most LLM evaluation metrics. It has three dimensions:
- Did the agent choose the right tool for the task (tool selection)?
- Were the parameters constructed correctly (input parameters)?
- Did the agent properly use the tool results (output handling)?
Hence, it is important for reliability and functional correctness.
Step-by-step reasoning accuracy
In real-world use cases, an LLM agent’s reasoning is shaped by much more than just the model itself. Modern frameworks such as LangChain expose the agent’s internal “thoughts” through structured logging of intermediate reasoning steps. This is done using the ReAct (Reasoning and Acting) pattern, which involves the agent thinking about what to do, using a tool, observing the tool result, and then repeating until the task is complete. Each “thought” is logged as text, which creates a complete trace of the reasoning process from initial query to final answer. These traces can be extracted programmatically and evaluated to assess whether the agent’s logic is sound even when the final output appears correct. Evaluating planning steps involves assessing aspects such as the overall approach’s logic, the ordering of steps, and whether any steps are unnecessary or redundant. Evaluating execution assesses whether the implementation worked, such as whether tools were called with correct parameters, whether each step was completed successfully, whether errors were handled appropriately, and whether the output was interpreted correctly. This can be done seamlessly in PyCharm using the AI Agents Debugger.
Groundedness (faithfulness)
Groundedness, also known as faithfulness, is the most critical metric for retrieval-augmented generation (RAG), which is a common component of agentic applications. It assesses whether the agent’s response is actually supported by the retrieved source documents or whether, instead, the model hallucinated information. Different evaluation techniques include:
- Atomic claim verification: Breaks up the response into atomic claims and checks each claim against the retrieved context. It is slow but best for production RAG and thorough evaluation.
- Semantic similarity: Compares the embeddings of the response and source documents. It is fast, so it is best for quick checks and first-pass filtering.
- LLM-as-Judge: works by prompting the LLM to score groundedness by extracting factual statements from the response and then checking each statement against the retrieved context. It offers medium speed and is best for flexible, custom criteria.
AI observability and why it matters
AI observability is about visibility into what the agent is doing. This covers recording everything that happens when a task is executed, including the agent’s reasoning at each step, which tools were called with what parameters, what data was retrieved, and how decisions were made from start to finish. With such a transparent system where every decision can be logged and traced, teams are able to understand why an agent fails, behaves unexpectedly, or becomes expensive to run because issues can be debugged and behavior can be audited. Consequently, system design improves, and guesswork is eliminated.
Definition of AI observability
AI observability is the real-time monitoring of agent actions, thoughts, and environmental interactions: what went in, what came out, how the agent thought through the problem, and which tools, APIs, and data were used. AI observability builds on the three pillars of DevOps observability – that is, metrics, logs, and traces – but extends each one for AI’s unique needs. DevOps metrics track CPU and latency, while AI metrics track token usage and cost per interaction. DevOps logs capture system errors, while AI logs capture reasoning traces and decision points. DevOps traces follow requests through services, while AI traces follow reasoning through agent steps, tool calls, and observations.
Benefits for agent monitoring
Agent monitoring has immense benefits – here are some of the most important:
- It debugs reasoning errors: When an agent fails or gives an unexpected output, monitoring provides a complete trace of its decision-making process, which shows exactly where the logic broke down. Hence, there is no need to spend hours guessing the causes.
- It measures performance and latency over time: Since metrics such as average latency, token usage, cost per interaction, and completion rates across all queries are tracked, degradation patterns can be identified before they affect users. As a result, performance issues can be identified and resolved before users file any complaints.
- It identifies regressions after model or prompt updates: Baseline metrics such as completion rate, faithfulness scores, latency, and cost are established and then monitored for deviations after deployments. If a new prompt drops the compilation rate or a model update increases the hallucination rate, automated alerts catch it immediately. Hence, issues are caught before users are affected.
Popular tools for agent monitoring
Several frameworks and platforms have emerged to provide built-in observability for AI agents, with each having different strengths and integration approaches and matching different features and requirements. The choice of the right tool depends on the framework, deployment preferences, and primary needs. The table below shows some popular tools and whether they match different features and requirements.
| Tool | Traces agent steps? | Tracks costs? | Detects regressions? | Self-hostable? | Open source? | Easy integration? |
| Helicone | Yes | Yes | Yes | Yes | Yes | Yes |
| LangSmith | Yes | Yes | Yes | Limited | No | Yes |
| LangFuse | Yes | Yes | Yes | Yes | Yes | Moderate |
| OpenLLMetry | Yes | Limited | Limited | Yes | Yes | Moderate |
| Phoenix | Yes | Limited | Yes | Yes | Yes | Moderate |
| TruLens | Yes | Limited | Yes | Yes | Yes | Moderate |
| DataDog | Limited | Yes | Yes | No | No | Moderate |
Best practices for evaluating agents in production
Evaluation does not end after deployment; rather, it is intensified. This continuous evaluation tracks how much the system costs to run, how quickly it responds under various loads, and how it handles errors or unusual inputs. Without such evaluation, problems can only be identified after the users are affected. An agent can pass all the quality checks with excellent faithfulness scores, high completion rates, and strong reasoning but fail in production if costs spiral, latency increases, or edge cases cause instability. Hence, there is a critical need for ongoing evaluation and monitoring, which will lead to systems that are reliable, scalable, and financially sustainable.
Monitor cost and latency
Monitoring cost and latency is critical for production sustainability. Token usage and response time must be tracked continuously because small inefficiencies compound dramatically over time, and the cost per token of the powerful reasoning models used for agents can be high. Production workloads require cost and latency monitoring to identify problems before user experience and budget are impacted. Cost monitoring tracks token usage at different levels, such as per request, per query type, and over time. Without visibility into patterns generated by these, teams end up discovering cost problems through surprise bills. With monitoring, they can proactively cache common queries and optimize prompts to reduce token use. Latency monitoring reveals track response time and component breakdowns to identify bottlenecks.
Cost control in production workloads is important because production costs can spiral quickly, unmonitored systems can exceed budgets, and latency impacts user experience and retention.
Combine offline and online evaluation
Effective agent evaluation requires combining offline and online evaluation, where each addresses gaps the other leaves. Offline evaluation uses fixed test databases for reproducible benchmarking, which enables fast iteration on prompts and models in controlled environments without production risk. Online evaluation monitors real user interactions in production, which reveals edge cases in testing that were never expected, so it is useful for real-time feedback, user data, and observability tools. A combination of both results in an optimal strategy where offline evaluation validates changes before deployment, then online evaluation monitors production reality.
Use human-in-the-loop when necessary
LLM agents are appreciated for how they have played a positive role in the different ecosystems, but not every agent should run autonomously since they can misinterpret prompts, cross boundaries, or make dreadful errors that can’t be caught by automation alone. Hence, the need for human-in-the-loop failsafes. Human-in-the-loop is also essential during initial setup: Unless teams already have domain-specific evaluation datasets for monitoring the agent, these will need to be created manually by assessing the agent’s performance. A hybrid approach is required when critical decisions require human validation, such as approving transactions, modifying sensitive data, or triggering irreversible workflows. In this approach, it is important that decisions are routed through a human checkpoint before proceeding. The intention is not to slow automation but rather to ensure that the right decisions involve the right oversight. A well-designed human-in-the-loop system delivers compound returns over time. Every human correction becomes feedback, which improves the agent’s accuracy and gradually reduces the need for manual review. Human oversight isn’t treated as a failure but rather as a safety net that makes the system better with use.
Final thoughts
Fundamentally, AI agents are different from single-prompt LLMs. They navigate multi-step workflows, make autonomous decisions, and use external tools, which introduces complexities that demand continuous evaluation, not just static testing. Evaluation must evolve from pre-deployment checkpoints to ongoing monitoring. Production-ready agents aren’t just well-tested; they’re continuously observed and improved based on real behavior. LLM evaluation and AI observability enable faster, safer iteration by catching issues early and feeding production insights back into development.
PyCharm streamlines agent development with integrated debugging, profiling, and testing. Step through reasoning with breakpoints, find cost bottlenecks, and iterate on evaluation tests rapidly. These workflows transform hours of debugging into minutes of systematic investigation. Explore PyCharm for AI development to see how integrated tools can help you build, evaluate, and deploy reliable AI agents.
About the author
May 18, 2026
Ari Lamstein
How Remote Work Has Grown — and Shrunk — Since Covid
Remote work surged during Covid — and while it has declined since, it’s still far above pre‑pandemic levels. I just updated my Covid Demographics Explorer with the latest ACS data, and the national trend is striking:
Remote work more than tripled between 2019 and 2021, rising to nearly 28 million people at the height of the pandemic. Since then it has edged down each year, but only modestly. Even today, at about 22 million, it remains roughly 2.5 times the pre‑Covid level.
The app now lets you generate this same graph for every state, as well as for counties and cities with populations of at least 65,000. See how the trend looks where you live.
Exploring Local Trends
I also added a “Compare Years” tab that lets you see which locations saw the biggest change in remote work between any two years. The national trend tells one story, but the local data tells another: the rise and fall of remote work played out very unevenly across the country. Below I run this analysis twice: first for the national increase from 2019-2021, and then for the gradual decline between 2021 and 2024.
The Remote Work Spike: 2019-2021
Between 2019 and 2021, the location that increased the number of remote workers the most was Sunnyvale, California. The number of remote workers there increased almost 11x in two years, from an estimated 3,235 to 38,319. Sunnyvale is in the heart of Silicon Valley, and tech companies were among the fastest to adopt remote work, which helps explain this result:
The scatterplot also shows the broader pattern: most locations cluster between a 150% and 300% increase in remote work during this period. That makes Sunnyvale’s nearly 1,100% jump stand out even more — it’s an order of magnitude beyond the national norm.
Interestingly, only one location in the entire dataset saw a decrease in remote work during this period: Rice County, Minnesota (-7.5%). It’s the lone point below zero on the chart, and I don’t have a clear explanation for it.
The Remote Work Decline: 2021-2024
When we run this same analysis for 2021–2024, we see a very different result: Sunnyvale’s remote workforce shrank by 67.2%, the largest drop in the dataset. This means that Sunnyvale saw both the largest increase between 2019 and 2021 and the largest decrease between 2021 and 2024:
The scatterplot also shows how different the overall pattern is in this period. Instead of large increases, most locations cluster between a 10% and 30% decline in remote work — a sharp contrast with the 2019–2021 graph, where nearly every location saw a substantial increase.
Against this backdrop, Sunnyvale’s 67% drop stands out as an outlier. The likely explanation is the wave of return‑to‑office mandates that swept through the tech industry during this period. The two other largest decreases also happened in Silicon Valley: the city of Fremont (–61%) and Santa Clara County (–56%).
At the other end of the distribution, the few places that saw increases tend to be warm‑weather, high‑amenity destinations: Marion County, Florida (69%), Collier County, Florida (65%), and Maui County, Hawaii (57%) saw the largest gains. These increases may reflect people with remote‑work jobs relocating to places with natural beauty and a high quality of life — a very different dynamic from the employer‑driven declines we see in Silicon Valley.
Conclusion
Three years after the peak, roughly 22 million Americans still work from home — more than double the pre-pandemic baseline. But the story is more complex than a single national number: a dramatic surge, an uneven retreat, and striking differences across the country. How does your corner of the country fit in?
The new version of the Covid Demographics Explorer makes it easy to explore these patterns yourself. In addition to remote‑work trends, you can examine changes in population, median household income, median rent, and public assistance. Analyze your own location.
This app was built in Python with the Streamlit framework. I teach Streamlit for O’Reilly — and if you’d like to learn to build apps like this yourself, I offer a free 7-day email course. Sign up in the form below.
Real Python
Python Built-in Functions: A Complete Guide
Python’s built-in functions are predefined functions you can use anywhere in your code without any imports. They handle common tasks across math, data type creation, iterable processing, and input and output. Knowing which ones to reach for makes your code shorter and more Pythonic.
In this tutorial, you’ll:
- Recognize Python’s built-in functions and the built-in scope they live in
- Use the right built-in for math, data types, iterables, and I/O tasks
- Tell apart true functions and classes that look like functions
- Apply built-ins to solve practical problems without reinventing the wheel
To get the most out of this tutorial, you’ll need to be familiar with Python programming, including topics like working with built-in data types, functions, classes, decorators, scopes, and the import system.
Get Your Code: Click here to download the free sample code that shows you how to use Python’s built-in functions.
Get the PDF Guide: Click here to download a free PDF guide that gives you a complete overview of Python’s built-in functions and how to use them.
Take the Quiz: Test your knowledge with our interactive “Python Built-in Functions: A Complete Guide” quiz. You’ll receive a score upon completion to help you track your learning progress:
Interactive Quiz
Python Built-in Functions: A Complete GuideTest your understanding of Python's built-in functions for math, data types, iterables, and I/O—and when to reach for each one.
Built-in Functions in Python
Python has several functions available for you to use directly from anywhere in your code. These functions are known as built-in functions and they cover many common programming problems, from mathematical computations to Python-specific features.
Note: All these functions live in the builtins module, which Python loads at startup and exposes through the built-in scope, so you can use them anywhere without importing the module. Importing the module explicitly is useful if you know that you’ll shadow a built-in name with one of your own variables or functions. Doing so keeps the original within reach as builtins.name.
Among these built-ins, you’ll also find classes with function-style names like str, tuple, list, and dict, which define built-in data types. These classes are listed in the Python documentation as built-in functions, so they’re covered in this tutorial too.
In this tutorial, you’ll learn the basics of Python’s built-in functions. By the end, you’ll know what their use cases are and how they work. You’ll start with the built-in functions for math computations.
Using Math-Related Built-in Functions
In Python, you’ll find a few built-in functions that take care of common math operations, like computing the absolute value of a number, calculating powers, and more. Here’s a summary of the math-related built-in functions in Python:
| Function | Description |
|---|---|
abs() |
Calculates the absolute value of a number |
divmod() |
Computes the quotient and remainder of integer division |
max() |
Finds the largest of the given arguments or items in an iterable |
min() |
Finds the smallest of the given arguments or items in an iterable |
pow() |
Raises a number to a power |
round() |
Rounds a floating-point value |
sum() |
Sums the values in an iterable |
In the following sections, you’ll learn how these functions work and how to use them in your Python code.
Getting the Absolute Value of a Number: abs()
The absolute value or modulus of a real number is its non-negative value. In other words, the absolute value is the number without its sign. For example, the absolute value of -5 is 5, and the absolute value of 5 is also 5.
Note: To learn more about abs(), check out the How to Find an Absolute Value in Python tutorial.
Python’s built-in abs() function allows you to quickly compute the absolute value of a number. Here’s its signature:
abs(number)
The number argument can be any numeric value, including integers, floating-point numbers, complex numbers, fractions, and decimals. Take a look at a few examples:
>>> from decimal import Decimal
>>> from fractions import Fraction
>>> abs(-42)
42
>>> abs(42)
42
>>> abs(-42.42)
42.42
>>> abs(42.42)
42.42
>>> abs(complex("-2+3j"))
3.605551275463989
>>> abs(complex("2+3j"))
3.605551275463989
>>> abs(Fraction("-1/2"))
Fraction(1, 2)
>>> abs(Fraction("1/2"))
Fraction(1, 2)
>>> abs(Decimal("-0.5"))
Decimal('0.5')
>>> abs(Decimal("0.5"))
Decimal('0.5')
In these examples, you compute the absolute value of different numeric types using the abs() function. First, you use integer numbers, then floating-point and complex numbers, and finally, fractional and decimal numbers. In all cases, when you call the function with a negative value, the final result removes the sign.
For a practical example, say that you need to compute the total profits and losses of your company from a month’s transactions:
>>> transactions = [-200, 300, -100, 500]
>>> incomes = sum(income for income in transactions if income > 0)
>>> expenses = abs(
... sum(expense for expense in transactions if expense < 0)
... )
>>> print(f"Total incomes: ${incomes}")
Total incomes: $800
>>> print(f"Total expenses: ${expenses}")
Total expenses: $300
>>> print(f"Total profit: ${incomes - expenses}")
Total profit: $500
Read the full article at https://realpython.com/python-built-in-functions/ »
[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]
Python Bytes
#480 Proud Parents
<strong>Topics covered in this episode:</strong><br> <ul> <li><strong><a href="https://www.better-simple.com/django/2026/05/06/using-django-tasks-in-production/?featured_on=pythonbytes">Using Django Tasks in production</a></strong></li> <li><strong>Co-authored with Claude?</strong></li> <li><strong><a href="https://rushter.com/blog/pypi-packages/?featured_on=pythonbytes">PyPI packages are increasing rapidly</a></strong></li> <li><strong><a href="https://tildeweb.nl/~michiel/httpx2.html?featured_on=pythonbytes">httpx2</a></strong></li> <li><strong>Extras</strong></li> <li><strong>Joke</strong></li> </ul><a href='https://www.youtube.com/watch?v=-x1R3S72gCU' style='font-weight: bold;'data-umami-event="Livestream-Past" data-umami-event-episode="480">Watch on YouTube</a><br> <p><strong>About the show</strong></p> <p>Sponsored by us! Support our work through:</p> <ul> <li>Our <a href="https://training.talkpython.fm/?featured_on=pythonbytes"><strong>courses at Talk Python Training</strong></a></li> <li><a href="https://courses.pythontest.com/p/the-complete-pytest-course?featured_on=pythonbytes"><strong>The Complete pytest Course</strong></a></li> <li><a href="https://www.patreon.com/pythonbytes"><strong>Patreon Supporters</strong></a> <strong>Connect with the hosts</strong></li> <li>Michael: <a href="https://fosstodon.org/@mkennedy">@mkennedy@fosstodon.org</a> / <a href="https://bsky.app/profile/mkennedy.codes?featured_on=pythonbytes">@mkennedy.codes</a> (bsky)</li> <li>Brian: <a href="https://fosstodon.org/@brianokken">@brianokken@fosstodon.org</a> / <a href="https://bsky.app/profile/brianokken.bsky.social?featured_on=pythonbytes">@brianokken.bsky.social</a></li> <li>Show: <a href="https://fosstodon.org/@pythonbytes">@pythonbytes@fosstodon.org</a> / <a href="https://bsky.app/profile/pythonbytes.fm">@pythonbytes.fm</a> (bsky) Join us on YouTube at <a href="https://pythonbytes.fm/stream/live"><strong>pythonbytes.fm/live</strong></a> to be part of the audience. Usually <strong>Monday</strong> at 11am PT. Older video versions available there too. Finally, if you want an artisanal, hand-crafted digest of every week of the show notes in email form? Add your name and email to <a href="https://pythonbytes.fm/friends-of-the-show">our friends of the show list</a>, we'll never share it.</li> </ul> <p><strong>Brian #1: <a href="https://www.better-simple.com/django/2026/05/06/using-django-tasks-in-production/?featured_on=pythonbytes">Using Django Tasks in production</a></strong></p> <ul> <li>Tim Schilling shares how the Djangonaut Space website has been using Django’s new tasks framework and some of the info missing from the official Django docs.</li> <li>Tasks require a third party package, <a href="https://github.com/RealOrangeOne/django-tasks-db?featured_on=pythonbytes"><code>django-tasks-db</code></a> to actually run the tasks.</li> <li>Article walks through all changes necessary to get an email process running to notify admins of new testimonials. Cool simple example.</li> <li>With the db backend, you can monitor progress of tasks in the admin, to see which tasks are scheduled, completed, or have errors.</li> <li>Some wishes for the community to implement <ul> <li>new tutorial in the Django docs</li> <li>Django Debug toolbar panel for tasks</li> <li>test/mock backend</li> </ul></li> <li>Great title for wish list: Thinks I’d like to see, but I’m too lazy to implement myself.</li> </ul> <p><strong>Michael #2: Co-authored with Claude?</strong></p> <ul> <li>Via Nik T.</li> <li>We don’t put “executed on macOS”, “edited with PyCharm”, etc. in our commits. Why Claude?</li> <li>Seems like a growth hack to me, that I don’t really care to participate in.</li> <li>Some projects that have formalized their thoughts on this: <a href="https://redmonk.com/kholterhoff/2026/02/26/generative-ai-policy-landscape-in-open-source/?featured_on=pythonbytes">The Generative AI Policy Landscape in Open Source</a></li> <li>Adjust to turn off in <code>~/.claude/settings.json</code> see <a href="https://code.claude.com/docs/en/settings#attribution-settings">the docs</a>. <div class="codehilite"> <pre><span></span><code><span class="p">{</span> <span class="w"> </span><span class="nt">"attribution"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span> <span class="w"> </span><span class="nt">"commit"</span><span class="p">:</span><span class="w"> </span><span class="s2">""</span><span class="p">,</span> <span class="w"> </span><span class="nt">"pr"</span><span class="p">:</span><span class="w"> </span><span class="s2">""</span> <span class="w"> </span><span class="p">}</span> <span class="p">}</span> </code></pre> </div></li> </ul> <p><strong>Brian #3: <a href="https://rushter.com/blog/pypi-packages/?featured_on=pythonbytes">PyPI packages are increasing rapidly</a></strong></p> <ul> <li>Artem Golubin</li> <li>There’s been an increase of published packages per week on PyPI</li> <li>A pretty big increase in the last handful of months.</li> <li>30% increase since 2025, clearly due to AI</li> <li>Artem is building <a href="https://github.com/rushter/hexora?featured_on=pythonbytes">hexora</a>, a malicious Python code detector.</li> <li>Cool package too, it can: <ul> <li>Audit project dependencies to catch potential supply-chain attacks</li> <li>Detect malicious scripts found on platforms like Pastebin, GitHub, or open directories</li> <li>Analyze IoC files from past security incidents</li> <li>Audit new packages uploaded to PyPi.</li> </ul></li> <li>Artem is using hexora to analyze recently published pypi packages and many are obviously vibecoded and trigger false positives for abuses of <code>eval</code>, <code>exec</code>, and <code>subprocess</code> <ul> <li>Side note: I don’t think that’s necessarily a false positive. Not malicious, but maybe a stupid-code-detector?</li> </ul></li> <li>Lots are LLM related, Lots have bots contributing code</li> <li>Publishing rate is crazy, dozens to hundreds of published versions in a day is a bug, not a feature</li> <li>Brian’s proposal, PyPI should limit releases per day for any package to something a sane human would do, even if they make a mistake on a release, to maybe like 2-3, definitely under 10, in a day. And if the repo has obvious agent contributors listed, maybe lower to the limit to 1-2 a day? Honestly, “move fast and break things” doesn’t apply to breaking the commons.</li> </ul> <p><strong>Michael #4: <a href="https://tildeweb.nl/~michiel/httpx2.html?featured_on=pythonbytes">httpx2</a></strong></p> <ul> <li>More on the httpx, httpxyz, etc changes: Pydantic people started their own fork, <a href="https://github.com/pydantic/httpx2?featured_on=pythonbytes">httpx2</a>.</li> <li>Michiel says “while we think httpxyz was definitely needed, we welcome httpx2 and think it should be the ‘blessed’ fork.”</li> <li>Kludex, who is among other things maintainer of Starlette, was considering a fork</li> <li>As it stands, httpx2 is lacking the performance improvements they added to httpxyz. But it will not be long before they will add those, too.</li> <li>Also they already made some smart decisions: <ul> <li>they are switching from certifi to <a href="https://github.com/pydantic/httpx2/pull/209?featured_on=pythonbytes">truststore</a></li> <li>they are switching to <a href="https://github.com/pydantic/httpx2/pull/933?featured_on=pythonbytes">compression.zstd</a> on Python 3.14+, enabling zstd compression by default</li> <li>they <a href="https://github.com/pydantic/httpx2/commit/160c7f59d7942efe0133516c161d39139780eb45?featured_on=pythonbytes">merged httpcore</a> and vendored it in their repository</li> </ul></li> <li><a href="https://news.ycombinator.com/item?id=48127570&featured_on=pythonbytes">Discussion on Hacker News</a></li> </ul> <p><strong>Extras</strong></p> <p>Brian:</p> <ul> <li><a href="https://anarc.at/blog/2026-05-16-four-horsemen/?featured_on=pythonbytes">The Four Horsemen of the LLM Apocalypse</a> - Anarcat</li> <li><a href="https://www.djangoproject.com/weblog/2026/may/12/2026-django-developers-survey/?featured_on=pythonbytes">Django/JetBrains 2026 developer survey</a> is open</li> <li><a href="https://pyrefly.org/blog/v1.0/?featured_on=pythonbytes">Pyrefly 1.0</a> : “meaning we are confident that Pyrefly is ready for production use.” Michael:</li> <li>Just about ready to release Python Web Security: OWASP Top 10 with Agentic AI course. Be sure to be on <a href="https://training.talkpython.fm/getnotified?featured_on=pythonbytes">the courses newsletter</a> to get notified.</li> </ul> <p><strong>Joke:</strong> <a href="https://x.com/PR0GRAMMERHUM0R/status/1973145866962665752?featured_on=pythonbytes">Proud Parents</a></p>
Core Dispatch
Core Dispatch #4
Welcome back to Core Dispatch! This edition covers April 30 through May 18, 2026.
Python 3.15.0 beta 1 is officially here, which means CPython's main branch is
now open for 3.16 work. The first 3.16 alpha is slated for mid-October.
More imminently, beta 2 is up next on June 2, with 3.13.14 and 3.14.6
following on June 9.
This is also PyCon US week, so a lot of the core team is gathered in Long Beach right now. Once recordings are available, we'll be sure to pull talks from folks on the team into a future edition.
PEP 788 has also moved from accepted to implemented, and free-threaded builds
picked up thread-safe iterator support. There are also a few smaller but concrete fixes:
http.server can send custom headers from the command line, AttributeError
can suggest Python equivalents for method names from other languages,
webbrowser on macOS is moving away from osascript, and ftplib.ftpcp()
picked up the PASV CVE fix.
If you maintain a package or just like living on the edge, give the latest 3.15 beta a spin and file any issues you find.
Upcoming Releases
- Python 3.15.0 beta 2 — Jun 02
- Python 3.13.14 — Jun 09
- Python 3.14.6 — Jun 09
Official News
- Python 3.14.5 is out! — By Hugo van Kemenade
- Python 3.15.0 beta 1 is here! — By Hugo van Kemenade
- Python 3.14.5 release candidate — By Hugo van Kemenade
Merged PRs
- Add a
--headerCLI argument tohttp.server - Add free-threading support for iterators
- Replace
osascriptwithopenon macOS inwebbrowser - Add
Zd/Zfformats toarray,ctypes,memoryview,struct - Add cross-language method suggestions for builtin AttributeError
- Implement PEP 788
- Apply CVE-2021-4189 PASV fix to
ftplib.ftpcp()
Discussion
- PEP 661: Sentinel Values — 🔥 32 new replies · 34.7k views
- PEP 832: Virtual Environment Discovery — 9 new replies · 4.2k views
- PEP 802: Display Syntax for the Empty Set — 6 new replies · 10.4k views
- PEP 828: Supporting
yield fromin asynchronous generators — 5 new replies · 3.3k views
Upcoming CFPs & Conferences
- PyCon Italia 2026 — May 27
- Python Leiden User Group — May 28
- 📋 Swiss Python Summit 2026 Deadline — May 31
- 📋 PyCon Indonesia 2026 Deadline — May 31
- 📋 PyCon Togo 2026 Deadline — Jun 01
- 📋 PyCon Ghana 2026 Deadline — Jun 06
- GeoPython 2026 — Jun 08
- 📋 PyCon Kenya 2026 Deadline — Jun 09
- 📋 PyCon South Korea 2026 Deadline (extended) — Jun 14
- 📋 Python Ho 2026 Deadline — Jun 15
One More Thing
"It's oompa loompa shit"
— Pablo Galindo Salgado
Credits
May 17, 2026
Artem Golubin
PyPI packages are increasing rapidly
PyPI is the main repository for Python packages. One thing that I've noticed recently is the number of published packages per week.
Let's look at published counts of new package versions per week:

There are some dips in the data, but that's because of how the data was collected. We can see a clear increase in the number of published packages, especially in the last few months.
Because of AI, the number of packages published per week has increased by 30% since 2025.
I'm working on hexora, a library that detects malicious Python code in packages.[......]
May 16, 2026
PyCon
Welcome Back, NVIDIA: Visionary Sponsor of PyCon US 2026
NVIDIA is excited to once again support PyCon US 2026 as a Visionary Sponsor, and to sponsor the Future of AI with Python Conference Track.
Python is a “first-class” language at NVIDIA CUDA, and NVIDIA is committed to bringing our technology to Python developers in close alignment with C++ upon new releases of our hardware. We’re also happy to announce the general availability of CUDA Python 1.0.
NVIDIA’s commitment to Python goes well beyond just our own tech stack. NVIDIA’s Python engineers contribute across a broad swath of the Python ecosystem, from the core interpreter itself, to packaging and PyPI, to the Python community at large. NVIDIA is inspired by the energy of, and privileged to collaborate with, people across the open source Python community.
Since PyCon last year, NVIDIA Pythonistas – in collaboration with many others in the Python community – have made great progress on the evolution of various packaging standards, including working with community partners on the implementation of wheel variants and the establishment of a Packaging Council to better govern the evolution of packaging standards and PyPI. NVIDIA Python engineers are also engaged in implementation, testing, and porting work for the free-threaded build of the interpreter. NVIDIA Python engineers are driving the early exploratory work for adopting Rust for CPython, work on Python performance benchmarking, and are actively involved in many enhancements for Python 3.14 and 3.15, including providing built-in Zstandard support in Python 3.14.
At NVIDIA, we are excited to work with our partners and the open source Python community to help bring the best developer experience for users of high performance computing and AI. Come see NVIDIA at the Anaconda and PyTorch booths, and at the AI Track.
Barry Warsaw
May 2026
Principal System Software Engineer, NVIDIA
Python Core Developer since 1994
Python Steering Council member in 2026
May 15, 2026
Anarcat
The Four Horsemen of the LLM Apocalypse
I have been battling Large Language Models (LLM1) for the past couple of weeks and have struggled to think about what it means and how to deal with its fallout.
Because the fight has come from many fronts, I've come to articulate this in terms of the Four Horsemen of the Apocalypse.
Sound track: Metallica's The Four Horsemen, preferably downloaded from Napster around 2000, but now I guess you get it on YouTube.
War: bot armies
Let's start with War. We've been battling bot armies for control of our GitLab server for a while. Bots crawl virtually infinite endpoints on our Git repositories (as opposed to downloading an archive or shallow clone), including our fork of Firefox, Tor Browser, a massive repository.
At first, we've tried various methods: robots.txt, blocking user agents, and finally blocking entire networks. I wrote asncounter. It worked for a while.
But now, blocking entire networks doesn't work: they come back some other way, typically through shady proxy networks, which is kind of ironic considering we're essentially running the largest proxy network of the world.
Out of desperation, we've forced users to use cookies when visiting our site. We haven't deployed Anubis yet, as we worry that bots have broken Anubis anyways and that it does not really defend against a well-funded attacker, something which Pretix warned against in 2025 already.
(We have a whole discussion regarding those tools here.)
But even that, predictably, has failed. I suspect what we consider bots are now really agents. They run full web browsers, JavaScript included, so a feeble cookie is no match for the massive bot armies.
Side note on LLM "order of battle"
We often underestimate the size of that army. The cloud was huge even before LLMs, serving about two thirds of the web. Even larger swaths of clients like government and corporate databases have all moved to the cloud, in shared, but private infrastructure with massive spare capacity that is readily available to anyone who pays.
LLMs have made the problem worse by dramatically expanding the capacity of the "cloud". We now have data centers that defy imagination with millions of cores, petabytes of memory, exabytes of storage.
I thought that 25 gigabit residential internet in Switzerland could bring balance, but this is nothing compared to the scale of those data centers.
Those companies can launch thousands, if not millions of fully functional web browsers at our servers. Computing power or bandwidth are not a limitation for them, our primitive infrastructure is. No one but hyperscalers can deal with this kind of load, and I suspect that they are also struggling, as even Google is deploying extreme mechanisms in reCAPTCHA.
This is the largest attack on the internet since the Morris worm but while Robert Tappan Morris went to jail on a felony, LLM companies are celebrated as innovators and will soon be too big to fail.2
Which brings us to the second horsemen, famine.
Famine: shortages
All that computing power doesn't come out of thin air: it needs massive amounts of hardware, power, and cooling.
Earlier this year, I've heard from a colleague that their Dell supplier refused to even provide a quote before August. Dell!
In February, Western Digital's hard drive production for 2026 was already sold out. Hard drives essentially doubled in price within a year, and some have now tripled. A server quote we had in November has now quadrupled, going from 10 thousand to FORTY thousand dollars for a single server.
But regular folks are facing real-life shortages as well, as city-size data centers are being built at neck-breaking speed, stealing fresh water and energy from human beings to feed the war machine.
We've been scared of losing our jobs, but it seems that Apocalypse has yet to fully materialize. Regardless for engineers, the market feels tighter than it was a couple years ago, and everyone feels on edge that they will just have to learn to operate LLMs to keep their jobs.
Which brings us, of course, to Death.
Death: security and copyright
Our third horseman is one I did not expect a couple of months
ago. Back at FOSDEM, curl's maintainer Daniel Stenberg famously
complained about the poor quality of LLM-generated reports but
then, a few months later, everyone is scrambling to deal with floods
of good reports.
In the past two weeks, this culminated in a significant number of critical security issues across multiple projects. Chained together, remote code execution vulnerabilities in Nginx and Apache and two local privilege escalations in the Linux kernel (dirtyfrag and fragnesia) essentially gave anyone root access to any unpatched server to the web.
As I write this, another vulnerability dropped, which gives read access to any file to a local user, compromising TLS and SSH private keys.
All those vulnerabilities were released without any significant coordination while people scrambled to mitigate.
Many people including Linus Torvalds are now considering issues discovered through LLMs to be essentially public. This puts some debates about disclosure processes in perspective, to say the least.
But this is not merely the death of the traditional coordinated disclosure process, the C programming language, or the Linux kernel: remember that those bots are trained on a large corpus of copyrighted material. Facebook has trained their models on pirated books and Nvidia has done deals with Anna's Archive to secure access to large swaths of copyrighted material. The US Congress seems to think LLM outputs are not copyrightable, like any other machine outputs.
With many people now vibe coding their way out of learning or remembering how computers work, is this the Death of Copyright?
And that, of course, brings us to the final horseman: Pestilence.
Pestilence: slop
There is a growing meme that programming is essentially over as we know it. That you can simply vibe-code applications from scratch and it's pretty good.
Maybe that's true.
So far, most of my attempts at resolving any complex problem with a LLM have often failed with bizarre failures. Some worked surprisingly well. Maybe, of course, I am holding it wrong.
I personally don't believe LLMs will ever be good enough to produce and maintain software at scale. They're surprisingly good at finding security flaws right now. But what I see is also a lot of Bullshit, with a capital B. It's not lying: it does not "know" anything, so it can't lie. It's misleadingly cohesive and deliberate, but it lacks meaning, intent, will.
I have not been confronted with much slop, apart from the lobster Jesus or the yellow man atrocities, and particularly not in my work. But I see what it is doing to my profession: beyond vibe-coding, people are now token-maxxing, and land-grabbing their colleagues.
I don't like what LLMs do to our communities, or the fabric of software we live with.
Software does not evolve in a void. It is a team effort, be it free software or a corporate product. Generations of humans have carefully built the scaffolding of technology required for modern networks and software to operate, in a convoluted contraption that no single human fully understands anymore.
The idea of simply giving up on that understanding entirely and delegating it to an unproven model is not only chilling, it feels just plain stupid. Not stupid as in Skynet, stupid as in "I can't get inside the data center because the authentication system is down". Except we're in a "the power plant doesn't reboot" or "their LLM found an 0day in our slop" kind of stupid.
The fifth horsemen
Researching for this article, I looked up the four horsemen and found out they original seems to have been:
- Famine
- War
- Death
- Conquest (??)
I was surprised. I grew up thinking about the horsemen being Famine, War, Pestilence, and Death. So I went back to my original source which actually claims the horsemen are:
Time has taken its toll on you, the lines that crack your face.
Famine, your body, it has torn through, withered in every place.
Pestilence for what you've had to endure, and what you have put others through
Death, deliverance for you, for sure, now there's nothing you can do
So I guess that makes no sense either, which, fair enough, I shouldn't rely on Metallica for theological references. Especially since that song was originally called Mechanix and was "about having sex at a gas station".
Anyways.
The point is, there are actually five horsemen, and the fifth one is, in my opinion, Conquest.
Those companies (and not "AI", mind you) are taking over the world. I sense a strong connection with the "post-truth" world imposed on us by fascists like Trump and Putin. It's not an accident, it's a power grab part of the Californian Ideology3. Just like Airbnb broke housing, Uber destroyed the transportation and Amazon is taking over retail and server hosting, LLM companies are essentially trying to take over if not everything, at least Cognition as a whole.
But the capitalization of those companies (OpenAI and Nvidia in particular) are so far beyond reason that their inevitable collapse will likely lead to a global financial collapse of biblical proportions.
Because they will inevitably fail like previous bubbles they are built on. And when they fail, I hope it zips all the way back through the blockchain scam, the ad surveillance system, and the dot com then git me back my internet.
The Tower of Babel
While I'm off in the woods hallucinating (ha!) on biblical allegories, I feel there's another sign that the apocalypse is coming.
The Tower of Babel myth says that humans tried to create a big tower up to heaven and become god. God confounds their speech and scatters the human race. End of utopia.
This is what is happening to our human translators now. LLMs being, after all, Language Models, they are excellent at translation work. So much that the only translators not replaced by LLMs right now are interpreters, who translate vocally in real time. But interpreters are worried about their jobs as well.
This concretely means we will lose the human capacity, as a civilization, to translate between each other. It is still an open question whether the remaining revision work will be enough for translators to avoid deskilling, but other research has shown that LLM use leads to cognitive decline, impacts critical thinking, and generally, that deskilling is a common outcome.
Ultimately, I think this is where LLMs bring us. Towards collapse.
So this is a call to arms. Fight back!
Poison bots. Build local real-world communities.
Go low tech. Moore's law is dead, make use of it.
Patch your shit. Go weird.
Refuse slop. Train your brain.
The horsemen will collapse, but let's not go down with them.
This article was written without the use of a large language model and should not be used to train one.
- I prefer "LLM" to Artificial Intelligence, as I don't consider models to have "Intelligence" which goes far beyond the analytical traits we train models for. Intelligence requires embodiment and social interaction; machines lack the innate human skills of empathy, feeling and care, which explains a lot of the evils behind the current trends.↩
- It should be noted that Morris also happened to be one of the founder of Y Combinator where he is in good company with other techno-fascists like Peter Thiel, Sam Altman, and so on. Crime, after all, pays.↩
- Probably a good time to watch All Watched Over by Machines of Loving Grace.↩
PyCharm
Pyrefly LSP Integration with Type Engine in PyCharm 2026.1.2
In PyCharm 2026.1.2, you can enable Pyrefly as an external type provider, dramatically increasing the speed of the IDE’s code insight features.
What is the Pyrefly LSP?
“LSP” stands for the Language Server Protocol – a standardized protocol that allows code editors and IDEs to communicate with language servers. The LSP enables language servers to provide code intelligence features, such as:
- Code completion
- Information on hover (for example, quick documentation)
- Go to definition and other actions
- Error checking and type-related diagnostics
The key benefit of the LSP is that it allows a single language server to be used across multiple tools. This means that language-specific intelligence does not have to be implemented separately in every editor, IDE, or CI pipeline.
Pyrefly is Meta’s next-generation Python type checker, engineered from the ground up in Rust to replace its predecessor, Pyre (written in OCaml). With the move to Rust, Pyrefly achieves significantly faster performance and improved cross-platform portability. More than just a rewrite, it is designed to be more capable and robust, offering an efficient toolset for maintaining large-scale Python codebases with high precision and minimal overhead.
Pyrefly provides the following benefits:
- Higher performance and efficiency – Thanks to its Rust-based architecture, Pyrefly achieves significantly faster speeds and improves cross-platform portability.
- Enhanced code intelligence – As an external type provider, Pyrefly powers essential code insight features in the IDE, including type inference, type-related diagnostics, quick documentation, and inlay hints.
- Scalability – Pyrefly is designed to handle large-scale Python codebases with high precision and minimal overhead.
Pyrefly is highly beneficial for projects and developers dealing with large, complex Python codebases that prioritize performance and robust typing. Integrating Pyrefly via the LSP is part of our ongoing work to enhance code insight performance in PyCharm.
Using Pyrefly in PyCharm
Once enabled, Pyrefly powers all code insight functionality in PyCharm, including type inference and type-related diagnostics, quick documentation, and inlay hints. Delegating analysis to this faster engine delivers significantly improved performance.
To start using Pyrefly in your PyCharm project, go to the Type widget at the bottom of the window. By default, the IDE uses the built-in type engine. Click on the widget and select the option to use Pyrefly. If you do not have Pyrefly installed yet, PyCharm will install it automatically.
Once you’ve switched to the Pyrefly type engine, you will see a Pyrefly icon at the bottom, which you can hover over to check the version being used.
Please note that the integration currently works for local interpreter configurations. Support for Docker, Docker Compose, WSL, SSH, and multi-module projects is planned for future releases.
Pyrefly vs. the built-in type engine
Now let’s look at how Pyrefly and the built-in type engine behave in a complex Python project. In this FastAPI example, multiple files are typed, but in this file, the variable ref is incorrectly typed, causing four errors. When using the built-in type engine, the IDE identifies that something is wrong, but it suggests running further analysis to fix the problem, which requires an extra step.
Using Pyrefly as the type engine, the IDE reports errors immediately and highlights where they originate. However, it is worth noting that, in our example, there are four errors, but Pyrefly picks up only three of them. It misses the one in self._storage[ref].
Download the latest version of PyCharm and try it out
Ready to experience a dramatic leap in Python development performance? The Pyrefly type engine in PyCharm 2026.1.2 delivers the next generation of type checking. Engineered in Rust for unparalleled speed, it resolves files in as little as 0.5–1 seconds, significantly faster than the built-in engine. If you maintain large, complex Python codebases and prioritize robust typing, this feature is essential, as it allows you to delegate analysis to a faster engine and receive immediate type-related diagnostics. Download the latest version of PyCharm (2026.1.2) to unlock superior efficiency, scalability, and code insight.
Real Python
The Real Python Podcast – Episode #295: Agentic Architecture: Why Files Aren't Always Enough
What are the limitations of using a file-based agent workflow? Why do massive context windows tend to collapse? This week on the show, Mikiko Bazeley from MongoDB joins us to discuss agentic architecture and context engineering.
[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]
Quiz: Python's Array: Working With Numeric Data Efficiently
In this quiz, you’ll test your understanding of Python’s Array: Working With Numeric Data Efficiently.
By working through this quiz, you’ll revisit the differences between Python’s array module and the built-in list, the meaning of type codes, how to create and manipulate arrays as mutable sequences, and the performance trade-offs of using a low-level numeric container.
[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]
EuroPython
May Newsletter: Sessions, Speakers, Sprints
Hi all Pythonistas! 👋
Hope you’ve been enjoying these last few weeks, and hopefully planning your trip to Kraków in July! With two months left before the conference, the EuroPython organising team has been firing on all cylinders to create a conference to remember. Here’s the latest from us:
📋 Session and Speaker Lists Are Available
Our Programme Team is busy preparing a detailed schedule for you. We plan to release it in the upcoming days, but in the meantime we’ve got the list of sessions and speakers for you to check out. It’s going to be an exciting conference!
Lists of sessions and speakers are available at https://ep2026.europython.eu/👉 All conference sessions: https://ep2026.europython.eu/sessions/
👉 Speakers and tutorial leads: https://ep2026.europython.eu/speakers/
🗻 Language & Rust Summits
Summits are an opportunity for project contributors to come together during EuroPython. These are invite-only events with limited capacity at the venue, so registration is required.
🐍 Language Summit
The Python Language Summit is an event for the developers of Python implementations (CPython, PyPy, MicroPython, GraalPython, IronPython, and so on) to share information, discuss our shared problems, and — hopefully — solve them.
These issues might be related to the language itself, the standard library, the development process, the status of Python 3.15 (and plans for 3.16), the documentation, packaging, the website, and so forth. The Summit focuses on discussions and consensus-seeking, more than merely on presentations.
👉 Register for the Language Summit: https://ep2026.europython.eu/language-summit/
⚙️ Rust Summit
This full-day summit is dedicated to exploring the intersection of Rust and the Python ecosystem. Attendees can expect an intensive schedule focused specifically on integrating Rust into Python projects and the development of high-performance Python tools (e.g., using technologies like PyO3, Maturin, or writing performant native extensions).
This summit is designed for developers who already possess some practical experience in these topics and are looking to deepen their expertise, share lessons learned, and contribute to the community&aposs collective knowledge.
👉 Register for the Rust Summit: https://ep2026.europython.eu/session/rust-summit-at-europython
🗣️ Keynote Speakers
We are excited to announce a new keynote:
Leah Wasser will deliver a keynote at EuroPython 2026Leah Wasser is the Executive Director and founder of pyOpenSci, a community of 400+ researchers, engineers, and maintainers working to make developing and maintaining research software more accessible, sustainable, and human. She organizes the Maintainers Summit at PyCon US and believes the communities behind research software matter as much as the code itself.
Leah has built nationally recognized programs at the National Ecological Observatory Network (NEON) and the University of Colorado Boulder. Leah holds a PhD in ecology and is an active open source maintainer.
✋ Upcoming Call for Volunteers
We&aposre opening our Call for Volunteers next week! Want to be part of the team and help make EuroPython 2026 awesome? Keep an eye on the website, the signup form drops in just a few days. We&aposll be reviewing applications on a rolling basis, so don&apost wait – apply as soon as it goes live! Whether you&aposre a first-timer or a returning volunteer, we&aposd love to have you.
In my opinion, volunteering enriches the enjoyment of the whole event even further. There are many different roles to suit different personalities and abilities — one of them could suit you very well. Also, volunteering is about the team; you will not be left alone in any case.
Jake Balas, Onsite Volunteers Team Lead at EuroPython 2025 and this year’s Operations Team Lead
💙 Read our full interview with Jake https://blog.europython.eu/humans-of-ep-jake/
💰 Sponsorship: Diamond, Platinum, Silver Available
If you&aposre passionate about supporting EuroPython and helping make this conference accessible to a diverse, global Python community, consider becoming a sponsor or asking your employer to join us in this effort.
By sponsoring EuroPython, you’re not just backing an event – you&aposre gaining highly targeted visibility that will present your company or personal brand to one of the largest and most diverse Python communities in the world! Here’s what one of our sponsors said about their experience at EuroPython 2025:
The Apify team shares their experience sponsoring EuroPython 2025
We still have some Diamond, Platinum, and Silver slots available. Along with our main packages, there are optional add-ons and extras to craft your brand messaging in exactly the way that you need.
👉 More information at: https://ep2026.europython.eu/sponsorship/sponsor/
👉 Contact us at sponsoring@europython.eu
🚧 Speaker Orientation
Anyone interested in receiving speaker training from our experienced mentors is invited to an online workshop on the 3rd June 2026, at 18:00 CEST. We’ve designed the session for people of all experience levels, from first time speakers to seasoned presenters, and we still have spots for you.
👉 Register now to confirm your place: https://forms.gle/uZKwuAiBkUSmx7gn7
🤝 Community Partners
🇪🇸PyConES
Barcelona is calling, Pythonistas! PyConES 2026 has extended its CFP. New deadline: 17 May, 23:59 CEST. If you’re still thinking about submitting a talk, workshop, or idea to the community which will meet up in that gorgeous city, you have last days.
👉 Submit the proposal for PyConES 2026 https://pretalx.com/pycones-2026/cfp
🦬PyStok
PyStok #82 meetup lands on 20 May, 18:00 at Zmiana Klimatu in Białystok, Poland, and free registration is officially live. Grab your spot at https://pystok.org/najblizsze-wydarzenie to dive deep into RAG/LLM Wiki and the PLLuM (Polish Large Language Model) project. Between the "speed dating" networking, JetBrains giveaways and the legendary "Podlaskie afterparty", it’s the perfect spot to soak up those unique North-East Polish vibes and talk Python and AI with the local crowd.
📣 Community Outreach
🏖️PyCon US
Several members of the EuroPython Society have traveled across the ocean to join the biggest gathering of Pythonistas, which this year takes place in Long Beach, California. If you’re there this weekend, make sure to look up the EuroPython booth and say “hi” to the team!
🎁 Sponsor Spotlight
We&aposd like to thank Manychat for sponsoring EuroPython.
Manychat builds AI-powered chat automation for 1M+ creators and brands at real production scale.
View job openings at Manychat👋 Stay Connected
Follow us on social media and subscribe to our newsletter for all the updates:
👉 Sign up for the newsletter: https://blog.europython.eu/portal/signup
- LinkedIn: https://www.linkedin.com/company/europython/
- X/Twitter: https://x.com/europython
- Mastodon: https://fosstodon.org/@europython
- Bluesky: https://bsky.app/profile/europython.eu
- Instagram: https://www.instagram.com/europython/
- YouTube: https://www.youtube.com/@EuroPythonConference
We’ll be announcing more keynotes in the upcoming days, and the detailed schedule will be available soon, so you can plan your conference experience. Just eight weeks are left before we all meet in the City of Castles and Dragons. See you there! 🐍❤️
Cheers,
The EuroPython Team
Sign up for EuroPython Blog
The official blog of everything & anything EuroPython! EuroPython 2026 13-19 July, Kraków
No spam. Unsubscribe anytime.
May 14, 2026
Real Python
Quiz: Cursor vs Windsurf: Which AI Code Editor Is Best for Python?
In this quiz, you’ll test your understanding of Cursor vs Windsurf: Which AI Code Editor Is Best for Python?
By working through these questions, you’ll revisit how the two editors differ across code completion, agentic multi-file editing, and debugging.
You’ll also reconnect with the audit points worth applying whenever an AI agent writes Python on your behalf.
[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]
Quiz: Python Metaclasses
In this quiz, you’ll test your understanding of Python Metaclasses.
Metaclasses sit behind every class you write in Python, and they’re one of the language’s deeper object-oriented concepts. By working through this quiz, you’ll revisit how classes are themselves objects, how type creates them, and how a custom metaclass lets you customize class creation.
You’ll also reflect on when a custom metaclass is actually the right tool and when a simpler technique does the job better.
[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]
Python Engineering at Microsoft
PyCon US 2026
Come See Us at PyCon US 2026!
Microsoft and GitHub will be at PyCon US 2026, May 14–17 in Long Beach, CA. Stop by our booth, say hello, and tell us about your experience with our tools and services. We’d love to meet you.
Don’t miss the Meta booth on Saturday at 1 p.m., where we’ll be showing off the integration of Pylance with Meta’s new Pyrefly type checker. The integration is currently in early preview in our Insiders build, and we can’t wait to bring it to all our users later this year.
Hands-on Labs at the Booth
Drop in for 10-minute interactive labs covering:
- GitHub Copilot
- Azure DocumentDB
- Microsoft Foundry
- Microsoft Agent Framework
- Azure PostgreSQL
- Azure AI Search
Talks and Sessions
| Date & Time | Room | Session | Speaker |
|---|---|---|---|
| Wed, May 13 · 9:00 a.m.–12:30 p.m. | 101A | Build your first MCP server in Python | Pamela Fox |
| Wed, May 13 · 1:30 p.m.–2:30 p.m. | 201B | Dungeons and Databases: Build NPC agents to work with data in DocumentDB and Postgres (Microsoft Sponsor session) | Marko Hotti, Patty Chow |
| Thu, May 14 · 2:40 p.m.–3:05 p.m. | 104C | Education Summit: Big Lessons from Small Models, Teaching Python AI with SLMs | Gwyneth Peña-Siguenza |
| Thu, May 14 · 3:40 p.m.–4:05 p.m. | 104C | Education Summit: Your Slides, But Faster, Building an AI-powered presentation workflow | Pamela Fox |
| Fri, May 15 · 3:30 p.m.–4:00 p.m. | 104C | PyCharlas: Cómo pasé de perdida a enseñar Python + IA a miles, en un año | Gwyneth Peña-Siguenza |
| Sat, May 16 · 2:30 p.m.–3:45 p.m. | 201A | Maintainer Summit Tools Track: Dev Containers | Sarah Kaiser |
| Sun, May 17 · 1:00 p.m.–1:30 p.m. | Grand Ballroom A | A bridge over (not) troubled waters: Collecting marine data from your couch | Sarah Kaiser |
Can’t wait to see you there!
The post PyCon US 2026 appeared first on Microsoft for Python Developers Blog.
