
Planet Python

Last update: May 20, 2026 04:44 PM UTC

May 20, 2026


death and gravity

reader 3.24 released – help, multi-user updates

Hi there!

I'm happy to announce version 3.24 of reader, a Python feed reader library.

What's new?

Here are the highlights since reader 3.23.

Context-sensitive help

In lieu of a tutorial mode, the web app now offers guidance to new users, and has a basic context-sensitive help system. Here are some screenshots:

[Screenshots: new user / empty state; context-sensitive help; also help]

Structured logging

reader now uses structured logging internally, through structlog.

By default, output goes to stdlib logging, but you can opt into structlog-native logging:

import reader, structlog
reader.enable_structlog()
structlog.configure(...)

This was relatively challenging to do, since as a library, you cannot configure logging, nor change any global state. I hope I can contribute a variant of the solution upstream, but meanwhile here's a recipe you can use in your library (warning: brittle code).

Make update_feeds() parallel again

It turns out the "extensive rework of the parser internal API" from 3.15 caused update_feeds() to retrieve feeds in the main thread regardless of the worker count.

Protip

If you have a parallel map() that returns @contextmanagers, make sure the work you need to do in parallel doesn't happen in __enter__. 😅
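The pitfall in the protip can be reproduced with the standard library alone. This is a minimal sketch, not reader's actual code: fetch(), lazy_retrieve(), eager_retrieve(), and run() are hypothetical names made up for the illustration. It records which thread does the "slow" work in each variant.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from contextlib import contextmanager


def fetch(url):
    # Stand-in for slow network I/O; reports which thread did the work.
    return url, threading.current_thread().name


@contextmanager
def lazy_retrieve(url):
    # Everything before `yield` runs in __enter__, that is, in whichever
    # thread executes the `with` statement, not in the pool worker.
    yield fetch(url)


def eager_retrieve(url):
    # The slow work happens here, inside the pool worker ...
    result = fetch(url)

    @contextmanager
    def handle():
        # ... and __enter__ only hands over the finished result.
        yield result

    return handle()


def run(retrieve, urls):
    """Map `retrieve` over `urls` in a pool; return the thread that did each fetch."""
    thread_names = []
    with ThreadPoolExecutor(max_workers=4, thread_name_prefix="worker") as pool:
        for manager in pool.map(retrieve, urls):
            with manager as (url, thread_name):
                thread_names.append(thread_name)
    return thread_names
```

With lazy_retrieve, every fetch is attributed to the thread that consumes the results, so the pool adds nothing. With eager_retrieve, the fetches are attributed to the worker threads.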

New contributors

Thank you to the new contributors that submitted pull requests to this release!

Want to contribute? Check out the docs and the roadmap.

Hosted reader status update

As I said last time, I'm working on a hosted version of reader. Background: Why another feed reader web app?, Why not just self-host it?.

Multi-user feed updates

One of the bigger changes for hosted reader was handling multi-user feed updates.

For intentional but questionable reasons, users have their own dedicated databases, with the web app routing to the appropriate one based on session information.

However, updating feeds should happen in a single, shared database; this allows:

This is now done, complete with a design document (to be published). As a teaser, here's a neat architecture / data flow diagram:

[Architecture / data flow diagram. Labels include: user, nginx, Flask app, auth, auth.sqlite, user@1.sqlite, user@2.sqlite, ..., shared.sqlite, feeds, public, private, email, and the note "yes, it's web scale ಠ_ಠ"]

OK, so what now?

Since I'm rapidly running out of technical things to do, a launch is imminent.

This is what is finished so far:

Remaining work to an MVP:

Meanwhile, if this sounds like something you'd like to use, get in touch.


That's it for now. For more details, see the full changelog.

Learned something new today? Share it with others, it really helps!

What is reader?

reader takes care of the core functionality required by a feed reader, so you can focus on what makes yours different.

reader allows you to:

...all these with:

To find out more, check out the GitHub repo and the docs, or give the tutorial a try.

Why use a feed reader library?

Have you been unhappy with existing feed readers and wanted to make your own, but:

Are you already working with feedparser, but:

... while still supporting all the feed types feedparser does?

If you answered yes to any of the above, reader can help.

The reader philosophy

May 20, 2026 04:44 PM UTC


Real Python

How to Use the Claude API in Python

The fastest way to use the Claude API in Python is to install anthropic, set your API key, and call client.messages.create(). You’ll have a working response in under a minute:

[Image: Example of using the Claude API in Python]

Claude is Anthropic’s large language model, accessible via a clean REST API with an official Python SDK. Unlike heavier AI frameworks that require you to wire up multiple components before you see any output, the anthropic package gets you to a working response in a handful of lines.

In the following steps, you’ll install the anthropic SDK, call Claude from Python, shape Claude’s behavior with a system prompt, and then return structured JSON output using a schema or Pydantic.

Note: Claude’s responses are non-deterministic, so the same prompt produces different output each time, which is expected for a large language model. Also, API calls cost money based on the number of tokens processed. Keep an eye on your usage in the Claude Console as you follow along.

Each step builds on the last, and the final script is short enough to read in one sitting but complete enough to extend into a real application of your own.

Get Your Code: Click here to download the free sample code that shows you how to use the Claude API in Python.

Take the Quiz: Test your knowledge with our interactive “How to Use the Claude API in Python” quiz. You’ll receive a score upon completion to help you track your learning progress:


Interactive Quiz

How to Use the Claude API in Python

Test your understanding of using the Claude API in Python. Send prompts, set system instructions, and return structured JSON with a schema.

Prerequisites

Before diving in, make sure you have the following in place:

  • Python knowledge: You should be comfortable with Python basics, like defining functions, running scripts from the terminal, and working with virtual environments. If virtual environments are new to you, Python Virtual Environments: A Primer has you covered before you continue.

  • Python 3.9 or higher: The anthropic SDK requires Python 3.9 as a minimum. If you’re not sure which version you have, run python --version in your terminal. If you need to install or upgrade, follow the steps in the guide on installing Python.

  • An Anthropic account: You’ll need an Anthropic account to generate an API key in the Claude Console. Step 1 will show you how to find and secure your key once you’re in.
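The version requirement above can also be checked from inside Python, which is handy when several interpreters are installed. This is a small sketch; meets_minimum() is a hypothetical helper name, and the (3, 9) floor is taken from the prerequisite list.

```python
import sys

# Minimum version stated in the prerequisites above.
MINIMUM = (3, 9)


def meets_minimum(version=None, minimum=MINIMUM):
    """Return True if `version` (default: the running interpreter) is new enough."""
    version = sys.version_info if version is None else version
    # Compare only (major, minor); the patch level does not matter here.
    return tuple(version[:2]) >= tuple(minimum)


if meets_minimum():
    print("Python version is new enough.")
else:
    print("Please upgrade Python before installing the SDK.")
```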

Don’t worry if you’ve never worked with an API before. This tutorial will walk you through authentication and help you make your first request from scratch.

Step 1: Set Up the Claude API in Python

Before you can call Claude from Python, you need an API key and the anthropic package installed. By the end of this step, you’ll have both, and Claude will be responding to your first prompt.

Get Your API Key and Install anthropic

Log in to the Claude Console or create a new account. If you’re starting fresh, you can begin using the API after adding $5 of credits.

Then navigate to the API Keys section. Click Create Key, give it a descriptive name like real-python-tutorial, and copy it immediately. You won’t see it again after you close the dialog.

Note: Never paste your API key directly into your code. Instead, store it as an environment variable. The anthropic SDK automatically reads it from ANTHROPIC_API_KEY at runtime, so you never need to reference it explicitly in your scripts.

Storing your key as an environment variable means it never touches your source code or version control history. The exact command depends on your operating system:

Language: PowerShell Script
PS> $env:ANTHROPIC_API_KEY="your-api-key-here"
Language: Shell
$ export ANTHROPIC_API_KEY="your-api-key-here"
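To confirm that the variable is visible to Python before going further, a quick check like the following can help. It uses only the standard library. get_api_key() and mask() are illustrative helper names and are not part of the anthropic SDK.

```python
import os


def get_api_key(env=None):
    """Return the key from ANTHROPIC_API_KEY, or raise a readable error."""
    env = os.environ if env is None else env
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError("ANTHROPIC_API_KEY is not set in this shell session.")
    return key


def mask(key):
    # Keep only the last four characters visible so the key stays out of logs.
    return "*" * max(len(key) - 4, 0) + key[-4:]


if __name__ == "__main__":
    print("Found key:", mask(get_api_key()))
```

If the script raises RuntimeError, the variable was probably set in a different terminal session than the one running Python.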

With your API key stored safely, you’re ready to install the SDK. Create a fresh virtual environment and activate it before installing anything. This isolation prevents the anthropic package from conflicting with your system-level tools.

Language: PowerShell Script
PS> python -m venv venv
PS> venv\Scripts\activate
(venv) PS> python -m pip install anthropic
Language: Shell
$ python -m venv venv/
$ source venv/bin/activate
(venv) $ python -m pip install anthropic

Send Your First Prompt

Read the full article at https://realpython.com/claude-api-python/ »


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

May 20, 2026 02:00 PM UTC

Quiz: How to Use the Claude API in Python

In this quiz, you’ll test your knowledge of How to Use the Claude API in Python.

By working through this quiz, you’ll revisit how to install the anthropic SDK, send prompts to Claude with client.messages.create(), shape responses with a system parameter, and return structured JSON output using a schema or Pydantic.



May 20, 2026 12:00 PM UTC


Python GUIs

Adding QComboBox to a QTableView and getting/setting values after creation — Use QItemDelegate to embed combo boxes in your table views, with per-row data and value tracking

I'm using a QTableView to display data, and would like to limit the choices in some of the fields using a drop-down. I can use QComboBox to provide a list of choices in a normal UI, but how can I do that in a table view?

When you're working with QTableView in PyQt6, you'll sometimes want cells that offer a dropdown selection instead of plain text. A QComboBox is the natural fit here — but embedding one inside a table view takes a bit of wiring up.

In this tutorial, we'll walk through how to use a QItemDelegate to place a QComboBox into specific cells of a QTableView. We'll also cover how to populate each combo box with different items per row, and how to retrieve the selected value so you can use it elsewhere in your application.

How delegates work in Qt's Model/View framework

Qt's Model/View architecture separates your data (the model) from how it's displayed (the view). Between these two sits the delegate, which controls how individual cells are rendered and edited. When you want a cell to use a widget like a combo box instead of a plain text editor, you create a custom delegate.

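The delegate has a few methods you'll override:

  • createEditor(): creates the editor widget (here, the QComboBox) for a cell.
  • setEditorData(): sets the editor to the cell's current value from the model.
  • setModelData(): writes the value chosen in the editor back to the model.
  • updateEditorGeometry(): sizes and positions the editor over the cell.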

Let's build this up step by step.

Setting up the model and view

First, let's create a simple application with a QTableView and a QStandardItemModel. Each row will represent a software package, and one of the columns will hold a list of available versions. We'll store those version lists directly in the model data, so each row can have its own set of options.

python
import sys
from PyQt6.QtWidgets import (
    QApplication, QMainWindow, QTableView, QComboBox, QItemDelegate,
)
from PyQt6.QtGui import QStandardItemModel, QStandardItem
from PyQt6.QtCore import Qt


class MainWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        self.setWindowTitle("QComboBox in QTableView")

        self.table = QTableView()
        self.setCentralWidget(self.table)

        # Create a model with 3 rows and 2 columns.
        self.model = QStandardItemModel(3, 2)
        self.model.setHorizontalHeaderLabels(["Package", "Version"])

        # Each row has a package name and a list of available versions.
        packages = [
            ("Widget Library", ["1.0", "1.1", "2.0", "2.1"]),
            ("Data Toolkit", ["0.9", "1.0"]),
            ("Render Engine", ["3.0", "3.1", "3.2", "4.0"]),
        ]

        for row, (name, versions) in enumerate(packages):
            # Column 0: package name (plain text).
            self.model.setItem(row, 0, QStandardItem(name))

            # Column 1: store the version list in the item's data.
            # We use Qt.ItemDataRole.UserRole to keep the full list alongside the display text.
            item = QStandardItem(versions[-1])  # Display the latest version by default.
            item.setData(versions, Qt.ItemDataRole.UserRole)
            self.model.setItem(row, 1, item)

        self.table.setModel(self.model)

        # Apply our custom delegate to column 1.
        delegate = ComboDelegate(self.table)
        self.table.setItemDelegateForColumn(1, delegate)

        self.resize(400, 200)

Notice how we store the list of versions using Qt.ItemDataRole.UserRole. This is a custom data role — it lets us attach extra information to a model item without interfering with the text that's displayed (which uses Qt.ItemDataRole.DisplayRole). Each row gets its own version list, so when the combo box opens, it will show only the versions relevant to that row.

Creating the combo box delegate

Now let's write the ComboDelegate class. This is where the combo box gets created and connected to the model.

python
class ComboDelegate(QItemDelegate):
    """
    A delegate that places a QComboBox in cells of the assigned column.
    """

    def createEditor(self, parent, option, index):
        # Create the combo box and populate it with the version list for this row.
        combo = QComboBox(parent)
        versions = index.data(Qt.ItemDataRole.UserRole)
        if versions:
            combo.addItems(versions)
        return combo

    def setEditorData(self, editor, index):
        # Set the combo box to show the currently selected value.
        current_text = index.data(Qt.ItemDataRole.DisplayRole)
        idx = editor.findText(current_text)
        if idx >= 0:
            editor.setCurrentIndex(idx)

    def setModelData(self, editor, model, index):
        # Write the selected value back into the model.
        model.setData(index, editor.currentText(), Qt.ItemDataRole.DisplayRole)

    def updateEditorGeometry(self, editor, option, index):
        editor.setGeometry(option.rect)

Let's walk through each method:

createEditor() is called when the user double-clicks (or otherwise activates) a cell in column 1. We create a fresh QComboBox, pull the version list from Qt.ItemDataRole.UserRole for that specific row, and add those items to the combo box. Because each row stores its own list, different rows will show different options.

setEditorData() makes sure the combo box starts with the right item selected. We read the current display text from the model and find the matching entry in the combo box.

setModelData() fires when the user finishes editing (for example, by clicking away from the cell). It takes whatever the user selected in the combo box and writes it back into the model's DisplayRole.

updateEditorGeometry() simply ensures the combo box fills the cell neatly.

Running the application

Add the standard entry point at the bottom of your script:

python
app = QApplication(sys.argv)
window = MainWindow()
window.show()
sys.exit(app.exec())

Run the script and double-click any cell in the "Version" column. You'll see a combo box appear with the version options for that specific row. Select a value, click away, and the cell updates.

QTableView with combo box delegates showing per-row version lists

Getting the selected value

After the user makes a selection, the value is stored in the model. You can read it at any time:

python
# Read the selected version for row 0.
selected = self.model.item(0, 1).text()
print(f"Row 0 selected version: {selected}")

If you want to react immediately when a selection changes, you can connect to the model's dataChanged signal. If you're new to how signals work in Qt, see our guide on signals, slots and events:

python
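# In MainWindow.__init__(), after the model has been created:
self.model.dataChanged.connect(self.on_data_changed)

# As a method on MainWindow:
def on_data_changed(self, top_left, bottom_right, roles):
    # Only react to changes in the "Version" column.
    if top_left.column() == 1:
        row = top_left.row()
        value = top_left.data(Qt.ItemDataRole.DisplayRole)
        print(f"Row {row} version changed to: {value}")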

This approach keeps things nicely separate — you're working through the model rather than trying to hold references to individual combo box widgets. The combo boxes are created and destroyed as the user interacts with cells.

Setting a value programmatically

To change a cell's value from code, update the model directly:

python
# Set row 2's version to "3.1".
self.model.item(2, 1).setText("3.1")

The next time the user opens the combo box on that row, the delegate's setEditorData() will position the combo box on "3.1".

You can also update the list of available versions for a row:

python
# Add a new version to row 1's options.
item = self.model.item(1, 1)
versions = item.data(Qt.ItemDataRole.UserRole)
versions.append("1.1")
item.setData(versions, Qt.ItemDataRole.UserRole)

Why each row gets its own combo box items

A common stumbling block is ending up with the same items in every combo box across the column. This happens when you store the item list on the delegate itself (as a single shared list) rather than on the model. Since the delegate is shared across all rows, any list stored on it will be the same everywhere.

The solution, as we've done here, is to store per-row data in the model using Qt.ItemDataRole.UserRole. Each call to createEditor() reads from the specific index it's given, so each row naturally gets its own set of options. This is a pattern you'll use often when different rows need different editor configurations.

Complete code

Here's the full working example in one block:

python
import sys
from PyQt6.QtWidgets import (
    QApplication, QMainWindow, QTableView, QComboBox, QItemDelegate,
)
from PyQt6.QtGui import QStandardItemModel, QStandardItem
from PyQt6.QtCore import Qt


class ComboDelegate(QItemDelegate):
    """
    A delegate that places a QComboBox in cells of the assigned column.
    """

    def createEditor(self, parent, option, index):
        combo = QComboBox(parent)
        versions = index.data(Qt.ItemDataRole.UserRole)
        if versions:
            combo.addItems(versions)
        return combo

    def setEditorData(self, editor, index):
        current_text = index.data(Qt.ItemDataRole.DisplayRole)
        idx = editor.findText(current_text)
        if idx >= 0:
            editor.setCurrentIndex(idx)

    def setModelData(self, editor, model, index):
        model.setData(index, editor.currentText(), Qt.ItemDataRole.DisplayRole)

    def updateEditorGeometry(self, editor, option, index):
        editor.setGeometry(option.rect)


class MainWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        self.setWindowTitle("QComboBox in QTableView")

        self.table = QTableView()
        self.setCentralWidget(self.table)

        self.model = QStandardItemModel(3, 2)
        self.model.setHorizontalHeaderLabels(["Package", "Version"])

        packages = [
            ("Widget Library", ["1.0", "1.1", "2.0", "2.1"]),
            ("Data Toolkit", ["0.9", "1.0"]),
            ("Render Engine", ["3.0", "3.1", "3.2", "4.0"]),
        ]

        for row, (name, versions) in enumerate(packages):
            self.model.setItem(row, 0, QStandardItem(name))
            item = QStandardItem(versions[-1])
            item.setData(versions, Qt.ItemDataRole.UserRole)
            self.model.setItem(row, 1, item)

        self.table.setModel(self.model)

        delegate = ComboDelegate(self.table)
        self.table.setItemDelegateForColumn(1, delegate)

        # React to changes.
        self.model.dataChanged.connect(self.on_data_changed)

        self.resize(400, 200)

    def on_data_changed(self, top_left, bottom_right, roles):
        if top_left.column() == 1:
            row = top_left.row()
            value = top_left.data(Qt.ItemDataRole.DisplayRole)
            print(f"Row {row} version changed to: {value}")


app = QApplication(sys.argv)
window = MainWindow()
window.show()
sys.exit(app.exec())

Wrapping up

Using a custom QItemDelegate gives you full control over how cells in a QTableView are edited. By storing per-row data in the model with Qt.ItemDataRole.UserRole, you can give each combo box its own set of items — solving the common problem of all combo boxes showing the same options.

The pattern here — store data in the model, read it in the delegate, write changes back to the model — works well beyond combo boxes. You can use the same approach to embed spin boxes, date pickers, or any other widget into your table cells. Once you're comfortable with this flow, you'll find Qt's Model/View framework surprisingly flexible. For a deeper dive into using QTableView with real-world data sources like NumPy and Pandas, see our QTableView with numpy and pandas tutorial. You can also explore how to make table cells editable for other common editing patterns.

For an in-depth guide to building Python GUIs with PyQt6 see my book, Create GUI Applications with Python & Qt6.

May 20, 2026 06:00 AM UTC

May 19, 2026


PyCoder’s Weekly

Issue #735: Agentic Architecture, Python is Weird, 3.15, and More (2026-05-19)

#735 – MAY 19, 2026


Agentic Architecture: Why Files Aren’t Always Enough

What are the limitations of using a file-based agent workflow? Why do massive context windows tend to collapse? This week on the show, Mikiko Bazeley from MongoDB joins us to discuss agentic architecture and context engineering.
REAL PYTHON podcast

Python Is Weird

Here is a collection of things that surprised Maciej about Python. Some you might know, and some might surprise you too.
MACIEJ KOWALSKI

Harness Orchestration: The Next Primitive for AI Agents


A Python SDK that lets you compose Claude Code, Codex, and Gemini as one autonomous harness - agents become FastAPI-style routes you can wire, version, and deploy. Open source. Fork SWE-AF (a 100+ agent software factory) or our cloud-security harness as starter kits. Clone a Recipe →
AGENTFIELD sponsor

Python 3.15: Features That Didn’t Make the Headlines

Every release includes changes that don’t make the headlines. Here are a few from the upcoming Python 3.15 release.
CHANGS.CO.UK • Shared by Jamie Chang

Python 3.15.0 Beta 1 Released

PYTHON.ORG

Python 3.14.5 Released

PYTHON.ORG

Announcing PSF Community Service Award Recipients

PYTHON SOFTWARE FOUNDATION

PEP 830: Add Timestamps to Exceptions and Tracebacks (Deferred to 3.16)

PYTHON.ORG

PEP 788: Protecting the C API From Interpreter Finalization (Final)

PYTHON.ORG

PEP 813: The Pretty Print Protocol (Deferred to 3.16)

PYTHON.ORG

2026 Django Developers Survey

DJANGO SOFTWARE FOUNDATION

DjangoCon US 2026 Tickets Available

DJANGOCON.US • Shared by Aayush Gauba

Articles & Tutorials

PyCon US 2026 Typing Summit Recap

Per-talk notes from the PyCon US 2026 Typing Summit. Includes info on: Pyrefly and AI agents, ty constraint sets, Lean formalization, tensor shape types, intersection types, PEP 827, Guido on the direction of typing, and the Typing Council Q&A.
BERNÁT GÁBOR

Event Sourcing Design Pattern

Talk Python interviews Chris May and they discuss the event sourcing design pattern: a mechanism for databases to work like git with immutable, replayable events. Learn what libraries help you do this in Python and when to use the pattern.
TALK PYTHON podcast

Strategic Planning at the PSF

The Python Software Foundation Board has been developing a strategic plan to guide the foundation’s direction over the next five years. This post describes the process and future goals.
PYTHON SOFTWARE FOUNDATION

How Python’s GIL Actually Works (And When It Bites You)

This post explains how Python’s GIL limits the amount of concurrency you can get through threading alone, why it is there, and how it is changing as Python evolves.
ATHREYA AKA MANESHWAR

Concurrency: A Deep Dive Into Multithreading With Python

“This article explains concurrency in Python including topics like multithreading, multiprocessing, race conditions, and synchronization mechanisms such as locks.”
NIKOS VAGGALIS

Shipping Django as a Desktop App

This is a summary of Jochen Wersdörfer’s talk at DjangoCon EU where he outlined how his team used Electron to turn a Django project into an installable app.
REINOUT VAN REES

Pydantic Forks httpx

The Pydantic team has forked httpx and named it httpx2. The folks who created httpxyz have decided to let the larger organization take the reins.
MICHIEL BEIJEN

How to Flatten a List of Lists in Python

Learn how to flatten a list of lists in Python using for loops, list comprehensions, itertools, functools, NumPy, and recursion.
REAL PYTHON

Quiz: How to Flatten a List of Lists in Python

REAL PYTHON

Building Type-Safe LLM Agents With Pydantic AI

Build type-safe LLM agents in Python with Pydantic AI using structured outputs, function calling, and dependency injection.
REAL PYTHON course

Pyrefly v1.0 Is Here!

Pyrefly has reached stable version 1.0. Read about the new features and how to get started.
PYREFLY.ORG

Projects & Code

kubex: Python Asynchronous Client for Kubernetes

GITHUB.COM/CODEMAGEDDON

gh-profiler: Examine GitHub User’s Profile

GITHUB.COM/EHMATTHES

presidio: Detect, Redact, & Anonymize Sensitive Data (PII)

GITHUB.COM/MICROSOFT

fotomagoufis: CLI Photo Correction Tool

GITHUB.COM/DIMATOSJ

DiffSinger: Advanced Singing Voice Synthesis

GITHUB.COM/OPENVPI

Events

Weekly Real Python Office Hours Q&A (Virtual)

May 20, 2026
REALPYTHON.COM

PyData Bristol Meetup

May 21, 2026
MEETUP.COM

PyLadies Dublin

May 21, 2026
PYLADIES.COM

Python Sheffield

May 26, 2026
GOOGLE.COM

PyCon Italia 2026

May 27 to May 31, 2026
PYCON.IT

Python Southwest Florida (PySWFL)

May 27, 2026
MEETUP.COM


Happy Pythoning!
This was PyCoder’s Weekly Issue #735.

[ Subscribe to 🐍 PyCoder’s Weekly 💌 – Get the best Python news, articles, and tutorials delivered to your inbox once a week >> Click here to learn more ]

May 19, 2026 07:30 PM UTC


Real Python

Tapping Into the Zen of Python

The Zen of Python is a collection of 19 aphorisms that capture the guiding principles behind Python’s design. You can display them anytime by running import this in a Python REPL. Tim Peters wrote them in 1999 as a joke, but they became an iconic part of Python culture that was even formalized as PEP 20.
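The count of 19 can be checked programmatically. The sketch below relies on an implementation detail of CPython's this module, namely that the text is stored ROT13-encoded in the attribute this.s. That attribute is not a documented API, so treat it as an assumption that holds for CPython.

```python
import codecs
import contextlib
import io

# Importing `this` prints the poem, so capture stdout to keep the output tidy.
with contextlib.redirect_stdout(io.StringIO()):
    import this

# `this.s` holds the ROT13-encoded text (CPython implementation detail).
poem = codecs.decode(this.s, "rot13")

# The first line is the title; the remaining non-empty lines are the aphorisms.
title, *rest = poem.splitlines()
aphorisms = [line for line in rest if line.strip()]

print(title)
print(len(aphorisms))
```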

By the end of this video course, you’ll understand:

Experienced Pythonistas often refer to the Zen of Python as a source of wisdom and guidance, especially when they want to settle an argument about certain design decisions in a piece of code. In this video course, you’ll explore the origins of the Zen of Python, learn how to interpret its mysterious aphorisms, and discover the Easter eggs hidden within it.

You don’t need to be a Python master to understand the Zen of Python! But you do need to answer an important question: What exactly is the Zen of Python?



May 19, 2026 02:00 PM UTC

Quiz: Absolute vs Relative Imports in Python

In this quiz, you’ll test your understanding of Absolute vs Relative Imports in Python.

By working through this quiz, you’ll revisit how Python’s import system resolves modules, the differences between absolute and relative imports, and the PEP 8 conventions for styling import statements.



May 19, 2026 12:00 PM UTC

Quiz: Tapping Into the Zen of Python

In this quiz, you’ll test your understanding of Tapping Into the Zen of Python.

By working through this quiz, you’ll revisit the origins of the poem, the meaning of several aphorisms, and the inside jokes hidden throughout.

The questions explore how the principles apply in practice and when it’s okay to bend the rules in the name of practicality.



May 19, 2026 12:00 PM UTC


PyCharm

LLM Evaluation and AI Observability for Agent Monitoring

This is a guest post from Naa Ashiorkor, a data scientist and tech community builder.

Artificial intelligence keeps evolving at a rapid pace. The latest major application of AI, specifically of LLMs, is AI agents. These are systems that use their perception of their environment, processes, and input to take action to achieve specific goals, and they are built on LLMs. 

Increasingly, complex AI agents are being used in real-world applications. While simpler agentic applications that use only one agent to achieve a goal still exist, organizations are now shifting towards multi-agent systems that use multiple subagents coordinated by a main agent. These are more adaptable and can mimic human teams when it comes to performing specialized tasks such as data analysis, compliance, customer support, and more. The reasoning and autonomy of AI agents have improved; consequently, they can gather data, conduct cross-references, and generate analysis.

As we move towards these complex, real-world applications of agents, an ever-stronger spotlight is being shone both on how we observe AI agents and how we evaluate the LLMs they’re built upon. The complexity, interactions, and autonomous processes under the surface of AI agents make rigorous monitoring and assessment an essential part of building and maintaining these applications. LLM evaluation determines if the AI agent can work, while AI agent observability determines if it is working. LLM evaluation tests an agent’s basic capabilities before and during deployment, while agent observability provides deep, real-time visibility into an agent’s internal reasoning and operational health once it is live. Relying on only one of the two leaves a blind spot and is a recipe for failure.

In this blog post, we’ll explore how to evaluate agents using advanced metrics and observability tools. It’s designed as a practical, end-to-end reference for teams that want to move beyond demos and actually run AI agents in live, real-world environments, avoiding the common pitfalls that cause failure in production.

Core LLM evaluation metrics for modern AI systems

As LLMs are now applied to a wide range of use cases, it is important that their evaluation covers both the tasks they may perform and their potential risks. Evaluation metrics give a better understanding of the strengths and weaknesses of LLMs, influence the guidance of human-LLM interactions, and highlight the importance of ensuring LLM safety and reliability. Hence, LLM evaluation metrics for assessing the performance of an LLM are indispensable in modern AI systems. Without well-defined evaluation metrics, assessing model quality becomes subjective. 

There are several key evaluation metrics, each with a different purpose, and the table below provides a summary of some of them.

Evaluation metric | What the metric evaluates
Hallucination rate | Factual accuracy and truthfulness of generated content
Toxicity scores | Harmful, offensive, or inappropriate content
RAGAS (Retrieval Augmented Generation Assessment) | Measures whether the RAG system retrieves the right documents and generates answers that are faithful to those sources
DeepEval | Tests everything from basic accuracy and safety to complex agent behaviors and security vulnerabilities across the entire LLM application

Hallucination rate

Hallucinations are LLM outputs that seem convincing yet are factually unsupported. They can be categorized as either intrinsic, where the output contradicts the source content, or extrinsic, where it simply cannot be verified. They can stem from a range of factors across data, training, and inference: quality issues in the large datasets used for initial training and in the data used to fine-tune model behavior, post-training techniques that make models overly eager to provide responses, and imperfect decoding strategies at inference. Because hallucination is an unsolved challenge cutting across every stage of model development, measuring and assessing it remains a vital part of LLM evaluation.
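As a rough illustration of how a hallucination rate could be computed once each response has been judged, here is a minimal sketch. The three labels mirror the intrinsic/extrinsic split described above. The function name, the label strings, and the sample data are assumptions made for this example, and the judgments themselves would have to come from human review or an automated checker.

```python
from collections import Counter

# Hypothetical label set: one judgment per model response.
LABELS = ("supported", "intrinsic", "extrinsic")


def hallucination_report(judgments):
    """Return the overall and per-category hallucination rates."""
    if not judgments:
        raise ValueError("need at least one judgment")
    unknown = set(judgments) - set(LABELS)
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")

    counts = Counter(judgments)
    total = len(judgments)
    return {
        "hallucination_rate": (counts["intrinsic"] + counts["extrinsic"]) / total,
        "intrinsic_rate": counts["intrinsic"] / total,
        "extrinsic_rate": counts["extrinsic"] / total,
    }


# Made-up sample: five responses, two of them judged as hallucinations.
sample = ["supported", "intrinsic", "supported", "extrinsic", "supported"]
report = hallucination_report(sample)
print(report)
```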

There is a wide variety of techniques for detecting hallucinations. These include: 

There are several metrics for hallucination detection. Some of the most commonly used metrics include:

PyCharm’s Hugging Face integration lets you discover evaluation models and datasets without leaving the IDE. Use the Insert HF Model feature to search for hallucination or toxicity classifiers, and hover over any model or dataset name in your code to instantly preview its model card, including training data, intended use, and limitations. This means you can import a dataset, evaluate your LLM, and verify the tools you’re using, all from one place.

[Screenshot: Opening the Hugging Face model browser in PyCharm from the Code menu, then selecting Insert HF Model.]
[Screenshot: Searching for a specific hallucination model and selecting one. Use Model inserts a ready-to-use code snippet into the editor.]
[Screenshot: A ready-to-use code snippet of the Vectara hallucination evaluation model is inserted into the editor.]
[Screenshot: Hovering over the Vectara hallucination evaluation model in the code to preview its model card within PyCharm.]

Trust is imperative in the acceptance and adoption of technology. Trust in AI is especially important in areas such as healthcare, finance, personal assistance, autonomous vehicles, and others. Hallucinations have a huge impact on users’ trust in LLMs.

In 2023, a story went viral about a Manhattan lawyer who submitted a legal brief largely generated by ChatGPT. The brief cited court cases that did not exist, a clear sign of hallucination, and the problem surfaced once the cited cases could not be found. Incidents like this highlight the real-world risks of LLM errors and their impact on user trust. As people encounter more examples of hallucination, skepticism around LLM reliability continues to grow.

Toxicity scores

LLMs that have been pretrained on large datasets from the web have a tendency to generate harmful, offensive, and disrespectful content, including toxic language such as hate speech, harassment, threats, and biased language, which undermines their safe deployment. Toxicity detection is the process of identifying and flagging toxic content by integrating open-source tools or APIs into the LLM workflow to analyze both the user input and the LLM output. Several toxicity tools are available. The OpenAI Moderation API is free, works with any text, and is quick to implement. Perspective API by Google is also widely used and has a transparent methodology, but will no longer be in service after 2026. Detoxify is open source, has no API costs, and is Python-friendly. Azure AI Content Safety by Microsoft is customizable and best suited to enterprise deployments and existing Azure users. Hugging Face toxicity models offer many model options and easy integration with Transformers.

Toxicity detection has become a standard guardrail, especially in public-facing applications. It prevents toxic content from reaching users, which protects both individuals and organizations. In public-facing applications, toxicity detection operates through input filtering, output monitoring, and real-time scoring. Input filtering blocks attacks where users intentionally steer the AI toward toxic content through coordinated toxic inputs. Output monitoring keeps toxic content from reaching the user, even if the underlying AI produces it. Real-time scoring lets systems adjust their behavior dynamically based on conversation content and escalating risks. Unguarded AI can be exploited, which leads to reputational damage.
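
The input-filtering and output-monitoring flow can be sketched as a thin wrapper around any scoring function. This is a minimal illustration, not a production guardrail: `toy_toxicity_score`, `BLOCKLIST`, and the 0.5 threshold are hypothetical stand-ins, and in practice the scorer would be one of the tools listed earlier.

```python
from typing import Callable

# Hypothetical stand-in for a real toxicity classifier: returns a score in [0, 1].
BLOCKLIST = {"idiot", "stupid"}


def toy_toxicity_score(text: str) -> float:
    words = [w.strip(".,!?").lower() for w in text.split()]
    if not words:
        return 0.0
    flagged = sum(1 for w in words if w in BLOCKLIST)
    return min(1.0, flagged / len(words) * 5)


def guarded_reply(
    user_input: str,
    generate: Callable[[str], str],
    score: Callable[[str], float] = toy_toxicity_score,
    threshold: float = 0.5,  # assumed cut-off, tune per application
) -> str:
    # Input filtering: refuse before the model is ever called.
    if score(user_input) >= threshold:
        return "[blocked: input flagged as toxic]"
    reply = generate(user_input)
    # Output monitoring: check the model's answer before it reaches the user.
    if score(reply) >= threshold:
        return "[blocked: output flagged as toxic]"
    return reply


print(guarded_reply("What is Python?", lambda prompt: "A programming language."))
```

Passing the scorer in as a parameter keeps the guardrail logic independent of whichever moderation tool is chosen.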

For toxicity evaluation, PyCharm’s Hugging Face Insert HF Model feature helps you discover classifiers like s-nlp/roberta_toxicity_classifier directly in the IDE. Hovering over the model name reveals its model card, where you can see it was trained on the Jigsaw toxic comment datasets, helping you understand what the model can and can’t detect before you write a single line of evaluation code. 

[Screenshot: Opening the Hugging Face model browser in PyCharm from the Code menu, then selecting Insert HF Model.]
[Screenshot: Searching for a specific toxicity model and selecting one. Use Model inserts a ready-to-use code snippet into the editor.]
[Screenshot: A ready-to-use code snippet of the roberta_toxicity_classifier is inserted into the editor.]
[Screenshot: Hovering over the roberta_toxicity_classifier in the code to preview its model card within PyCharm.]

Frameworks for LLM evaluation

Frameworks for LLM evaluation have changed the game; teams don’t have to rely on manual reviews, gut instinct, and subjective judgment to assess model quality. These frameworks automate the measurement of model quality using standardized, quantifiable metrics. They assign numerical scores to outputs that measure faithfulness, relevancy, toxicity, and other important dimensions. This automation results in reproducibility, speed, and objectivity. 

Consequently, the same input always produces the same score; evaluation runs 10–100 times faster, so in minutes instead of days; and there are no more debates on the quality of the output. Some of these frameworks include DeepEval and Retrieval Augmented Generation Assessment (Ragas). DeepEval is an open-source evaluation framework built with seven principles in mind, such as the ability to easily “unit test” LLM outputs in a similar way to Pytest and plug in and use over 50 LLM-evaluated metrics, most of which are backed by research and all of which are multimodal. 

DeepEval supports two modes of evaluation, namely end-to-end LLM evals and component-level LLM evals, which makes it easy to build and iterate on LLM applications. It is used for comprehensive testing across RAG, agents, and chatbots. Ragas is a framework for reference-free evaluation of RAG pipelines. RAG systems are challenging to evaluate because there are several dimensions to consider, such as the ability of the retrieval system to identify relevant and focused context passages and the capability of the LLM to exploit such passages faithfully. Ragas provides a suite of metrics for evaluating these dimensions without relying on ground-truth human annotations.

The limits of static prompt evaluation

Traditional LLM evaluation methods are useful for single prompt-response pairs, measuring output quality, RAG systems with straightforward retrieval, and static evaluation with fixed inputs. But they are limited for multi-step agents because LLM evaluation focuses on the final output quality, not the decision-making process that produced it. Multi-step agents exhibit a different kind of complexity, as they chain multiple decisions.

Why traditional LLM evaluation isn’t enough for agents 

Agents operate independently within complex workflows, and this independence can introduce challenges such as deviation from expected behavior, errors in production, and more failure points than in traditional software applications. Hence, an agent can perform well in testing but fail in production. Traditional LLM evaluations don’t have the capacity to test such use cases. Testing is usually done in a controlled environment with limited scenarios, but production involves real users, edge cases, unpredictable inputs, and scale. This means that agents can make decisions that are not seen in testing, and in production, tasks could be completed, though incorrectly, without generating an error signal. This is where advanced evaluation and monitoring practices come to the rescue! They provide the visibility and systematic measurement needed to deploy agents confidently, rather than relying on trial and error.

The complexity of agent behavior

Traditional LLM evaluation measures single prompt-response pairs: provide an input prompt, receive an output response, and measure quality through metrics such as accuracy, relevance, and faithfulness. Because AI agents are complex and rely on non-deterministic, multi-step reasoning, they cannot be reliably evaluated using traditional evaluation metrics.

Agent behavior is complex, and this complexity introduces challenges. Agents operate in dynamic environments where APIs might be down, databases change between queries, and the “right” answer depends on current conditions. They can use external tools and APIs to complete tasks, and may either use the wrong tool or use the right tool with the wrong parameters or input type. Their internal reasoning traces remain hidden unless they are logged explicitly, so it might be challenging to determine whether an agent was successful through logic or chance. An agent’s output could be perfectly correct despite poor internal decisions, or the entire task could fail despite correct step execution.

This is where observability tooling becomes essential. PyCharm’s AI Agents Debugger breaks open the black box of agentic systems, letting you trace LangGraph workflows and inspect each agent node’s inputs, outputs, and reasoning directly in the IDE, with zero extra code. Just install the plugin, run your agent, and the debugger automatically captures execution traces. Click the Graph button to visualize the full workflow, making it easy to spot where an agent chose the wrong tool, passed bad parameters, or succeeded by luck rather than logic.

To see this in action, I built a simple travel-planning agent using LangGraph in two steps: a research node that suggests summer destinations based on my preferences, and a plan node that picks the best option and builds a three-day itinerary. With the AI Agents Debugger, you can trace exactly what information flowed between these two steps – what the research node suggested and how the planner used those suggestions to build the final itinerary.

[Screenshot: The AI Agents Debugger shows how the agent moves from initialization to the research stage, displaying the data passed in and out, and the LLM call used to generate the research results.]
[Screenshot: The AI Agents Debugger shows how the planning step processes inputs and produces outputs, using an LLM call to construct the final travel itinerary.]
[Screenshot: The Graph view provides a high-level overview of the agent’s workflow, mapping how it progresses from the initial step through research and planning to the final result.]

Advanced agent evaluation metrics

The complexity of AI agents demands evaluation that goes beyond considering the final output quality, that is, measuring whether it is accurate, relevant, and grounded. Specialized agent evaluation assesses the complete decision-making process, including the planning logic, tool selection, parameter construction, reasoning coherence, and resource efficiency that led to the final output. Hence, the advanced agent evaluation metrics are designed to make such a process visible and measurable. Some of them are task completion rate, tool usage, reasoning quality, efficiency, and error handling.

Task completion rate

Task completion rate measures the percentage of tasks where an agent successfully achieves the end goal. It is calculated as the number of completed tasks divided by the total number of tasks attempted. What counts as “completed” differs by use case, so it helps to look at real-world examples. Let’s start with a basic use case. Consider a customer service agent handling a specific food delivery order: “Where is my order #0001? It has not been delivered to me.” Completion here means successfully looking up the order ID, retrieving the tracking information, and providing an accurate delivery estimate, so all three steps must succeed. If the agent retrieves the wrong order or fails to access the tracking system, that is a failed task, even if the response looks the same to the user.

Next, let us look at a medium-complexity use case, sequential API calls. Consider an agent tasked with creating a Jira support ticket and notifying the relevant team in Slack. The agent calls the Jira API to create a ticket, parses the response to get the ticket ID, calls the Slack API with the ticket link, and finally verifies the success of both. If the agent successfully creates the Jira ticket, but the Slack notification fails, that is considered a failed task even if the ticket exists in Jira, since the team wasn’t notified. 

Finally, let’s examine a high-complexity use case: An agent is given the task of completing an online purchase, which means it must handle everything from checkout to order confirmation. Six steps are involved: Verify the item is still in stock, process the payment with a credit or debit card, reserve or decrement inventory, create an order record, generate an order confirmation number, and send a confirmation email to the customer. If the agent successfully charges the customer’s card but the confirmation email fails to send, that’s a failed task, even if the payment was processed and the order was created. In such a situation, the customer has no proof of purchase, so they will likely contact support or attempt to purchase again.
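
The three use cases share one rule: a task counts as completed only if every required step succeeds. Here is a minimal sketch of that calculation; the `TaskRun` type, the step names, and the sample runs are hypothetical and only illustrate the formula (completed tasks divided by attempted tasks).

```python
from dataclasses import dataclass


@dataclass
class TaskRun:
    name: str
    steps: dict[str, bool]  # step name -> whether that step succeeded

    @property
    def completed(self) -> bool:
        # One failed step fails the whole task.
        return all(self.steps.values())


def task_completion_rate(runs: list[TaskRun]) -> float:
    """Completed tasks divided by attempted tasks (0.0 if nothing was attempted)."""
    if not runs:
        return 0.0
    return sum(run.completed for run in runs) / len(runs)


runs = [
    TaskRun("order lookup", {"find_order": True, "get_tracking": True, "give_estimate": True}),
    TaskRun("ticket and notify", {"create_jira_ticket": True, "notify_slack": False}),
    TaskRun("purchase", {"check_stock": True, "charge_card": True, "send_email": False}),
    TaskRun("order lookup", {"find_order": True, "get_tracking": True, "give_estimate": True}),
]
print(task_completion_rate(runs))  # 0.5
```

Recording per-step results rather than a single pass/fail flag also shows which step caused a task to fail.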

Tool usage correctness

Tool usage correctness assesses whether an agent correctly identifies and invokes the relevant tools and APIs. It is a deterministic measure that is assessed using techniques such as LLM as a judge, like most LLM evaluation metrics. It has three dimensions: 

Hence, it is important for reliability and functional correctness. 
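
One way to make tool usage measurable is to compare the calls an agent actually made against the calls a test case expects. The sketch below is a simplified, deterministic check covering only tool choice and parameters. The `ToolCall` type, the `call` helper, and the tool names are hypothetical, and this is not the scoring used by any particular framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    name: str
    params: tuple = ()  # sorted (key, value) pairs, so calls compare by value


def call(name, **params):
    """Build a ToolCall from keyword arguments."""
    return ToolCall(name, tuple(sorted(params.items())))


def tool_correctness(expected, actual):
    """Return (share of expected tools that were used,
    share of expected calls made with exactly the expected parameters)."""
    if not expected:
        return 1.0, 1.0
    used_names = {c.name for c in actual}
    right_tool = sum(1 for c in expected if c.name in used_names)
    right_call = sum(1 for c in expected if c in actual)
    return right_tool / len(expected), right_call / len(expected)


expected = [call("jira_create_ticket", project="SUP"), call("slack_post", channel="#support")]
actual = [call("jira_create_ticket", project="SUP"), call("slack_post", channel="#general")]
selection_score, exact_score = tool_correctness(expected, actual)
print(selection_score, exact_score)  # 1.0 0.5
```

Here the agent picked the right tools both times but sent the Slack message to the wrong channel, so only the second score drops.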

Step-by-step reasoning accuracy

In real-world use cases, an LLM agent’s reasoning is shaped by much more than just the model itself. Modern frameworks such as LangChain expose the agent’s internal “thoughts” through structured logging of intermediate reasoning steps. This is done using the ReAct (Reasoning and Acting) pattern, which involves the agent thinking about what to do, using a tool, observing the tool result, and then repeating until the task is complete. Each “thought” is logged as text, which creates a complete trace of the reasoning process from initial query to final answer. These traces can be extracted programmatically and evaluated to assess whether the agent’s logic is sound even when the final output appears correct. Evaluating planning steps involves assessing aspects such as the overall approach’s logic, the ordering of steps, and whether any steps are unnecessary or redundant. Evaluating execution assesses whether the implementation worked, such as whether tools were called with correct parameters, whether each step was completed successfully, whether errors were handled appropriately, and whether the output was interpreted correctly. This can be done seamlessly in PyCharm using the AI Agents Debugger.
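
Once reasoning traces are logged, simple checks can run over them programmatically. As a minimal sketch, the code below assumes a trace has already been extracted into a list of hypothetical `Step` records (thought, action, action input, observation). It flags steps that repeat an earlier tool call with the same input, which is one possible signal of redundant steps. The record layout is an assumption for illustration and does not mirror LangChain's own trace format.

```python
from dataclasses import dataclass


@dataclass
class Step:
    thought: str
    action: str        # name of the tool the agent chose
    action_input: str  # what it passed to the tool
    observation: str   # what the tool returned


def find_redundant_steps(trace):
    """Return indexes of steps that repeat an earlier (action, input) pair."""
    seen = set()
    redundant = []
    for index, step in enumerate(trace):
        key = (step.action, step.action_input)
        if key in seen:
            redundant.append(index)
        seen.add(key)
    return redundant


trace = [
    Step("I need the order status.", "lookup_order", "0001", "in transit"),
    Step("Now I need the courier's ETA.", "get_tracking", "0001", "arrives 18:00"),
    Step("Let me check the order again.", "lookup_order", "0001", "in transit"),
]
print(find_redundant_steps(trace))  # [2]
```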

Groundedness (faithfulness)

Groundedness, also known as faithfulness, is the most critical metric for retrieval-augmented generation (RAG), which is a common component of agentic applications. It assesses whether the agent’s response is actually supported by the retrieved source documents or whether, instead, the model hallucinated information. Different evaluation techniques include:

AI observability and why it matters

AI observability is about visibility into what the agent is doing. This covers recording everything that happens when a task is executed, including the agent’s reasoning at each step, which tools were called with what parameters, what data was retrieved, and how decisions were made from start to finish. With such a transparent system where every decision can be logged and traced, teams are able to understand why an agent fails, behaves unexpectedly, or becomes expensive to run because issues can be debugged and behavior can be audited. Consequently, system design improves, and guesswork is eliminated.

Definition of AI observability

AI observability is the real-time monitoring of agent actions, thoughts, and environmental interactions: what went in, what came out, how the agent thought through the problem, and which tools, APIs, and data were used. AI observability builds on the three pillars of DevOps observability – that is, metrics, logs, and traces – but extends each one for AI’s unique needs. DevOps metrics track CPU and latency, while AI metrics track token usage and cost per interaction. DevOps logs capture system errors, while AI logs capture reasoning traces and decision points. DevOps traces follow requests through services, while AI traces follow reasoning through agent steps, tool calls, and observations.
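
To make the metrics/logs/traces idea concrete, here is a small, framework-free sketch of a trace recorder. The `Span` and `Trace` classes are hypothetical, and the token counts and outputs are made-up placeholders standing in for real LLM calls. Dedicated observability tools capture far more, but the shape of the data is similar: one record per step with inputs, outputs, token usage, and duration.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field


@dataclass
class Span:
    step: str
    inputs: dict
    outputs: dict = field(default_factory=dict)
    tokens: int = 0
    duration_s: float = 0.0


class Trace:
    def __init__(self):
        self.spans = []

    @contextmanager
    def span(self, step, **inputs):
        record = Span(step=step, inputs=inputs)
        start = time.perf_counter()
        try:
            yield record
        finally:
            # Record the span even if the step raised, so failures stay visible.
            record.duration_s = time.perf_counter() - start
            self.spans.append(record)

    def total_tokens(self):
        return sum(span.tokens for span in self.spans)


trace = Trace()
with trace.span("research", preferences="beach, summer") as span:
    span.outputs["destinations"] = ["Lisbon", "Split"]  # placeholder for an LLM call
    span.tokens = 420
with trace.span("plan", destinations=["Lisbon", "Split"]) as span:
    span.outputs["itinerary"] = "three-day plan"  # placeholder for an LLM call
    span.tokens = 610

print([s.step for s in trace.spans], trace.total_tokens())
```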

Benefits for agent monitoring

Agent monitoring has immense benefits – here are some of the most important:

Popular tools for agent monitoring

Several frameworks and platforms have emerged to provide built-in observability for AI agents, with each having different strengths and integration approaches and matching different features and requirements. The choice of the right tool depends on the framework, deployment preferences, and primary needs. The table below shows some popular tools and whether they match different features and requirements.

Tool | Traces agent steps? | Tracks costs? | Detects regressions? | Self-hostable? | Open source? | Easy integration?
Helicone | Yes | Yes | Yes | Yes | Yes | Yes
LangSmith | Yes | Yes | Yes | Limited | No | Yes
LangFuse | Yes | Yes | Yes | Yes | Yes | Moderate
OpenLLMetry | Yes | Limited | Limited | Yes | Yes | Moderate
Phoenix | Yes | Limited | Yes | Yes | Yes | Moderate
TruLens | Yes | Limited | Yes | Yes | Yes | Moderate
DataDog | Limited | Yes | Yes | No | No | Moderate

Best practices for evaluating agents in production

Evaluation does not end after deployment; rather, it intensifies. Continuous evaluation tracks how much the system costs to run, how quickly it responds under various loads, and how it handles errors or unusual inputs. Without such evaluation, problems can only be identified after users are affected. An agent can pass all the quality checks with excellent faithfulness scores, high completion rates, and strong reasoning but still fail in production if costs spiral, latency increases, or edge cases cause instability. Hence, there is a critical need for ongoing evaluation and monitoring, which leads to systems that are reliable, scalable, and financially sustainable.

Monitor cost and latency

Monitoring cost and latency is critical for production sustainability. Token usage and response time must be tracked continuously because small inefficiencies compound dramatically over time, and the cost per token of the powerful reasoning models used for agents can be high. Production workloads require cost and latency monitoring to identify problems before user experience and budget are impacted. Cost monitoring tracks token usage at different levels, such as per request, per query type, and over time. Without visibility into these usage patterns, teams end up discovering cost problems through surprise bills. With monitoring, they can proactively cache common queries and optimize prompts to reduce token use. Latency monitoring tracks response time and per-component breakdowns to identify bottlenecks.

Cost control in production workloads is important because production costs can spiral quickly, unmonitored systems can exceed budgets, and latency impacts user experience and retention.
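
As a rough sketch of per-request and per-query-type tracking, the code below aggregates cost by query type and computes a latency percentile. The prices in `PRICE_PER_1K`, the request records, and the query type names are invented for illustration, since real prices vary by model and provider.

```python
import math
from collections import defaultdict

# Hypothetical prices in USD per 1,000 tokens.
PRICE_PER_1K = {"input": 0.003, "output": 0.015}


def request_cost(input_tokens, output_tokens):
    return (
        input_tokens / 1000 * PRICE_PER_1K["input"]
        + output_tokens / 1000 * PRICE_PER_1K["output"]
    )


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


requests = [
    {"query_type": "order_status", "input_tokens": 1000, "output_tokens": 200, "latency_s": 1.2},
    {"query_type": "order_status", "input_tokens": 2000, "output_tokens": 400, "latency_s": 1.5},
    {"query_type": "refund", "input_tokens": 4000, "output_tokens": 1000, "latency_s": 4.8},
    {"query_type": "order_status", "input_tokens": 1000, "output_tokens": 200, "latency_s": 0.9},
]

cost_by_type = defaultdict(float)
for r in requests:
    cost_by_type[r["query_type"]] += request_cost(r["input_tokens"], r["output_tokens"])

p95_latency = percentile([r["latency_s"] for r in requests], 95)
print(dict(cost_by_type), p95_latency)
```

In this made-up sample, a single refund request costs more than three order-status requests combined and also dominates the tail latency. That is the kind of pattern per-query-type tracking is meant to surface.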

Combine offline and online evaluation

Effective agent evaluation requires combining offline and online evaluation, where each addresses gaps the other leaves. Offline evaluation uses fixed test datasets for reproducible benchmarking, which enables fast iteration on prompts and models in controlled environments without production risk. Online evaluation monitors real user interactions in production, which reveals edge cases that were never anticipated in testing; it draws on real-time feedback, user data, and observability tools. Combining the two gives an optimal strategy: offline evaluation validates changes before deployment, then online evaluation monitors production reality.
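
The offline half of this strategy can be as simple as a regression gate over a fixed test set. The following sketch assumes exact-match scoring, which is a simplification, and the dataset, `toy_agent`, and the baseline score are all hypothetical.

```python
def offline_eval(agent, dataset):
    """Share of fixed test cases where the agent's answer matches exactly."""
    passed = sum(1 for case in dataset if agent(case["input"]) == case["expected"])
    return passed / len(dataset)


def passes_gate(candidate_score, baseline_score, tolerance=0.0):
    """Allow deployment only if the candidate does not regress beyond tolerance."""
    return candidate_score >= baseline_score - tolerance


# Hypothetical fixed dataset and toy "agent", for illustration only.
DATASET = [
    {"input": "status of order 1", "expected": "shipped"},
    {"input": "status of order 2", "expected": "delivered"},
    {"input": "status of order 3", "expected": "processing"},
    {"input": "status of order 4", "expected": "shipped"},
]
ORDERS = {"1": "shipped", "2": "delivered", "3": "cancelled", "4": "shipped"}


def toy_agent(prompt):
    return ORDERS.get(prompt.split()[-1], "unknown")


score = offline_eval(toy_agent, DATASET)  # 3 of 4 cases match
print(score, passes_gate(score, baseline_score=0.75))  # 0.75 True
```

Because the dataset is fixed, the same check can run on every prompt or model change, while online monitoring covers the inputs the dataset never anticipated.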

Use human-in-the-loop when necessary

LLM agents have proven valuable across many domains, but not every agent should run autonomously, since agents can misinterpret prompts, cross boundaries, or make serious errors that can’t be caught by automation alone. Hence the need for human-in-the-loop failsafes. Human-in-the-loop is also essential during initial setup: Unless teams already have domain-specific evaluation datasets for monitoring the agent, these will need to be created manually by assessing the agent’s performance. A hybrid approach is required when critical decisions need human validation, such as approving transactions, modifying sensitive data, or triggering irreversible workflows. In this approach, such decisions are routed through a human checkpoint before proceeding. The intention is not to slow automation but rather to ensure that the right decisions get the right oversight. A well-designed human-in-the-loop system delivers compound returns over time. Every human correction becomes feedback, which improves the agent’s accuracy and gradually reduces the need for manual review. Human oversight isn’t treated as a failure but rather as a safety net that makes the system better with use.
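
A human checkpoint can be sketched as a small routing function in front of the agent's actions. The action names in `CRITICAL_ACTIONS` and the `run` and `ask_human` callables are hypothetical. In a real system `ask_human` would open a review task rather than return immediately.

```python
# Hypothetical set of action types that must not run without human approval.
CRITICAL_ACTIONS = {"approve_transaction", "modify_sensitive_data", "irreversible_workflow"}


def execute_action(action, payload, run, ask_human):
    """Run routine actions directly; route critical ones through a human first."""
    if action in CRITICAL_ACTIONS and not ask_human(action, payload):
        return {"status": "rejected", "action": action}
    return {"status": "done", "action": action, "result": run(action, payload)}


def run(action, payload):
    return f"{action} executed"  # placeholder for the real side effect


routine = execute_action("send_reminder", {}, run, ask_human=lambda a, p: False)
blocked = execute_action("approve_transaction", {"amount": 900}, run, ask_human=lambda a, p: False)
approved = execute_action("approve_transaction", {"amount": 900}, run, ask_human=lambda a, p: True)
print(routine["status"], blocked["status"], approved["status"])  # done rejected done
```

Only the critical path pays the cost of review, which matches the goal of adding oversight without slowing routine automation.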

Final thoughts

Fundamentally, AI agents are different from single-prompt LLMs. They navigate multi-step workflows, make autonomous decisions, and use external tools, which introduces complexities that demand continuous evaluation, not just static testing. Evaluation must evolve from pre-deployment checkpoints to ongoing monitoring. Production-ready agents aren’t just well-tested; they’re continuously observed and improved based on real behavior. LLM evaluation and AI observability enable faster, safer iteration by catching issues early and feeding production insights back into development.

PyCharm streamlines agent development with integrated debugging, profiling, and testing. Step through reasoning with breakpoints, find cost bottlenecks, and iterate on evaluation tests rapidly. These workflows transform hours of debugging into minutes of systematic investigation. Explore PyCharm for AI development to see how integrated tools can help you build, evaluate, and deploy reliable AI agents.

About the author

Naa Ashiorkor

Naa Ashiorkor is a data scientist and tech community builder. She is deeply involved in the Python community and serves as an organizer for various conferences, including EuroPython. She is currently building PyLadies Tampere.

May 19, 2026 09:46 AM UTC

May 18, 2026


Kay Hayen

Nuitka Release 4.1

This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler, “download now”.

This release adds many new features and corrections, with a focus on async code compatibility, missing generics features, Python 3.14 compatibility, and, yet again, Python compilation scalability.

Bug Fixes

Package Support

New Features

Optimization

Anti-Bloat

Organizational

Tests

Cleanups

Summary

This release builds on the scalability improvements established in 4.0, with enhanced Python 3.14 support, expanded package compatibility, and significant optimization work.

The --project option seems usable now.

Python 3.14 support remains experimental; it only barely missed the cut and will probably get there in hotfixes. Some of the corrections came in so late before the release that it was just not possible to feel good about declaring it fully supported just yet.

May 18, 2026 10:00 PM UTC


Ari Lamstein

How Remote Work Has Grown — and Shrunk — Since Covid

Remote work surged during Covid — and while it has declined since, it’s still far above pre‑pandemic levels. I just updated my Covid Demographics Explorer with the latest ACS data, and the national trend is striking:

Remote work more than tripled between 2019 and 2021, rising to nearly 28 million people at the height of the pandemic. Since then it has edged down each year, but only modestly. Even today, at about 22 million, it remains roughly 2.5 times the pre‑Covid level.

The app now lets you generate this same graph for every state, as well as for counties and cities with populations of at least 65,000. See how the trend looks where you live.

Exploring Local Trends

I also added a “Compare Years” tab that lets you see which locations saw the biggest change in remote work between any two years. The national trend tells one story, but the local data tells another: the rise and fall of remote work played out very unevenly across the country. Below I run this analysis twice: first for the national increase from 2019-2021, and then for the gradual decline between 2021 and 2024.

The Remote Work Spike: 2019-2021

Between 2019 and 2021, the location that increased the number of remote workers the most was Sunnyvale, California. The number of remote workers there increased almost 11x in two years, from an estimated 3,235 to 38,319. Sunnyvale is in the heart of Silicon Valley, and tech companies were among the fastest to adopt remote work, which helps explain this result:

The scatterplot also shows the broader pattern: most locations cluster between a 150% and 300% increase in remote work during this period. That makes Sunnyvale’s nearly 1,100% jump stand out even more — it’s an order of magnitude beyond the national norm.
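
For readers who want to check the arithmetic, the percentage figures here follow the usual percent-change formula. The helper name `percent_change` is my own; the two inputs are the Sunnyvale estimates quoted above.

```python
def percent_change(old, new):
    """Change from old to new, expressed as a percentage of old."""
    return (new - old) / old * 100


sunnyvale = percent_change(3235, 38319)
print(f"{sunnyvale:.1f}%")  # 1084.5%
```

An increase of roughly 1,085% means the 2021 estimate is about 11.8 times the 2019 estimate.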

Interestingly, only one location in the entire dataset saw a decrease in remote work during this period: Rice County, Minnesota (-7.5%). It’s the lone point below zero on the chart, and I don’t have a clear explanation for it.

The Remote Work Decline: 2021-2024

When we run this same analysis for 2021–2024, we see a very different result: Sunnyvale’s remote workforce shrank by 67.2%, the largest drop in the dataset. This means that Sunnyvale saw both the largest increase between 2019 and 2021 and the largest decrease between 2021 and 2024:

The scatterplot also shows how different the overall pattern is in this period. Instead of large increases, most locations cluster between a 10% and 30% decline in remote work — a sharp contrast with the 2019–2021 graph, where nearly every location saw a substantial increase.

Against this backdrop, Sunnyvale’s 67% drop stands out as an outlier. The likely explanation is the wave of return‑to‑office mandates that swept through the tech industry during this period. The two other largest decreases also happened in Silicon Valley: the city of Fremont (–61%) and Santa Clara County (–56%).

At the other end of the distribution, the few places that saw increases tend to be warm‑weather, high‑amenity destinations: Marion County, Florida (69%), Collier County, Florida (65%), and Maui County, Hawaii (57%) saw the largest gains. These increases may reflect people with remote‑work jobs relocating to places with natural beauty and a high quality of life — a very different dynamic from the employer‑driven declines we see in Silicon Valley.

Conclusion

Three years after the peak, roughly 22 million Americans still work from home — more than double the pre-pandemic baseline. But the story is more complex than a single national number: a dramatic surge, an uneven retreat, and striking differences across the country. How does your corner of the country fit in?

The new version of the Covid Demographics Explorer makes it easy to explore these patterns yourself. In addition to remote‑work trends, you can examine changes in population, median household income, median rent, and public assistance. Analyze your own location.

This app was built in Python with the Streamlit framework. I teach Streamlit for O’Reilly — and if you’d like to learn to build apps like this yourself, I offer a free 7-day email course. Sign up in the form below.

May 18, 2026 08:00 PM UTC


Real Python

Python Built-in Functions: A Complete Guide

Python’s built-in functions are predefined functions you can use anywhere in your code without any imports. They handle common tasks across math, data type creation, iterable processing, and input and output. Knowing which ones to reach for makes your code shorter and more Pythonic.

In this tutorial, you’ll:

  • Recognize Python’s built-in functions and the built-in scope they live in
  • Use the right built-in for math, data types, iterables, and I/O tasks
  • Tell apart true functions and classes that look like functions
  • Apply built-ins to solve practical problems without reinventing the wheel

To get the most out of this tutorial, you’ll need to be familiar with Python programming, including topics like working with built-in data types, functions, classes, decorators, scopes, and the import system.

Built-in Functions in Python

Python has several functions available for you to use directly from anywhere in your code. These functions are known as built-in functions and they cover many common programming problems, from mathematical computations to Python-specific features.

Note: All these functions live in the builtins module, which Python loads at startup and exposes through the built-in scope, so you can use them anywhere without importing the module. Importing the module explicitly is useful if you know that you’ll shadow a built-in name with one of your own variables or functions. Doing so keeps the original within reach as builtins.name.
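
As a small illustration of the note above, the snippet below deliberately shadows `round()` with a hypothetical half-up helper and then reaches the original through the `builtins` module. The helper is only for demonstration and assumes non-negative inputs.

```python
import builtins


def round(number):
    """Hypothetical project helper that shadows the built-in: rounds half up."""
    return int(number + 0.5)  # assumes number >= 0


half_up = round(2.5)            # calls the shadowing function
original = builtins.round(2.5)  # the built-in rounds halves to the nearest even integer
print(half_up, original)  # 3 2
```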

Among these built-ins, you’ll also find classes with function-style names like str, tuple, list, and dict, which define built-in data types. These classes are listed in the Python documentation as built-in functions, so they’re covered in this tutorial too.

In this tutorial, you’ll learn the basics of Python’s built-in functions. By the end, you’ll know what their use cases are and how they work. You’ll start with the built-in functions for math computations.

In Python, you’ll find a few built-in functions that take care of common math operations, like computing the absolute value of a number, calculating powers, and more. Here’s a summary of the math-related built-in functions in Python:

Function | Description
abs() | Calculates the absolute value of a number
divmod() | Computes the quotient and remainder of integer division
max() | Finds the largest of the given arguments or items in an iterable
min() | Finds the smallest of the given arguments or items in an iterable
pow() | Raises a number to a power
round() | Rounds a floating-point value
sum() | Sums the values in an iterable

In the following sections, you’ll learn how these functions work and how to use them in your Python code.
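
As a quick preview, here is a short sketch that calls each function from the table once. The values are arbitrary, and the three-argument form of `pow()` (modular exponentiation) goes slightly beyond what the table mentions.

```python
quotient, remainder = divmod(17, 5)  # 17 == 5 * 3 + 2
power = pow(2, 10)                   # 2 raised to the 10th power
modular_power = pow(2, 10, 1000)     # (2 ** 10) % 1000
rounded = round(3.14159, 2)          # keep two decimal places
distance = abs(-7.5)

scores = [70, 85, 92]
highest, lowest, total = max(scores), min(scores), sum(scores)

print(quotient, remainder, power, modular_power, rounded, distance)
# 3 2 1024 24 3.14 7.5
print(highest, lowest, total)
# 92 70 247
```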

Getting the Absolute Value of a Number: abs()

The absolute value or modulus of a real number is its non-negative value. In other words, the absolute value is the number without its sign. For example, the absolute value of -5 is 5, and the absolute value of 5 is also 5.

Note: To learn more about abs(), check out the How to Find an Absolute Value in Python tutorial.

Python’s built-in abs() function allows you to quickly compute the absolute value of a number. Here’s its signature:

Language: Python Syntax
abs(number)

The number argument can be any numeric value, including integers, floating-point numbers, complex numbers, fractions, and decimals. Take a look at a few examples:

Language: Python
>>> from decimal import Decimal
>>> from fractions import Fraction

>>> abs(-42)
42
>>> abs(42)
42

>>> abs(-42.42)
42.42
>>> abs(42.42)
42.42

>>> abs(complex("-2+3j"))
3.605551275463989
>>> abs(complex("2+3j"))
3.605551275463989

>>> abs(Fraction("-1/2"))
Fraction(1, 2)
>>> abs(Fraction("1/2"))
Fraction(1, 2)

>>> abs(Decimal("-0.5"))
Decimal('0.5')
>>> abs(Decimal("0.5"))
Decimal('0.5')

In these examples, you compute the absolute value of different numeric types using the abs() function. First, you use integer numbers, then floating-point and complex numbers, and finally, fractional and decimal numbers. In all cases, when you call the function with a negative value, the final result removes the sign.

For a practical example, say that you need to compute the total profits and losses of your company from a month’s transactions:

Language: Python
>>> transactions = [-200, 300, -100, 500]

>>> incomes = sum(income for income in transactions if income > 0)
>>> expenses = abs(
...     sum(expense for expense in transactions if expense < 0)
... )

>>> print(f"Total incomes: ${incomes}")
Total incomes: $800
>>> print(f"Total expenses: ${expenses}")
Total expenses: $300
>>> print(f"Total profit: ${incomes - expenses}")
Total profit: $500

Read the full article at https://realpython.com/python-built-in-functions/ »


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

May 18, 2026 02:00 PM UTC


Python Bytes

#480 Proud Parents

<strong>Topics covered in this episode:</strong><br> <ul> <li><strong><a href="https://www.better-simple.com/django/2026/05/06/using-django-tasks-in-production/?featured_on=pythonbytes">Using Django Tasks in production</a></strong></li> <li><strong>Co-authored with Claude?</strong></li> <li><strong><a href="https://rushter.com/blog/pypi-packages/?featured_on=pythonbytes">PyPI packages are increasing rapidly</a></strong></li> <li><strong><a href="https://tildeweb.nl/~michiel/httpx2.html?featured_on=pythonbytes">httpx2</a></strong></li> <li><strong>Extras</strong></li> <li><strong>Joke</strong></li> </ul><a href='https://www.youtube.com/watch?v=-x1R3S72gCU' style='font-weight: bold;'data-umami-event="Livestream-Past" data-umami-event-episode="480">Watch on YouTube</a><br> <p><strong>About the show</strong></p> <p>Sponsored by us! Support our work through:</p> <ul> <li>Our <a href="https://training.talkpython.fm/?featured_on=pythonbytes"><strong>courses at Talk Python Training</strong></a></li> <li><a href="https://courses.pythontest.com/p/the-complete-pytest-course?featured_on=pythonbytes"><strong>The Complete pytest Course</strong></a></li> <li><a href="https://www.patreon.com/pythonbytes"><strong>Patreon Supporters</strong></a> <strong>Connect with the hosts</strong></li> <li>Michael: <a href="https://fosstodon.org/@mkennedy">@mkennedy@fosstodon.org</a> / <a href="https://bsky.app/profile/mkennedy.codes?featured_on=pythonbytes">@mkennedy.codes</a> (bsky)</li> <li>Brian: <a href="https://fosstodon.org/@brianokken">@brianokken@fosstodon.org</a> / <a href="https://bsky.app/profile/brianokken.bsky.social?featured_on=pythonbytes">@brianokken.bsky.social</a></li> <li>Show: <a href="https://fosstodon.org/@pythonbytes">@pythonbytes@fosstodon.org</a> / <a href="https://bsky.app/profile/pythonbytes.fm">@pythonbytes.fm</a> (bsky) Join us on YouTube at <a href="https://pythonbytes.fm/stream/live"><strong>pythonbytes.fm/live</strong></a> to be part of the audience. 
Usually <strong>Monday</strong> at 11am PT. Older video versions available there too. Finally, if you want an artisanal, hand-crafted digest of every week of the show notes in email form? Add your name and email to <a href="https://pythonbytes.fm/friends-of-the-show">our friends of the show list</a>, we'll never share it.</li> </ul> <p><strong>Brian #1: <a href="https://www.better-simple.com/django/2026/05/06/using-django-tasks-in-production/?featured_on=pythonbytes">Using Django Tasks in production</a></strong></p> <ul> <li>Tim Schilling shares how the Djangonaut Space website has been using Django’s new tasks framework and some of the info missing from the official Django docs.</li> <li>Tasks require a third party package, <a href="https://github.com/RealOrangeOne/django-tasks-db?featured_on=pythonbytes"><code>django-tasks-db</code></a> to actually run the tasks.</li> <li>Article walks through all changes necessary to get an email process running to notify admins of new testimonials. Cool simple example.</li> <li>With the db backend, you can monitor progress of tasks in the admin, to see which tasks are scheduled, completed, or have errors.</li> <li>Some wishes for the community to implement <ul> <li>new tutorial in the Django docs</li> <li>Django Debug toolbar panel for tasks</li> <li>test/mock backend</li> </ul></li> <li>Great title for wish list: Things I’d like to see, but I’m too lazy to implement myself.</li> </ul> <p><strong>Michael #2: Co-authored with Claude?</strong></p> <ul> <li>Via Nik T.</li> <li>We don’t put “executed on macOS”, “edited with PyCharm”, etc. in our commits. 
Why Claude?</li> <li>Seems like a growth hack to me, that I don’t really care to participate in.</li> <li>Some projects that have formalized their thoughts on this: <a href="https://redmonk.com/kholterhoff/2026/02/26/generative-ai-policy-landscape-in-open-source/?featured_on=pythonbytes">The Generative AI Policy Landscape in Open Source</a></li> <li>Adjust to turn off in <code>~/.claude/settings.json</code> see <a href="https://code.claude.com/docs/en/settings#attribution-settings">the docs</a>. <div class="codehilite"> <pre><span></span><code><span class="p">{</span> <span class="w"> </span><span class="nt">&quot;attribution&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">{</span> <span class="w"> </span><span class="nt">&quot;commit&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;&quot;</span><span class="p">,</span> <span class="w"> </span><span class="nt">&quot;pr&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;&quot;</span> <span class="w"> </span><span class="p">}</span> <span class="p">}</span> </code></pre> </div></li> </ul> <p><strong>Brian #3: <a href="https://rushter.com/blog/pypi-packages/?featured_on=pythonbytes">PyPI packages are increasing rapidly</a></strong></p> <ul> <li>Artem Golubin</li> <li>There’s been an increase of published packages per week on PyPI</li> <li>A pretty big increase in the last handful of months.</li> <li>30% increase since 2025, clearly due to AI</li> <li>Artem is building <a href="https://github.com/rushter/hexora?featured_on=pythonbytes">hexora</a>, a malicious Python code detector.</li> <li>Cool package too, it can: <ul> <li>Audit project dependencies to catch potential supply-chain attacks</li> <li>Detect malicious scripts found on platforms like Pastebin, GitHub, or open directories</li> <li>Analyze IoC files from past security incidents</li> <li>Audit new packages uploaded to PyPi.</li> </ul></li> <li>Artem is using hexora 
to analyze recently published PyPI packages and many are obviously vibecoded and trigger false positives for abuses of <code>eval</code>, <code>exec</code>, and <code>subprocess</code> <ul> <li>Side note: I don’t think that’s necessarily a false positive. Not malicious, but maybe a stupid-code-detector?</li> </ul></li> <li>Lots are LLM related, lots have bots contributing code</li> <li>Publishing rate is crazy, dozens to hundreds of published versions in a day is a bug, not a feature</li> <li>Brian’s proposal, PyPI should limit releases per day for any package to something a sane human would do, even if they make a mistake on a release, to maybe like 2-3, definitely under 10, in a day. And if the repo has obvious agent contributors listed, maybe lower the limit to 1-2 a day? Honestly, “move fast and break things” doesn’t apply to breaking the commons.</li> </ul> <p><strong>Michael #4: <a href="https://tildeweb.nl/~michiel/httpx2.html?featured_on=pythonbytes">httpx2</a></strong></p> <ul> <li>More on the httpx, httpxyz, etc changes: Pydantic people started their own fork, <a href="https://github.com/pydantic/httpx2?featured_on=pythonbytes">httpx2</a>.</li> <li>Michiel says “while we think httpxyz was definitely needed, we welcome httpx2 and think it should be the ‘blessed’ fork.”</li> <li>Kludex, who is among other things maintainer of Starlette, was considering a fork</li> <li>As it stands, httpx2 is lacking the performance improvements they added to httpxyz. 
But it will not be long before they will add those, too.</li> <li>Also they already made some smart decisions: <ul> <li>they are switching from certifi to <a href="https://github.com/pydantic/httpx2/pull/209?featured_on=pythonbytes">truststore</a></li> <li>they are switching to <a href="https://github.com/pydantic/httpx2/pull/933?featured_on=pythonbytes">compression.zstd</a> on Python 3.14+, enabling zstd compression by default</li> <li>they <a href="https://github.com/pydantic/httpx2/commit/160c7f59d7942efe0133516c161d39139780eb45?featured_on=pythonbytes">merged httpcore</a> and vendored it in their repository</li> </ul></li> <li><a href="https://news.ycombinator.com/item?id=48127570&featured_on=pythonbytes">Discussion on Hacker News</a></li> </ul> <p><strong>Extras</strong></p> <p>Brian:</p> <ul> <li><a href="https://anarc.at/blog/2026-05-16-four-horsemen/?featured_on=pythonbytes">The Four Horsemen of the LLM Apocalypse</a> - Anarcat</li> <li><a href="https://www.djangoproject.com/weblog/2026/may/12/2026-django-developers-survey/?featured_on=pythonbytes">Django/JetBrains 2026 developer survey</a> is open</li> <li><a href="https://pyrefly.org/blog/v1.0/?featured_on=pythonbytes">Pyrefly 1.0</a> : “meaning we are confident that Pyrefly is ready for production use.” Michael:</li> <li>Just about ready to release Python Web Security: OWASP Top 10 with Agentic AI course. Be sure to be on <a href="https://training.talkpython.fm/getnotified?featured_on=pythonbytes">the courses newsletter</a> to get notified.</li> </ul> <p><strong>Joke:</strong> <a href="https://x.com/PR0GRAMMERHUM0R/status/1973145866962665752?featured_on=pythonbytes">Proud Parents</a></p>

May 18, 2026 08:00 AM UTC


Core Dispatch

Core Dispatch #4

Welcome back to Core Dispatch! This edition covers April 30 through May 18, 2026. Python 3.15.0 beta 1 is officially here, which means CPython's main branch is now open for 3.16 work. The first 3.16 alpha is slated for mid-October. More imminently, beta 2 is up next on June 2, with 3.13.14 and 3.14.6 following on June 9.

This is also PyCon US week, so a lot of the core team is gathered in Long Beach right now. Once recordings are available, we'll be sure to pull talks from folks on the team into a future edition.

PEP 788 has also moved from accepted to implemented, and free-threaded builds picked up thread-safe iterator support. There are also a few smaller but concrete fixes: http.server can send custom headers from the command line, AttributeError can suggest Python equivalents for method names from other languages, webbrowser on macOS is moving away from osascript, and ftplib.ftpcp() picked up the PASV CVE fix.

If you maintain a package or just like living on the edge, give the latest 3.15 beta a spin and file any issues you find.

Upcoming Releases

Official News

Merged PRs

Discussion

Upcoming CFPs & Conferences

One More Thing

"It's oompa loompa shit"

Pablo Galindo Salgado

Credits

May 18, 2026 12:00 AM UTC

May 17, 2026


Artem Golubin

PyPI packages are increasing rapidly

PyPI is the main repository for Python packages. One thing that I've noticed recently is the number of published packages per week.

Let's look at published counts of new package versions per week:

There are some dips in the data, but that's because of how the data was collected. We can see a clear increase in the number of published packages, especially in the last few months.

Because of AI, the number of packages published per week has increased by 30% since 2025.

I'm working on hexora, a library that detects malicious Python code in packages.[......]

May 17, 2026 01:37 PM UTC

May 16, 2026


PyCon

Welcome Back, NVIDIA: Visionary Sponsor of PyCon US 2026

NVIDIA is excited to once again support PyCon US 2026 as a Visionary Sponsor, and to sponsor the Future of AI with Python Conference Track.

Python is a “first-class” language at NVIDIA CUDA, and NVIDIA is committed to bringing our technology to Python developers in close alignment with C++ upon new releases of our hardware. We’re also happy to announce the general availability of CUDA Python 1.0.

NVIDIA’s commitment to Python goes well beyond just our own tech stack. NVIDIA’s Python engineers contribute across a broad swath of the Python ecosystem, from the core interpreter itself, to packaging and PyPI, to the Python community at large. NVIDIA is inspired by the energy of, and privileged to collaborate with, people across the open source Python community.

Since PyCon last year, NVIDIA Pythonistas – in collaboration with many others in the Python community – have made great progress on the evolution of various packaging standards, including working with community partners on the implementation of wheel variants and the establishment of a Packaging Council to better govern the evolution of packaging standards and PyPI. NVIDIA Python engineers are also engaged in implementation, testing, and porting work for the free-threaded build of the interpreter. NVIDIA Python engineers are driving the early exploratory work for adopting Rust for CPython, work on Python performance benchmarking, and are actively involved in many enhancements for Python 3.14 and 3.15, including providing built-in Zstandard support in Python 3.14.

At NVIDIA, we are excited to work with our partners and the open source Python community to help bring the best developer experience for users of high performance computing and AI. Come see NVIDIA at the Anaconda and PyTorch booths, and at the AI Track.

Barry Warsaw
May 2026
Principal System Software Engineer, NVIDIA
Python Core Developer since 1994
Python Steering Council member in 2026

May 16, 2026 02:30 PM UTC

May 15, 2026


Anarcat

The Four Horsemen of the LLM Apocalypse

I have been battling Large Language Models (LLM [1]) for the past couple of weeks and have struggled to think about what they mean and how to deal with their fallout.

Because the fight has come from many fronts, I've come to articulate this in terms of the Four Horsemen of the Apocalypse.

Sound track: Metallica's The Four Horsemen, preferably downloaded from Napster around 2000, but now I guess you get it on YouTube.

War: bot armies

Let's start with War. We've been battling bot armies for control of our GitLab server for a while. Bots crawl virtually infinite endpoints on our Git repositories (as opposed to downloading an archive or shallow clone), including our fork of Firefox, Tor Browser, a massive repository.

At first, we tried various methods: robots.txt, blocking user agents, and finally blocking entire networks. I wrote asncounter. It worked for a while.

But now, blocking entire networks doesn't work: they come back some other way, typically through shady proxy networks, which is kind of ironic considering we're essentially running the largest proxy network in the world.
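To give a rough idea of what "blocking entire networks" involves, here is a minimal sketch that groups client addresses by network and flags the heaviest ones. It is a stand-in that buckets by a fixed /24 prefix, not a description of how asncounter works; the function name, the sample addresses, and the threshold are all made up for illustration.

```python
from collections import Counter
import ipaddress


def count_by_network(client_ips, prefix_len=24):
    """Count requests per IPv4 network of the given prefix length."""
    counts = Counter()
    for ip in client_ips:
        # strict=False lets us pass a host address and get its network
        network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        counts[str(network)] += 1
    return counts


# Hypothetical addresses from the documentation ranges
sample_ips = ["203.0.113.5", "203.0.113.77", "198.51.100.9", "203.0.113.200"]
counts = count_by_network(sample_ips)

# Networks with at least 3 hits become block candidates (arbitrary threshold)
heavy = [net for net, hits in counts.items() if hits >= 3]
print(heavy)  # ['203.0.113.0/24']
```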

Out of desperation, we've forced users to use cookies when visiting our site. We haven't deployed Anubis yet, as we worry that bots have broken Anubis anyways and that it does not really defend against a well-funded attacker, something which Pretix warned against in 2025 already.

(We have a whole discussion regarding those tools here.)

But even that, predictably, has failed. I suspect what we consider bots are now really agents. They run full web browsers, JavaScript included, so a feeble cookie is no match for the massive bot armies.

Side note on LLM "order of battle"

We often underestimate the size of that army. The cloud was huge even before LLMs, serving about two thirds of the web. Even larger swaths of clients like government and corporate databases have all moved to the cloud, in shared, but private infrastructure with massive spare capacity that is readily available to anyone who pays.

LLMs have made the problem worse by dramatically expanding the capacity of the "cloud". We now have data centers that defy imagination with millions of cores, petabytes of memory, exabytes of storage.

I thought that 25 gigabit residential internet in Switzerland could bring balance, but this is nothing compared to the scale of those data centers.

Those companies can launch thousands, if not millions of fully functional web browsers at our servers. Computing power or bandwidth are not a limitation for them, our primitive infrastructure is. No one but hyperscalers can deal with this kind of load, and I suspect that they are also struggling, as even Google is deploying extreme mechanisms in reCAPTCHA.

This is the largest attack on the internet since the Morris worm, but while Robert Tappan Morris went to jail on a felony, LLM companies are celebrated as innovators and will soon be too big to fail. [2]

Which brings us to the second horseman, Famine.

Famine: shortages

All that computing power doesn't come out of thin air: it needs massive amounts of hardware, power, and cooling.

Earlier this year, I heard from a colleague that their Dell supplier refused to even provide a quote before August. Dell!

In February, Western Digital's hard drive production for 2026 was already sold out. Hard drives essentially doubled in price within a year, and some have now tripled. A server quote we had in November has now quadrupled, going from 10 thousand to FORTY thousand dollars for a single server.

But regular folks are facing real-life shortages as well, as city-size data centers are being built at breakneck speed, stealing fresh water and energy from human beings to feed the war machine.

We've been scared of losing our jobs, but it seems that apocalypse has yet to fully materialize. Regardless, for engineers, the market feels tighter than it was a couple of years ago, and everyone is on edge, feeling that they will have to learn to operate LLMs just to keep their jobs.

Which brings us, of course, to Death.

Death: security and copyright

Our third horseman is one I did not expect a couple of months ago. Back at FOSDEM, curl's maintainer Daniel Stenberg famously complained about the poor quality of LLM-generated reports but then, a few months later, everyone is scrambling to deal with floods of good reports.

In the past two weeks, this culminated in a significant number of critical security issues across multiple projects. Chained together, remote code execution vulnerabilities in Nginx and Apache and two local privilege escalations in the Linux kernel (dirtyfrag and fragnesia) essentially gave anyone root access to any unpatched server on the web.

As I write this, another vulnerability dropped, which gives read access to any file to a local user, compromising TLS and SSH private keys.

All those vulnerabilities were released without any significant coordination while people scrambled to mitigate.

Many people including Linus Torvalds are now considering issues discovered through LLMs to be essentially public. This puts some debates about disclosure processes in perspective, to say the least.

But this is not merely the death of the traditional coordinated disclosure process, the C programming language, or the Linux kernel: remember that those bots are trained on a large corpus of copyrighted material. Facebook has trained their models on pirated books and Nvidia has done deals with Anna's Archive to secure access to large swaths of copyrighted material. The US Congress seems to think LLM outputs are not copyrightable, like any other machine outputs.

With many people now vibe coding their way out of learning or remembering how computers work, is this the Death of Copyright?

And that, of course, brings us to the final horseman: Pestilence.

Pestilence: slop

There is a growing meme that programming is essentially over as we know it. That you can simply vibe-code applications from scratch and it's pretty good.

Maybe that's true.

So far, most of my attempts at resolving any complex problem with an LLM have failed in bizarre ways. Some worked surprisingly well. Maybe, of course, I am holding it wrong.

I personally don't believe LLMs will ever be good enough to produce and maintain software at scale. They're surprisingly good at finding security flaws right now. But what I see is also a lot of Bullshit, with a capital B. It's not lying: it does not "know" anything, so it can't lie. It's misleadingly cohesive and deliberate, but it lacks meaning, intent, will.

I have not been confronted with much slop, apart from the lobster Jesus or the yellow man atrocities, and particularly not in my work. But I see what it is doing to my profession: beyond vibe-coding, people are now token-maxxing, and land-grabbing their colleagues.

I don't like what LLMs do to our communities, or the fabric of software we live with.

Software does not evolve in a void. It is a team effort, be it free software or a corporate product. Generations of humans have carefully built the scaffolding of technology required for modern networks and software to operate, in a convoluted contraption that no single human fully understands anymore.

The idea of simply giving up on that understanding entirely and delegating it to an unproven model is not only chilling, it feels just plain stupid. Not stupid as in Skynet, stupid as in "I can't get inside the data center because the authentication system is down". Except we're in a "the power plant doesn't reboot" or "their LLM found an 0day in our slop" kind of stupid.

The fifth horseman

Researching for this article, I looked up the four horsemen and found out they originally seem to have been:

I was surprised. I grew up thinking about the horsemen being Famine, War, Pestilence, and Death. So I went back to my original source which actually claims the horsemen are:

Time has taken its toll on you, the lines that crack your face.
Famine, your body, it has torn through, withered in every place.
Pestilence for what you've had to endure, and what you have put others through
Death, deliverance for you, for sure, now there's nothing you can do

So I guess that makes no sense either, which, fair enough, I shouldn't rely on Metallica for theological references. Especially since that song was originally called Mechanix and was "about having sex at a gas station".

Anyways.

The point is, there are actually five horsemen, and the fifth one is, in my opinion, Conquest.

Those companies (and not "AI", mind you) are taking over the world. I sense a strong connection with the "post-truth" world imposed on us by fascists like Trump and Putin. It's not an accident, it's a power grab, part of the Californian Ideology. [3] Just like Airbnb broke housing, Uber destroyed transportation, and Amazon is taking over retail and server hosting, LLM companies are essentially trying to take over, if not everything, at least Cognition as a whole.

But the capitalization of those companies (OpenAI and Nvidia in particular) is so far beyond reason that their inevitable collapse will likely lead to a global financial collapse of biblical proportions.

Because they will inevitably fail like previous bubbles they are built on. And when they fail, I hope it zips all the way back through the blockchain scam, the ad surveillance system, and the dot com then git me back my internet.

The Tower of Babel

While I'm off in the woods hallucinating (ha!) on biblical allegories, I feel there's another sign that the apocalypse is coming.

The Tower of Babel myth says that humans tried to create a big tower up to heaven and become god. God confounds their speech and scatters the human race. End of utopia.

This is what is happening to our human translators now. LLMs being, after all, Language Models, they are excellent at translation work. So much so that the only translators not replaced by LLMs right now are interpreters, who translate vocally in real time. But interpreters are worried about their jobs as well.

This concretely means we will lose the human capacity, as a civilization, to translate between each other. It is still an open question whether the remaining revision work will be enough for translators to avoid deskilling, but other research has shown that LLM use leads to cognitive decline, impacts critical thinking, and generally, that deskilling is a common outcome.

Ultimately, I think this is where LLMs bring us. Towards collapse.

So this is a call to arms. Fight back!

Poison bots. Build local real-world communities.

Go low tech. Moore's law is dead, make use of it.

Patch your shit. Go weird.

Refuse slop. Train your brain.

The horsemen will collapse, but let's not go down with them.

Butlerian Jihad!

This article was written without the use of a large language model and should not be used to train one.


  1. I prefer "LLM" to Artificial Intelligence, as I don't consider models to have "Intelligence" which goes far beyond the analytical traits we train models for. Intelligence requires embodiment and social interaction; machines lack the innate human skills of empathy, feeling and care, which explains a lot of the evils behind the current trends.
  2. It should be noted that Morris also happened to be one of the founders of Y Combinator, where he is in good company with other techno-fascists like Peter Thiel, Sam Altman, and so on. Crime, after all, pays.
  3. Probably a good time to watch All Watched Over by Machines of Loving Grace.

May 15, 2026 09:25 PM UTC


PyCharm

Pyrefly LSP Integration with Type Engine in PyCharm 2026.1.2

In PyCharm 2026.1.2, you can enable Pyrefly as an external type provider, dramatically increasing the speed of the IDE’s code insight features.

What is the Pyrefly LSP?

“LSP” stands for the Language Server Protocol – a standardized protocol that allows code editors and IDEs to communicate with language servers. The LSP enables language servers to provide code intelligence features, such as:

The key benefit of the LSP is that it allows a single language server to be used across multiple tools. This means that language-specific intelligence does not have to be implemented separately in every editor, IDE, or CI pipeline.
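As a rough illustration of what travels over that protocol, here's a minimal sketch of how an editor could frame a hover request for a language server. The LSP base protocol sends JSON-RPC 2.0 messages preceded by a Content-Length header; the file URI, position, and helper name below are hypothetical, and this is not PyCharm's or Pyrefly's actual client code.

```python
import json


def encode_lsp_message(payload: dict) -> bytes:
    """Frame a JSON-RPC payload with the LSP Content-Length header."""
    body = json.dumps(payload).encode("utf-8")
    # The header reports the body length in bytes, followed by a blank line
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


# Hypothetical request: ask for hover info at line 10, column 4
hover_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "textDocument/hover",
    "params": {
        "textDocument": {"uri": "file:///project/app.py"},
        "position": {"line": 10, "character": 4},
    },
}

message = encode_lsp_message(hover_request)
header, body = message.split(b"\r\n\r\n", 1)
print(header.decode("ascii"))
```

Because every editor speaks the same framing and method names, the same server process can answer requests from any LSP-capable tool.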

Pyrefly is Meta’s next-generation Python type checker, engineered from the ground up in Rust to replace its predecessor, Pyre (written in OCaml). With the move to Rust, Pyrefly achieves significantly faster performance and improved cross-platform portability. More than just a rewrite, it is designed to be more capable and robust, offering an efficient toolset for maintaining large-scale Python codebases with high precision and minimal overhead.

Pyrefly provides the following benefits:

Pyrefly is highly beneficial for projects and developers dealing with large, complex Python codebases that prioritize performance and robust typing. Integrating Pyrefly via the LSP is part of our ongoing work to enhance code insight performance in PyCharm.

Using Pyrefly in PyCharm

Once enabled, Pyrefly powers all code insight functionality in PyCharm, including type inference and type-related diagnostics, quick documentation, and inlay hints. Delegating analysis to this faster engine delivers significantly improved performance.

To start using Pyrefly in your PyCharm project, go to the Type widget at the bottom of the window. By default, the IDE uses the built-in type engine. Click on the widget and select the option to use Pyrefly. If you do not have Pyrefly installed yet, PyCharm will install it automatically. 

Once you’ve switched to the Pyrefly type engine, you will see a Pyrefly icon at the bottom, which you can hover over to check the version being used.

Please note that the integration currently works for local interpreter configurations. Support for Docker, Docker Compose, WSL, SSH, and multi-module projects is planned for future releases.

Pyrefly vs. the built-in type engine

Now let’s look at how Pyrefly and the built-in type engine behave in a complex Python project. In this FastAPI example, multiple files are typed, but in this file, the variable ref is incorrectly typed, causing four errors. When using the built-in type engine, the IDE identifies that something is wrong, but it suggests running further analysis to fix the problem, which requires an extra step.

Using Pyrefly as the type engine, the IDE reports errors immediately and highlights where they originate. However, it is worth noting that, in our example, there are four errors, but Pyrefly picks up only three of them. It misses the one in self._storage[ref].
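To illustrate the kind of mistake being discussed, here's a small sketch of an incorrectly typed variable used as a storage key. It is not the FastAPI project from the screenshots: the ItemStore class and its methods are hypothetical, and which of these lines a given type engine reports may differ.

```python
from typing import Optional


class ItemStore:
    """Hypothetical storage class keyed by integer references."""

    def __init__(self) -> None:
        self._storage: dict[int, str] = {}

    def add(self, ref: int, value: str) -> None:
        self._storage[ref] = value

    def get(self, ref: int) -> Optional[str]:
        return self._storage.get(ref)


store = ItemStore()
store.add(1, "first")

# ref is annotated as str, but the store expects int keys.
# The code still runs, so only a type checker can catch the mismatch.
ref: str = "1"
result = store.get(ref)  # a type checker should flag this argument

print(result)  # None, because "1" is not the same key as 1
```

The runtime silently returns None here, which is why immediate type diagnostics in the editor are valuable for this class of bug.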

Download the latest version of PyCharm and try it out

Ready to experience a dramatic leap in Python development performance? The Pyrefly type engine in PyCharm 2026.1.2 delivers the next generation of type checking. Engineered in Rust for unparalleled speed, it resolves files in as little as 0.5–1 seconds, significantly faster than the built-in engine. If you maintain large, complex Python codebases and prioritize robust typing, this feature is essential, as it allows you to delegate analysis to a faster engine and receive immediate type-related diagnostics. Download the latest version of PyCharm (2026.1.2) to unlock superior efficiency, scalability, and code insight.

May 15, 2026 03:31 PM UTC


Real Python

The Real Python Podcast – Episode #295: Agentic Architecture: Why Files Aren't Always Enough

What are the limitations of using a file-based agent workflow? Why do massive context windows tend to collapse? This week on the show, Mikiko Bazeley from MongoDB joins us to discuss agentic architecture and context engineering.



May 15, 2026 12:00 PM UTC

Quiz: Python's Array: Working With Numeric Data Efficiently

In this quiz, you’ll test your understanding of Python’s Array: Working With Numeric Data Efficiently.

By working through this quiz, you’ll revisit the differences between Python’s array module and the built-in list, the meaning of type codes, how to create and manipulate arrays as mutable sequences, and the performance trade-offs of using a low-level numeric container.



May 15, 2026 12:00 PM UTC


EuroPython

May Newsletter: Sessions, Speakers, Sprints

Hi all Pythonistas! 👋 

Hope you’ve been enjoying these last few weeks, and hopefully planning your trip to Kraków in July! With two months left before the conference, the EuroPython organising team has been firing on all cylinders to create a conference to remember. Here’s the latest from us:

📋 Session and Speaker Lists Are Available

Our Programme Team is busy preparing a detailed schedule for you. We plan to release it in the upcoming days, but in the meantime we’ve got the list of sessions and speakers for you to check out. It’s going to be an exciting conference!

Lists of sessions and speakers are available at https://ep2026.europython.eu/

👉 All conference sessions: https://ep2026.europython.eu/sessions/

👉 Speakers and tutorial leads: https://ep2026.europython.eu/speakers/ 

🗻 Language & Rust Summits

Summits are an opportunity for project contributors to come together during EuroPython. These are invite-only events with limited capacity at the venue, so registration is required.

🐍 Language Summit

The Python Language Summit is an event for the developers of Python implementations (CPython, PyPy, MicroPython, GraalPython, IronPython, and so on) to share information, discuss our shared problems, and — hopefully — solve them.

These issues might be related to the language itself, the standard library, the development process, the status of Python 3.15 (and plans for 3.16), the documentation, packaging, the website, and so forth. The Summit focuses on discussions and consensus-seeking, more than merely on presentations.

👉 Register for the Language Summit: https://ep2026.europython.eu/language-summit/

⚙️ Rust Summit

This full-day summit is dedicated to exploring the intersection of Rust and the Python ecosystem. Attendees can expect an intensive schedule focused specifically on integrating Rust into Python projects and the development of high-performance Python tools (e.g., using technologies like PyO3, Maturin, or writing performant native extensions). 

This summit is designed for developers who already possess some practical experience in these topics and are looking to deepen their expertise, share lessons learned, and contribute to the community’s collective knowledge.

👉 Register for the Rust Summit: https://ep2026.europython.eu/session/rust-summit-at-europython

🗣️ Keynote Speakers

We are excited to announce a new keynote: 

Leah Wasser will deliver a keynote at EuroPython 2026

Leah Wasser is the Executive Director and founder of pyOpenSci, a community of 400+ researchers, engineers, and maintainers working to make developing and maintaining research software more accessible, sustainable, and human. She organizes the Maintainers Summit at PyCon US and believes the communities behind research software matter as much as the code itself.

Leah has built nationally recognized programs at the National Ecological Observatory Network (NEON) and the University of Colorado Boulder. Leah holds a PhD in ecology and is an active open source maintainer.

✋ Upcoming Call for Volunteers

We’re opening our Call for Volunteers next week! Want to be part of the team and help make EuroPython 2026 awesome? Keep an eye on the website: the signup form drops in just a few days. We’ll be reviewing applications on a rolling basis, so don’t wait – apply as soon as it goes live! Whether you’re a first-timer or a returning volunteer, we’d love to have you.

In my opinion, volunteering enriches the enjoyment of the whole event even further. There are many different roles to suit different personalities and abilities — one of them could suit you very well. Also, volunteering is about the team; you will not be left alone in any case.

Jake Balas, Onsite Volunteers Team Lead at EuroPython 2025 and this year’s Operations Team Lead

💙 Read our full interview with Jake https://blog.europython.eu/humans-of-ep-jake/

💰 Sponsorship: Diamond, Platinum, Silver Available 

If you’re passionate about supporting EuroPython and helping make this conference accessible to a diverse, global Python community, consider becoming a sponsor or asking your employer to join us in this effort.

By sponsoring EuroPython, you’re not just backing an event – you’re gaining highly targeted visibility that will present your company or personal brand to one of the largest and most diverse Python communities in the world! Here’s what one of our sponsors said about their experience at EuroPython 2025:

The Apify team shares their experience sponsoring EuroPython 2025

We still have some Diamond, Platinum, and Silver slots available. Along with our main packages, there are optional add-ons and extras to craft your brand messaging in exactly the way that you need. 

👉 More information at: https://ep2026.europython.eu/sponsorship/sponsor/ 

👉 Contact us at sponsoring@europython.eu

🚧 Speaker Orientation

Anyone interested in receiving speaker training from our experienced mentors is invited to an online workshop on 3 June 2026 at 18:00 CEST. We’ve designed the session for people of all experience levels, from first-time speakers to seasoned presenters, and we still have spots for you.

👉 Register now to confirm your place: https://forms.gle/uZKwuAiBkUSmx7gn7

🤝 Community Partners

🇪🇸PyConES 

Barcelona is calling, Pythonistas! PyConES 2026 has extended its CFP. New deadline: 17 May, 23:59 CEST. If you’re still thinking about submitting a talk, workshop, or idea to the community that will meet up in that gorgeous city, you only have a few days left.

👉 Submit the proposal for PyConES 2026 https://pretalx.com/pycones-2026/cfp 

🦬PyStok

PyStok #82 meetup lands on 20 May, 18:00 at Zmiana Klimatu in Białystok, Poland, and free registration is officially live. Grab your spot at https://pystok.org/najblizsze-wydarzenie to dive deep into RAG/LLM Wiki and the PLLuM (Polish Large Language Model) project. Between the "speed dating" networking, JetBrains giveaways and the legendary "Podlaskie afterparty", it’s the perfect spot to soak up those unique North-East Polish vibes and talk Python and AI with the local crowd.

📣 Community Outreach

🏖️PyCon US

Several members of the EuroPython Society have traveled across the ocean to join the biggest gathering of Pythonistas, which this year takes place in Long Beach, California. If you’re there this weekend, make sure to look up the EuroPython booth and say “hi” to the team!

🎁 Sponsor Spotlight

We’d like to thank Manychat for sponsoring EuroPython.

Manychat builds AI-powered chat automation for 1M+ creators and brands at real production scale.

View job openings at Manychat

👋 Stay Connected

Follow us on social media and subscribe to our newsletter for all the updates:

👉 Sign up for the newsletter: https://blog.europython.eu/portal/signup

We’ll be announcing more keynotes in the upcoming days, and the detailed schedule will be available soon, so you can plan your conference experience. Just eight weeks are left before we all meet in the City of Castles and Dragons. See you there! 🐍❤️

Cheers,

The EuroPython Team

May 15, 2026 06:00 AM UTC

May 14, 2026


Real Python

Quiz: Cursor vs Windsurf: Which AI Code Editor Is Best for Python?

In this quiz, you’ll test your understanding of Cursor vs Windsurf: Which AI Code Editor Is Best for Python?

By working through these questions, you’ll revisit how the two editors differ across code completion, agentic multi-file editing, and debugging.

You’ll also reconnect with the audit points worth applying whenever an AI agent writes Python on your behalf.


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

May 14, 2026 12:00 PM UTC

Quiz: Python Metaclasses

In this quiz, you’ll test your understanding of Python Metaclasses.

Metaclasses sit behind every class you write in Python, and they’re one of the language’s deeper object-oriented concepts. By working through this quiz, you’ll revisit how classes are themselves objects, how type creates them, and how a custom metaclass lets you customize class creation.

You’ll also reflect on when a custom metaclass is actually the right tool and when a simpler technique does the job better.



May 14, 2026 12:00 PM UTC


Python Engineering at Microsoft

PyCon US 2026

Come See Us at PyCon US 2026!

Microsoft and GitHub will be at PyCon US 2026, May 14–17 in Long Beach, CA. Stop by our booth, say hello, and tell us about your experience with our tools and services. We’d love to meet you.

Don’t miss the Meta booth on Saturday at 1 p.m., where we’ll be showing off the integration of Pylance with Meta’s new Pyrefly type checker. The integration is currently in early preview in our Insiders build, and we can’t wait to bring it to all our users later this year.

Hands-on Labs at the Booth

Drop in for 10-minute interactive labs covering:

Talks and Sessions

Date & Time | Room | Session | Speaker
Wed, May 13 · 9:00 a.m.–12:30 p.m. | 101A | Build your first MCP server in Python | Pamela Fox
Wed, May 13 · 1:30 p.m.–2:30 p.m. | 201B | Dungeons and Databases: Build NPC agents to work with data in DocumentDB and Postgres (Microsoft Sponsor session) | Marko Hotti, Patty Chow
Thu, May 14 · 2:40 p.m.–3:05 p.m. | 104C | Education Summit: Big Lessons from Small Models, Teaching Python AI with SLMs | Gwyneth Peña-Siguenza
Thu, May 14 · 3:40 p.m.–4:05 p.m. | 104C | Education Summit: Your Slides, But Faster, Building an AI-powered presentation workflow | Pamela Fox
Fri, May 15 · 3:30 p.m.–4:00 p.m. | 104C | PyCharlas: Cómo pasé de perdida a enseñar Python + IA a miles, en un año | Gwyneth Peña-Siguenza
Sat, May 16 · 2:30 p.m.–3:45 p.m. | 201A | Maintainer Summit Tools Track: Dev Containers | Sarah Kaiser
Sun, May 17 · 1:00 p.m.–1:30 p.m. | Grand Ballroom A | A bridge over (not) troubled waters: Collecting marine data from your couch | Sarah Kaiser

Can’t wait to see you there!

The post PyCon US 2026 appeared first on Microsoft for Python Developers Blog.

May 14, 2026 12:18 AM UTC


Bob Belderbos

Learn agentic AI in Python with 10 small exercises

Most "build an AI agent" tutorials hand you a framework and skip the part where you actually understand what it's doing under the hood. When the abstraction breaks, you can't debug it because you never built the layer underneath. Juanjo and I think that gap is worth closing.

Yesterday we shipped 10 small browser-based exercises that walk through that layer one pattern at a time (more on how we run them in the browser with Pyodide here).

This article is the conceptual journey behind them: how you get from "I can call Claude" to a complete agent loop with a testable architecture and a human-in-the-loop workflow. Each stage builds on the previous one.

Stage 1: make a model reply (exercise 1)

Every agent app starts with the same 3-line skeleton. Build a client, call messages.create, read content[0].text. The shape doesn't change much. Only what wraps around it does.

import anthropic

client = anthropic.Anthropic()
msg = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=256,
    messages=[{"role": "user", "content": "Say hi"}],
)
print(msg.content[0].text)

Why content[0].text and not .text? Because content is a list of blocks (text, tool_use, and others). That list is how tool use plugs in later without breaking the response shape. Get this mental model before anything else.
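To make the list-of-blocks shape concrete, here is a minimal sketch that pulls text out by block type instead of by position. The SimpleNamespace objects are stand-ins for a real response, and collect_text is a hypothetical helper name, not part of the anthropic SDK.

```python
from types import SimpleNamespace


def collect_text(content):
    """Join the text of every block whose type is "text", skipping the rest."""
    return "".join(block.text for block in content if block.type == "text")


# Stand-in for msg.content: a text block followed by a tool_use block.
fake_content = [
    SimpleNamespace(type="text", text="Let me check that."),
    SimpleNamespace(type="tool_use", id="toolu_1", name="add", input={"a": 1, "b": 2}),
]

print(collect_text(fake_content))
```

Filtering on type keeps working once tool_use blocks start showing up in the same list.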

Stage 2: make the reply machine-readable (exercises 2, 3)

Raw LLM strings are unreliable. The fix is two paired habits: a specific system prompt that locks the output shape, and a Pydantic model that validates it on the way back in.

from pydantic import BaseModel

class ExpenseResult(BaseModel):
    category: str
    confidence: float

result = ExpenseResult.model_validate_json(msg.content[0].text)

Treat the system prompt like an API contract. Say "JSON only", show the literal shape, forbid improvisation ("no punctuation, no explanation, nothing else"). The phrase "nothing else" is doing real work; without it, models love to append a friendly sentence that breaks your parser.
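Here is a rough sketch of what such a contract could look like, and of what happens when the model appends that friendly sentence. The prompt wording, the 0 to 1 confidence range, and the parse_expense helper are all assumptions for illustration. To keep the sketch dependency-free it uses json and a dataclass as a stand-in for the Pydantic model, so it is not the exercises' own code.

```python
import json
from dataclasses import dataclass

# Hypothetical prompt: states the format, shows the shape, forbids extras.
SYSTEM_PROMPT = (
    "Classify the expense. Reply with JSON only, in exactly this shape: "
    '{"category": "<string>", "confidence": <number between 0 and 1>}. '
    "No explanation, nothing else."
)


@dataclass
class ExpenseResult:
    category: str
    confidence: float


def parse_expense(raw):
    """Return an ExpenseResult, or None if the reply breaks the contract."""
    try:
        data = json.loads(raw)
        category, confidence = data["category"], data["confidence"]
    except (json.JSONDecodeError, KeyError, TypeError):
        return None
    if not isinstance(category, str) or isinstance(confidence, bool):
        return None
    if not isinstance(confidence, (int, float)) or not 0 <= confidence <= 1:
        return None
    return ExpenseResult(category, float(confidence))


good = '{"category": "travel", "confidence": 0.9}'
chatty = good + "\nHope that helps!"
print(parse_expense(good))
print(parse_expense(chatty))
```

The second reply is rejected because the trailing sentence makes the whole string invalid JSON, which is exactly the failure "nothing else" is there to prevent.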

Stage 3: make it remember (exercise 4)

LLMs don't remember anything. They have no state, no memory, no context beyond the current call. The "conversation" is a fiction we create by sending the whole message history every time.

To get a continuous conversation, you keep the list of {"role": ..., "content": ...} dicts and send the whole thing every turn. Append the user message before the call, the assistant reply after. Roles must alternate.

history = []

def ask(user_msg):
    history.append({"role": "user", "content": user_msg})
    reply = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=512,
        messages=history,
    ).content[0].text
    history.append({"role": "assistant", "content": reply})
    return reply

State lives in your code, not the model. That single realization clears up most of the confusion students have about context windows and "memory."
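You can see this without an API key. In the sketch below, fake_send is a hypothetical stand-in for the API call that simply reports how many messages it received, and ask takes the history and the sender as arguments so nothing is hidden in globals.

```python
def ask(history, user_msg, send):
    """Append the user turn, send the full history, then append the reply."""
    history.append({"role": "user", "content": user_msg})
    reply = send(list(history))  # the whole conversation so far, every time
    history.append({"role": "assistant", "content": reply})
    return reply


def fake_send(messages):
    """Hypothetical stand-in for the model: reports how much it was sent."""
    return f"I can see {len(messages)} message(s)"


history = []
print(ask(history, "Hi", fake_send))
print(ask(history, "Still there?", fake_send))
```

The second call receives three messages, not one: the "memory" is just the list growing on your side.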

Stage 4: give the model hands (exercise 5)

Tool use turns a chatbot into something that can act. The loop is dumber than people think:

while True:
    response = client.messages.create(..., tools=TOOLS, messages=messages)
    if response.stop_reason == "end_turn":
        return response.content[0].text
    # else: run the tool the model asked for, append the result, loop again

Two gotchas: append the full response.content as the assistant turn (it contains the tool_use blocks the model needs to see), and tool results come back wrapped in a user message, not assistant.
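Here is a minimal sketch of the full loop with both gotchas in place. It runs against a scripted FakeMessages class (a hypothetical stand-in), so no API key is needed. The stub ignores model and max_tokens, which a real call would also require, and the single hardcoded add tool is an assumption for illustration.

```python
from types import SimpleNamespace


def add(a, b):
    return a + b


def run_agent(client, messages, tools):
    """Loop until the model stops asking for tools, then return its text."""
    while True:
        response = client.messages.create(tools=tools, messages=messages)
        # Gotcha 1: the assistant turn is the full content list.
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason == "end_turn":
            return response.content[0].text
        # Gotcha 2: tool results go back wrapped in a user message.
        results = [
            {"type": "tool_result", "tool_use_id": block.id,
             "content": str(add(**block.input))}
            for block in response.content if block.type == "tool_use"
        ]
        messages.append({"role": "user", "content": results})


class FakeMessages:
    """Scripted stand-in: first asks for the add tool, then answers."""

    def __init__(self):
        self.calls = 0

    def create(self, **kwargs):
        self.calls += 1
        if self.calls == 1:
            block = SimpleNamespace(type="tool_use", id="toolu_1",
                                    name="add", input={"a": 2, "b": 3})
            return SimpleNamespace(stop_reason="tool_use", content=[block])
        # Read back the tool result the loop just appended.
        tool_output = kwargs["messages"][-1]["content"][0]["content"]
        text = SimpleNamespace(type="text", text=f"The sum is {tool_output}.")
        return SimpleNamespace(stop_reason="end_turn", content=[text])


client = SimpleNamespace(messages=FakeMessages())
messages = [{"role": "user", "content": "What is 2 + 3?"}]
print(run_agent(client, messages, tools=[]))
```

Each tool_result carries the id of the tool_use block it answers, which is how the model matches results to requests.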

Stage 5: make it swappable and testable (exercises 6, 7, 8)

By exercise 6 the chatbot works, but it's also often a highly coupled mess importing external dependencies like anthropic and sqlite3 into the business logic. Time for three common patterns, applied to LLM apps:

That's the four-layer agent architecture, built piece by piece instead of dumped on you all at once.
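As one illustration of getting anthropic out of the business logic, here is a small sketch using a Protocol and a test double. It is an assumption about how the decoupling could look, not the exercises' actual design: LLMClient, categorize, and FakeLLM are hypothetical names.

```python
from typing import Protocol


class LLMClient(Protocol):
    """Anything with a complete() method can play the model's role."""

    def complete(self, prompt: str) -> str: ...


def categorize(description: str, llm: LLMClient) -> str:
    """Business logic: knows nothing about anthropic or any other SDK."""
    reply = llm.complete(f"Categorize this expense: {description}")
    return reply.strip().lower()


class FakeLLM:
    """Test double that returns a canned answer and records each prompt."""

    def __init__(self, answer):
        self.answer = answer
        self.prompts = []

    def complete(self, prompt):
        self.prompts.append(prompt)
        return self.answer


fake = FakeLLM(" Travel \n")
print(categorize("Train ticket", fake))
```

A real adapter would wrap client.messages.create behind the same complete() method, so tests never touch the network.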

Stage 6: keep a human in the loop (exercise 9)

When the model returns a confidence score, use it. Above the threshold: auto-accept. Below: show the suggestion and let the user confirm or override.

def process(result, threshold=0.8):
    if result.confidence >= threshold:
        return result.category
    answer = input(f"Accept '{result.category}'? (Enter to confirm): ").strip()
    return answer or result.category

Make the accept path the cheapest action (empty input or y). Users pay the manual handling cost only when overriding. This is what separates a trusted assistant from one that quietly mislabels things, and it's the gap between "AI demo" and production-ready workflow.
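To exercise both paths without typing at a prompt, one option is to pass the prompt function in. The sketch below is a variation on process under that assumption: the ask parameter is hypothetical and defaults to input, and ExpenseResult is redefined as a plain dataclass so the example stands alone.

```python
from dataclasses import dataclass


@dataclass
class ExpenseResult:
    category: str
    confidence: float


def process(result, threshold=0.8, ask=input):
    """Auto-accept confident results; otherwise let the user confirm or override."""
    if result.confidence >= threshold:
        return result.category
    answer = ask(f"Accept '{result.category}'? (Enter to confirm): ").strip()
    return answer or result.category


# Confident result: accepted without asking anyone.
print(process(ExpenseResult("travel", 0.95)))
# Low confidence: a scripted "user" overrides the suggestion.
print(process(ExpenseResult("travel", 0.40), ask=lambda prompt: "meals"))
# Low confidence: empty input keeps the suggestion.
print(process(ExpenseResult("travel", 0.40), ask=lambda prompt: ""))
```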

Stage 7: generalize the loop (exercise 10)

The agent is exercise 5 with one change: replace the hardcoded function call with a TOOL_FUNCTIONS[name] lookup.

TOOL_FUNCTIONS = {
    "add": lambda a, b: a + b,
    "multiply": lambda a, b: a * b,
}
# inside the loop:
content = str(TOOL_FUNCTIONS[block.name](**block.input))

Now adding a tool is one schema entry plus one dict entry. Swap add/multiply for search_web, query_db, send_email and the loop is identical. Look at agent frameworks under the hood (LangChain, OpenAI Assistants) and you'll see this same pattern.
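A bare dict lookup raises KeyError if the model asks for a tool that doesn't exist. Here is a slightly more defensive sketch of the same dispatch: run_tool is a hypothetical helper, and returning error strings to the model is one possible design choice, not the only one.

```python
TOOL_FUNCTIONS = {
    "add": lambda a, b: a + b,
    "multiply": lambda a, b: a * b,
}


def run_tool(name, arguments):
    """Look up and call a tool, returning a string the model can read."""
    func = TOOL_FUNCTIONS.get(name)
    if func is None:
        return f"Error: unknown tool '{name}'"
    try:
        return str(func(**arguments))
    except TypeError as exc:
        # Raised for missing or unexpected arguments (and mismatched types here).
        return f"Error: bad arguments for '{name}': {exc}"


print(run_tool("add", {"a": 2, "b": 3}))
print(run_tool("multiply", {"a": 4, "b": 5}))
print(run_tool("divide", {"a": 1, "b": 2}))
```

Inside the loop, the call would then be run_tool(block.name, block.input), and the model gets a readable message instead of your program crashing.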

What the journey teaches

Frameworks make sense once you can write the layer underneath. Skip that, and you are stuck the first time the abstraction leaks. After coaching many developers through this, the dividing line is clear: have they ever written the loop themselves?

The 10 exercises are deliberately small. The arc matters more than any single one. Once you've done them, "agentic AI" stops being "magic" and starts being a loop, a schema, and some patterns you might already know.

Try them out:

  1. In the browser: pythonagenticai.com/exercises. No install, no API key, no dependencies. Loads fast.
  2. Locally: clone the repo and work through them in your IDE.


May 14, 2026 12:00 AM UTC