
Planet Python

Last update: June 04, 2026 07:44 PM UTC

June 04, 2026


The Python Coding Stack

Down The Iterator Rabbit Hole

You know that street game where the performer (con artist?) has three opaque cups and a small ball. He places the cups upside down on the table, with the ball under one of the cups. He quickly shuffles the cups around and then asks the player to guess which cup has the ball. You’ve seen the game on TV, even if you’ve not seen it in real life.

Following what’s happening when you have a chain of iterators in Python can feel like playing that game. But, unlike the street game, there are no scams when you’re playing the iterator game. Let’s make sure you’ll always win.

I’ll keep this article short. I’ve written many articles about iterables and iterators. If you need to refresh your memory, have a look at The Anatomy of a for Loop and A One-Way Stream of Data • Iterators in Python (Data Structure Categories #6).

Follow The Data in a Chain of Iterators

Let’s keep the example simple. Start with this list in a REPL session:

All code blocks are available in text format at the end of this article • #1

A list is iterable. You can create an iterator from any iterable. Let’s create an iterator from this list:

#2

The built-in function iter() creates an iterator from an iterable. Iterators don’t contain data. They don’t create copies of the data. They’re lightweight objects that create a stream. They’ll fetch data from the original source, which is the list boring_numbers in this case, as and when needed.

Iterators can only fetch an item once. So, they’re a one-way stream. Once you use an item, it’s gone from the iterator – but not from the original list, which remains unchanged.
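To see the one-way stream in isolation, here is a minimal sketch that uses a separate, shorter list so it doesn't disturb the REPL session in this article. The names short_list and stream are made up for this illustration:

```python
short_list = [1, 2, 3]
stream = iter(short_list)

first_pass = list(stream)   # fetches every item from the stream
second_pass = list(stream)  # the stream is exhausted, so nothing is left

print(first_pass)   # [1, 2, 3]
print(second_pass)  # []
print(short_list)   # [1, 2, 3]
```

The iterator is used up after the first pass, but the list it reads from is untouched.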

Therefore, first_iter is an iterator that relies on data from the list boring_numbers. But let’s not fetch any items from the first_iter iterator. Not yet, anyway.

Create a second iterator. This time, you’ll use a generator expression. Generators are iterators, so you create a second iterator with this code:

#3

Note that the expression on the right-hand side of the equals sign is enclosed in parentheses – the round ones, to be clear. This is a generator expression, which creates a generator iterator. Read Pay As You Go • Generate Data Using Generators (Data Structure Categories #7) for more on generators.

As we said, generators are iterators.

The second_iter iterator generates data from first_iter, which is itself an iterator. Iterators are also iterable, which is why you can use them directly in a for clause or anywhere else you’d generally use an iterable. The second_iter iterator will yield the values as floats. But you’ve not yielded any value from this iterator either. Not yet.
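If you want to check the claim that iterators are also iterable, here is a small sketch on a separate list (letters and letters_iter are names invented for this example). Calling iter() on an iterator hands back the same object, which is why a for loop accepts it directly:

```python
letters = ["a", "b", "c"]
letters_iter = iter(letters)

# An iterator is its own iterator
print(iter(letters_iter) is letters_iter)  # True

# A list is not: iter() builds a new iterator object from it
print(iter(letters) is letters)  # False

# So an iterator can be used wherever an iterable is expected
collected = []
for letter in letters_iter:
    collected.append(letter)

print(collected)  # ['a', 'b', 'c']
```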

Let’s go a step further and create a third iterator, which is also a generator in this case. You build this third iterator from the second one, second_iter:

#4

The generator iterator third_iter yields the sum of 0.5 and the value yielded by second_iter.

Incidentally, I used a “standard” iterator and two generator iterators in this example. However, for the journey we’re following in this article, it doesn’t matter whether we’re using a basic iterator or a generator iterator. If you prefer, you can repeat this exercise with iterators you get from iter() directly.


Don’t Blink • Follow the Data

You started with a list called boring_numbers. This data structure contains* the data. It’s where the data lives. We’ll be following the data in this section. So it’s important to know where it’s stored!


*Note: Lists, like all data structures, don’t really contain data in the purest sense of the word. See What’s In A List—Yes, But What’s Really In A Python List for more on this. But in general, it’s fine to talk about a list ‘containing’ items of data.


You then create three iterators. The first uses data from boring_numbers. The second iterator uses data from the first. And the third iterator uses data from the second.

But you haven’t tried to fetch any value from any of the iterators yet.

Let’s look at what each iterator is doing at the moment before you fetch any values. The first iterator, first_iter, is pointing at the first item in boring_numbers. It’s ready to read this value and yield it.

The second iterator, second_iter, is pointing at the first item in first_iter. But first_iter doesn’t have any data. Iterators don’t have their own data. But that’s OK. Whenever second_iter needs to fetch the value, it will ask first_iter to fetch and yield its “first” value. I put “first” in quotation marks because you’ll see later that this may or may not be the first value.

Finally, third_iter is pointing at the first item in second_iter. The same logic applies. When third_iter needs the first item, it will ask second_iter for its “first” item, and second_iter will need to ask first_iter for its “first” item. And first_iter is pointing at the first item in the list boring_numbers.

Are you with me? Let’s complicate things a bit…

Note how your code so far includes the following lines:

#5

None of the iterators has yielded any value. For now.

Let’s jumble things up and start by fetching the first value from second_iter:

#6

You ask for the next value in second_iter, which is the first one since you haven’t yielded any values yet.

As you’ve seen earlier, second_iter needs the first value from first_iter. So, behind the scenes, Python calls next(first_iter), which yields the first item from boring_numbers.

So, first_iter reads the first value from boring_numbers, which is the integer 1, and it yields it to second_iter, which then yields the transformed version to the REPL as the return value of next(second_iter). That’s why the output is the float 1.0. The first iterator, first_iter, now moves to point at the second item in boring_numbers, ready for when it’s needed.

Note that boring_numbers doesn’t change in this process. The first item in boring_numbers remains there. It doesn’t disappear.

So far, so good?

Continue in the same REPL session and try the following:

#7

You ask third_iter to give you its “next” value. You haven’t used third_iter anywhere so far. So, you might expect it to yield the “first” value.

And it does.

But its interpretation of what’s the “first” item may be different to what you expect.

Let’s follow the data. When you call next(third_iter), the third iterator asks second_iter for its next item. The second iterator, second_iter, relies on first_iter, so it asks first_iter for its next item. And first_iter, as you may recall, is currently pointing at the second item in boring_numbers, which is the integer 2.

So:

  1. The first iterator first_iter gets the integer 2 from boring_numbers and yields it to second_iter. And first_iter now points at the third item in boring_numbers.

  2. Then, second_iter transforms this value into a float and yields 2.0 to third_iter.

  3. Finally, third_iter adds 0.5 to this value and yields 2.5, which is what you see displayed in the REPL.

When you called next(second_iter) earlier in the code, you used up the first item in second_iter, which in turn used up the first item in first_iter. Since this first value is gone and since third_iter depends on the data yielded by second_iter and first_iter, the earlier call to next(second_iter) also affected the iterator that’s downstream, third_iter.
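One way to watch the data move, rather than follow it in your head, is to wrap each stage in a small generator that records every value passing through it. This is only a sketch: traced() is a helper invented for this illustration, and the chain is rebuilt from scratch under new names so it doesn't touch the iterators in your REPL session:

```python
def traced(name, iterable, log):
    """Yield each item from iterable, recording it in log as it passes."""
    for item in iterable:
        log.append(f"{name} yields {item!r}")
        yield item


log = []
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

first = traced("first", iter(numbers), log)
second = traced("second", (float(n) for n in first), log)
third = traced("third", (n + 0.5 for n in second), log)

next(second)  # pulls one value through first and second
next(third)   # pulls the *next* value through all three stages

for line in log:
    print(line)
# first yields 1
# second yields 1.0
# first yields 2
# second yields 2.0
# third yields 2.5
```

The log shows that the integer 1 never reaches the third stage. It was used up by the earlier call to next(second).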

What will happen if you call next(first_iter) now? Try to follow the data in your head before trying it out or reading on.

.

.

Have you worked it out?

.

.

Let’s run the code:

#8

Although it’s the first time you explicitly use first_iter in your code, you already used two of its values when your code yielded values from iterators downstream. Therefore, the next item in first_iter is the third item in boring_numbers, the integer 3.

Let’s finish with one more expression, still running in the same REPL session:

#9

You call next(third_iter), which asks second_iter for its next item. And second_iter asks first_iter for its next item. At this stage in the process, first_iter is pointing at the fourth item in the original source of data, which is the list boring_numbers. That’s why the output is 4.5.

Independent Iterators

Consider the following code, which is similar to the one you wrote above but has one extra line:

#10

The iterators first_iter and another_first_iter both use the same source of data, boring_numbers. However, they are independent iterators. Note that when you use up some of the elements in first_iter, the independent another_first_iter is not affected. The first time you ask for the first item in another_first_iter, you get the integer 1.

Final Words

Iterators don’t contain data. They rely on data that’s stored elsewhere. But you can have a chain of iterators, each asking the previous one to yield a value. Weird things can happen if you’re not careful. But now you know how to follow the data when you have a chain of iterators.

As a rule of thumb, if you create an iterator that depends on another iterator, you should only use the final iterator to avoid these issues. So, in the example above, you should only yield values from third_iter.
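One possible way to follow this rule of thumb is to build the chain inside a function and return only the final iterator, so the intermediate ones are out of reach. The function name build_chain is made up for this sketch:

```python
def build_chain(numbers):
    """Return only the last iterator in the chain."""
    first_iter = iter(numbers)
    second_iter = (float(number) for number in first_iter)
    return (num + 0.5 for num in second_iter)


boring_numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
result = list(build_chain(boring_numbers))

print(result[:3])   # [1.5, 2.5, 3.5]
print(len(result))  # 10
```

Since nothing outside the function can call next() on the first two iterators, every item in the list makes it all the way through the chain.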

Have a play with this example and make your own chains of iterators, too. And once you’re comfortable with this, get ready to be confused again with my next article, which will discuss itertools.tee()!

And next time you pass by someone in the street offering to let you play the three-cups-and-ball game, don’t feel overconfident because of your iterator knowledge – it won’t help you find the ball.

Code in this article uses Python 3.14

The code images used in this article are created using Snappify. [Affiliate link]



For more Python resources, you can also visit Real Python—you may even stumble on one of my own articles or courses there!

Also, are you interested in technical writing? You’d like to make your own writing more narrative, more engaging, more memorable? Have a look at Breaking the Rules.

And you can find out more about me at stephengruppetta.com



Appendix: Code Blocks

Code Block #1
boring_numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Code Block #2
# ...
first_iter = iter(boring_numbers)
Code Block #3
# ...
second_iter = (float(number) for number in first_iter)
Code Block #4
# ...
third_iter = (num + 0.5 for num in second_iter)
Code Block #5
boring_numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
first_iter = iter(boring_numbers)
second_iter = (float(number) for number in first_iter)
third_iter = (num + 0.5 for num in second_iter)
Code Block #6
# ...
next(second_iter)
# 1.0
Code Block #7
# ...
next(third_iter)
# 2.5
Code Block #8
# ...
next(first_iter)
# 3
Code Block #9
# ...
next(third_iter)
# 4.5
Code Block #10
boring_numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
first_iter = iter(boring_numbers)
another_first_iter = iter(boring_numbers)
second_iter = (float(number) for number in first_iter)
third_iter = (num + 0.5 for num in second_iter)
next(second_iter)
# 1.0
next(third_iter)
# 2.5
next(first_iter)
# 3
next(third_iter)
# 4.5
next(another_first_iter)
# 1


June 04, 2026 12:50 PM UTC


Real Python

Quiz: How to Read User Input From the Keyboard in Python

In this quiz, you’ll test your understanding of How to Read User Input From the Keyboard in Python.

By working through this quiz, you’ll revisit the input() function, type conversion, error handling with try and except, the getpass module for hidden input, and the PyInputPlus library for automatic validation.



June 04, 2026 12:00 PM UTC


Python Software Foundation

PSF Strategic Plan 2026 Draft: Open for Community Feedback

In May, we shared the high-level goals of the Python Software Foundation's (PSF) strategic plan and asked for your commentary. Today we are publishing the full draft and opening a three-week community feedback window.

We welcome you to review the full PSF Strategic Plan Community Draft 2026 document, also embedded below. 

The feedback window closes on June 25, 2026, End Of Day, Anywhere on Earth. The PSF Board will carefully review all input, use it to refine the final version of the strategic plan, and aims to hold a vote to adopt it in a future board meeting.

What's in the full draft

The earlier blog post covered the six organizational goals and four program goals at a high level. The full draft goes deeper: each program goal includes specific strategic objectives, and the organizational goals include tactical ideas the board developed during the planning process. These tactical ideas are starting points for strategic discussion, not commitments.

This is the first post in a short series. Individual board members will share posts that go into specific parts of the plan in more depth. We want the plan to speak for itself, so these posts will draw directly from the document rather than rewriting it.

What we heard at PyCon US

At PyCon US 2026, the PSF Board held its on-site board meeting, with a portion of that time dedicated to strategy. We also discussed the strategic plan at the Members Lunch, a dedicated Open Space session, and in conversations throughout the conference.

The topic of financial sustainability came up repeatedly, and we hear you. The community is waiting for updated financial information, and typically the Members Lunch at PyCon US is where those details are shared. Staffing changes in our accounting functions made that impossible this year. Publishing the full picture is a priority, and we will share an update as soon as we can. The high-level view is that the PSF is stable for now, but we cannot continue on the current path without making meaningful changes. The strategic plan and the PSF's financial outlook are connected, and we understand that context matters. We are committed to being transparent about both.

We also noticed that conversations naturally moved toward implementation ("How will you do this?"). For this feedback round, we are asking you to focus on the direction itself. Are these the right goals? Are the objectives the right ones? Is anything important missing? Implementation will be shaped by PSF staff over time, and there will be opportunities to weigh in on that, too.

How to give feedback

The feedback window closes on June 25th. After that, the board will review all feedback received and decide what changes to make to the strategy document in response. 

Thank you for your time. We’re working on this strategic plan because the Python community deserves a PSF that's deliberate about where it's headed. Your input makes that possible, and we’re grateful for your help.

Jannis Leidel, PSF Board Chair, on behalf of the PSF Board of Directors

June 04, 2026 09:38 AM UTC


Adarsh Divakaran

Building AI Agents in Python

2026 is shaping up to be a big year for AI agents. We are seeing more products where the AI not only answers a question but also does some work for the user.

You have probably used ChatGPT or a similar AI tool to answer a question, help with writing, or explain some code. You type something, the AI responds, and the conversation goes back and forth. That is powerful, but it is also limited. The AI is essentially stuck in a chat box - it can only talk to you; it cannot do anything on your behalf.

AI agents change that. An agent is an AI that can actually take actions - browse the web, read and write files, run code, call APIs, and more. It does not just answer your question; it works toward a goal, step by step, using whatever tools it needs. Tools like Lovable, Cursor, and Claude Code are examples of this in practice.

In this article, we will explore the concepts behind building an AI agent in Python. We will use the OpenAI Python SDK (Responses API) for the examples, but the same ideas can be generalized to any other LLM SDK. We will use a low-level SDK with minimal abstractions so we can observe and implement most of the agent’s behavior on our end.

TL;DR

This tutorial explains how AI agents work by building a simple one in Python.

We will cover the core pieces: LLMs, prompts, context, memory, the agent loop, tools, MCP, and skills:

| Component | What it does |
| --- | --- |
| LLM | Acts as the reasoning engine that understands the user request and decides what to do next. |
| System prompt | Defines the agent’s role, behavior, boundaries, and response style. |
| Context window | Controls how much information the model can see at once, including prompts, history, tool results, and files. |
| Memory | Helps the agent remember useful information across steps or conversations. |
| Agent loop | Repeats the process of thinking, acting, observing results, and deciding the next step. |
| Tool calling | Lets the agent use external functions such as APIs, web search, file access, or code execution. |
| MCP | Provides a standard way to connect agents to reusable tools and data sources. |
| Skills | Package reusable instructions, workflows, examples, and scripts for specific tasks. |

What are Agents?

An AI agent is an AI system that can autonomously plan and execute multi-step actions toward a goal.

To understand agents, it helps to first understand what is powering them under the hood - a large language model, or LLM. For example, ChatGPT is a product built on top of OpenAI GPT LLMs. When you type a message and get a response, an LLM is doing the heavy lifting. It takes text as input and generates text as output.

On their own, LLMs are impressive but limited. They can only respond with text. They cannot open your browser, read a file on your computer, or send an email. They also do not know what happened yesterday, because their knowledge comes from training data with a cutoff date, not a live connection to the world.

Agents fix this by giving LLMs access to tools. A tool is just a function your code exposes to the model - something like “search the web” or “read this file.” The model can decide to call a tool when it needs to, and your code actually runs it. This turns a passive text generator into something that can act.

A good way to see the difference is to compare using ChatGPT with using Claude Code for a coding task. With ChatGPT, you describe the problem, copy the suggested code, paste it into your editor, run it, copy the error back, and repeat. The model has no idea what is actually in your project. Claude Code is different - it is powered by an LLM but also has access to tools like bash and file reading. You describe what you want, and it reads your files, writes code, runs tests, and fixes errors on its own. You just watch and steer.

The simplest way to understand an agent is:

  1. The user gives a goal.
  2. The model decides what step to take.
  3. The agent runs that step using a tool.
  4. The model looks at the result.
  5. The process continues until the task is complete.

This is different from a normal chatbot. A chatbot mainly responds. An agent can respond and act.

In a simple agent, the model may only call one tool and return the result. In a more capable agent, the model may make a plan, call multiple tools, observe the results, adjust the plan, and continue until the task is complete.

Before we build this kind of system, we need to choose the model that will drive it.

LLMs

LLMs are trained on massive amounts of text data - entire open source repositories on GitHub, books, articles, websites, and more. Through training, the model learns patterns in language well enough to generate coherent, useful responses. The scale of this training is what makes them surprisingly capable across such a wide range of tasks.

At their core, LLMs are text-in, text-out systems. You send them a block of text (called a prompt), and they generate a response. Everything that happens - reasoning, answering questions, writing code, making decisions - is expressed through that text interface. When an agent calls a tool, it is really the model writing out a structured text request, and your code intercepts that and actually runs the function.

The key limitation to keep in mind: LLMs only know what they were trained on. They have no awareness of events after their training cutoff and no way to look things up in real time unless they are given a tool to do so. This is part of what makes tools so valuable - they extend the model’s reach into the real world.

Choosing an LLM

For an AI agent, the LLM is its brain. The quality of the model affects how well the agent understands instructions, chooses tools, handles errors, and completes multi-step tasks.

At the same time, the most powerful model is not always the right choice. We also need to think about cost, speed, context window, reasoning ability, and where the model is hosted.

Benchmarks

Benchmarks are standardized tests used to compare the performance of different models. For coding tasks, there is SWE-bench. For general reasoning, there is MMLU. Each benchmark tests the model on a specific type of problem and gives it a score. A higher score generally means the model will perform better on that type of task.

Benchmarks are a useful starting point when choosing a model, but they are not the whole story. A model that scores well on a benchmark may still behave unexpectedly in your specific use case, so it is always worth testing with your actual workload.

Costs

Choosing the best-scoring model from a benchmark may not always be the most intelligent decision.

Cost is a real factor, especially at scale. Most providers charge per token, which is the basic unit of text the model processes. A token is roughly four characters, or about three-quarters of a word on average. Both what you send to the model (input) and what it generates back (output) count toward your token usage.

For an agent that runs multiple steps in a loop, token usage adds up quickly. A good approach is to start with a capable model and then see if a smaller or cheaper one can do the same job well enough. Sometimes a smaller model handles simple tasks just fine.
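To get a feel for how quickly token usage adds up in a loop, here is a back-of-the-envelope sketch. The prices and the step sizes are hypothetical numbers chosen for illustration, not real provider pricing, and the token estimate uses the rough four-characters-per-token rule mentioned above. It also assumes each step resends the whole, growing context:

```python
def estimate_tokens(text):
    """Rough estimate: about four characters per token."""
    return max(1, len(text) // 4)


def estimate_cost(input_tokens, output_tokens, input_price, output_price):
    """Cost in dollars, with prices given per million tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical prices in dollars per million tokens, for illustration only
INPUT_PRICE = 1.00
OUTPUT_PRICE = 4.00

# Hypothetical agent run: every step resends the whole, growing context
steps = 10
context_growth_per_step = 2_000  # tokens added to the context at each step
output_per_step = 500            # tokens generated at each step

total_input = sum(context_growth_per_step * step for step in range(1, steps + 1))
total_output = output_per_step * steps

print(estimate_tokens("Tell me about Taj Mahal in 1 sentence"))  # 9
print(total_input)   # 110000
print(total_output)  # 5000
print(estimate_cost(total_input, total_output, INPUT_PRICE, OUTPUT_PRICE))  # 0.13
```

Under these assumptions, the context only grows by 20,000 tokens over the run, yet the agent is billed for 110,000 input tokens because earlier steps are sent again each time.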

(Model costs table from https://github.com/simonw/llm-prices)

Reasoning Level

Some models are designed to think before they answer. These reasoning models break complex problems into smaller steps internally, often called reasoning traces. You can think of it as the model working through a scratchpad before writing its final response. This can improve performance for tasks that need planning, debugging, tool use, or careful decision-making.

More reasoning effort usually means higher cost, higher response time, and better accuracy for complex tasks.

Not every request needs high reasoning. If the task is simple, we can use a lower reasoning level or a cheaper model. If the task involves multiple steps, unknown errors, or important decisions, more reasoning can be useful.

(Conversation with GPT-OSS LLM showing reasoning/thought traces)

Hosted vs Local

Most people start with a hosted model - one that runs on a provider’s servers and is accessed via an API. These are easy to set up, well-maintained, and generally the most capable options available. The trade-off is that you pay per token, and your data is processed by a third party.

There are also open models that can run entirely on your own machine/server. They can avoid per-token API costs and give you more control over data. The downside is that they require capable hardware and are generally less powerful than the best hosted models today. That said, local models are getting better quickly. Previous generation frontier capabilities are being replicated in the next generation of local models, and this gap will continue to close. Examples of open-weight models that can be self-hosted, depending on hardware and quantization, include Gemma 4 series and Kimi K2.6.

There are already decent local coding models that people use for simple code generation and verification. In the coming years, this will improve, and stronger models will become available on consumer devices.

Hosted models are still easier to use for many applications. They usually provide better quality, higher reliability, larger context windows, and managed infrastructure.

Local models give more control over data, cost, and deployment. But they also require more setup, hardware, monitoring, and optimization.

Configuring the LLM

Once you have picked a model, there are two things you set up before the agent starts running: the system prompt and the context window.

System Prompt

A system prompt is the model’s top-level instruction that guides its behavior during a conversation.

It can set rules for the agent’s role, behavior, boundaries, and response style.

For an agent, the system prompt is very important. It tells the model how to behave while using tools. It can also define boundaries, such as asking for permission before destructive actions or avoiding actions outside the user’s request.

Let’s see an example of this in practice:

import os

if __name__ == '__main__':
    from openai import OpenAI

    client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

    response = client.responses.create(
        model="gpt-5.4-mini",
        input=[
            {
                "role": "system",
                "content": "You are a friendly Python tutor. Refuse all requests unrelated to Python coding",
            },
            {
                "role": "user",
                "content": input("Enter your Python question: "),
            },
        ],
    )
    print(response.output_text)

In the above script, we initialize an OpenAI client and use client.responses.create to send a message to the gpt-5.4-mini model. The system prompt is the first entry in the input list; "role": "system" designates it as the system prompt. Here, the model is instructed to act as a Python tutor and refuse requests unrelated to Python. As the next entry, we accept the user prompt via input() and pass it to the LLM.

If the script is run and an unrelated query is passed to the LLM, we get a refusal similar to the one below:

Enter your Python question: How many states are there in the US?

Model response: I’m here to help with Python coding questions only. If you have a Python-related question, feel free to ask!

Even though the underlying large language model knows the answer to the user’s query, it refuses to answer, as directed by the system prompt.

Context Window

The context window is the model’s working memory. It is the amount of information the model can see in one request.

The context can include the user message, conversation history, system prompt, tool results, files, documentation, and any other information we provide.

Most of the latest flagship models support up to 1M tokens, which is roughly 750,000 words or about 15 books. Older models such as GPT-4 Turbo had a 128K-token window, around 2 books’ worth. For agents that run long tasks or work with large documents, context window size matters a lot. When the context fills up, older information has to be dropped or summarized, which can cause the agent to lose track of earlier steps in a long task.

A larger context window is useful, but it is not free. More context usually means more cost and slower responses. Also, just because a model can accept a lot of context does not mean every token is equally important.

Good agents manage context carefully. They include what is needed, summarize old information, and avoid filling the context with unnecessary data.
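As a rough illustration of managing context, the sketch below keeps only the most recent messages that fit within a token budget. The function names and the budget are made up for this example, and the token count is the rough four-characters-per-token estimate rather than a real tokenizer:

```python
def estimate_tokens(text):
    """Rough estimate: about four characters per token."""
    return max(1, len(text) // 4)


def trim_history(history, max_tokens):
    """Keep the most recent messages whose combined size fits max_tokens."""
    kept = []
    used = 0
    # Walk backwards so the newest messages are considered first
    for message in reversed(history):
        size = estimate_tokens(message["content"])
        if used + size > max_tokens:
            break
        kept.append(message)
        used += size
    kept.reverse()  # restore chronological order
    return kept


history = [
    {"role": "user", "content": "a" * 400},       # about 100 tokens
    {"role": "assistant", "content": "b" * 400},  # about 100 tokens
    {"role": "user", "content": "c" * 200},       # about 50 tokens
]

trimmed = trim_history(history, max_tokens=160)
print([message["role"] for message in trimmed])  # ['assistant', 'user']
```

Dropping old messages is the bluntest option. An agent could instead replace them with a short summary, and it would normally keep the system prompt regardless of the budget.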

Once we understand the model and its context window, the next question is what the agent should remember across steps and conversations.

Memory

Memory helps an agent remember useful information.

Short-term memory helps the agent remember what the user said earlier in the same conversation. This usually lives inside the context window.

Let’s consider an example. The snippet below accepts a user query inside a loop and sends it to a model to get the response:

import os

if __name__ == '__main__':
    from openai import OpenAI

    client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

    while True:
        user_query = input("You: ")

        if user_query.lower() in ["exit", "quit"]:
            break

        response = client.responses.create(
            model="gpt-5.4-mini",
            input=user_query,
        )

        assistant_reply = response.output_text
        print(f"Model: {assistant_reply}")

The code works, but there are issues:

You: Tell me about Taj Mahal in 1 sentence
Model: The Taj Mahal is a magnificent white marble mausoleum in Agra, India, built by Emperor Shah Jahan in memory of his wife Mumtaz Mahal, and is one of the world’s most famous symbols of love.

You: When was it built?
Model: I can help, but I need to know **what “it” refers to**.  
Please share the name, photo, or location of the building/structure/object, and I’ll tell you when it was built.

As the transcript shows, the model fails to answer the user’s follow-up prompt. This is because we did not implement short-term memory. For the model to respond to follow-ups properly, we need to store the conversation history and pass it to every LLM call. The snippet below improves on the above script with a short-term memory implementation:

import os

if __name__ == '__main__':
    from openai import OpenAI

    client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

    conversation_history = []

    while True:
        user_query = input("You: ")

        if user_query.lower() in ["exit", "quit"]:
            break

        conversation_history.append({
            "role": "user",
            "content": user_query,
        })

        response = client.responses.create(
            model="gpt-5.4-mini",
            input=conversation_history,
        )

        assistant_reply = response.output_text
        print(f"Model: {assistant_reply}")

        conversation_history.append({
            "role": "assistant",
            "content": assistant_reply,
        })

We introduced a conversation_history list that stores previous messages. User messages are appended to this list with "role": "user" and model responses are appended with "role": "assistant". This way, whenever a request is sent to the model, it gets the entire message history through the input argument and will be able to respond to follow-up prompts correctly.

You: Tell me about Taj Mahal in 1 sentence
Model: The Taj Mahal is a stunning white marble mausoleum in Agra, India, built by Emperor Shah Jahan in memory of his wife Mumtaz Mahal.

You: When was it built?
Model: It was built between 1632 and 1653.

Long-term memory stores information beyond one conversation and persists even after the current chat or task ends. This is useful when you want the agent to remember user preferences, past decisions, or domain-specific facts across sessions. Common approaches include RAG (retrieval-augmented generation), where relevant information is fetched from a database and added to the context as needed, and built-in memory systems like ChatGPT Memories, where key facts are stored and automatically recalled in future conversations.
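To make the idea of long-term memory concrete, here is a toy sketch that saves facts to a JSON file and recalls the ones that share words with a new query. Everything in it is an assumption made for illustration: the function names, the file layout, and especially the word-overlap matching, which is a crude stand-in for the retrieval step in RAG:

```python
import json
import tempfile
from pathlib import Path


def save_memories(path, memories):
    """Persist a list of facts so a later session can load them."""
    path.write_text(json.dumps(memories))


def load_memories(path):
    """Load stored facts, or an empty list if nothing was saved yet."""
    if not path.exists():
        return []
    return json.loads(path.read_text())


def recall(memories, query, limit=2):
    """Return the stored facts that share the most words with the query."""
    query_words = set(query.lower().split())
    scored = []
    for fact in memories:
        overlap = len(query_words & set(fact.lower().split()))
        if overlap:
            scored.append((overlap, fact))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [fact for _, fact in scored[:limit]]


# First session: store what was learned about the user
memory_file = Path(tempfile.mkdtemp()) / "memories.json"
save_memories(memory_file, [
    "The user prefers metric units",
    "The user is learning Python iterators",
])

# Later session: load the facts and add the relevant ones to the context
memories = load_memories(memory_file)
relevant = recall(memories, "Which units should I use?")
print(relevant)  # ['The user prefers metric units']
```

The recalled facts would then be placed in the context, for example as part of the system prompt, before the request is sent to the model.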

Agent Loop

The agent loop is the core flow of an agent.

A simple loop looks like this:

  1. User sends a message.
  2. Agent adds the message to the conversation context.
  3. Agent sends the context and system prompt to the LLM.
  4. LLM decides what to do next.
  5. If needed, the LLM calls a tool.
  6. Agent runs the tool and sends the result back to the LLM.
  7. LLM decides whether more steps are needed.
  8. When done, the LLM generates the final response.
  9. Agent sends the response to the user.

This loop is what makes agents feel different from normal chatbots. A chatbot usually gives one response. An agent can act, observe, and continue.
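The steps above can be sketched in a few lines of Python. This is a simplified illustration, not the OpenAI API: fake_llm() is a stand-in for a real model call so the sketch runs offline, and the message shapes, the get_weather() tool, and the max_steps limit are all assumptions.

```python
def fake_llm(context):
    """Stand-in for a real model call (an assumption so the sketch runs offline).

    It requests one tool call, then answers once a tool result is in the context.
    """
    if any(message["role"] == "tool" for message in context):
        return {"type": "final", "content": "It is sunny in Goa, so pack sunscreen."}
    return {"type": "tool_call", "name": "get_weather", "arguments": {"location": "Goa"}}


def get_weather(location):
    """Hypothetical tool returning canned data."""
    return {"location": location, "condition": "Sunny"}


TOOLS = {"get_weather": get_weather}


def run_agent(user_message, llm=fake_llm, max_steps=5):
    """Run the loop: call the model, execute requested tools, repeat until done."""
    context = [{"role": "user", "content": user_message}]  # steps 1-2
    for _ in range(max_steps):  # guard against endless loops
        decision = llm(context)  # steps 3-4
        if decision["type"] == "final":  # step 8
            return decision["content"]  # step 9
        tool = TOOLS[decision["name"]]  # step 5
        result = tool(**decision["arguments"])  # step 6
        # The tool result goes back into the context for step 7.
        context.append({"role": "tool", "content": str(result)})
    return "Stopped: step limit reached."


print(run_agent("Sunscreen needed in Goa?"))
```

The step limit is a design choice for this sketch: it keeps a confused model from looping forever.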

In practice, the intermediate steps are where the interesting work happens. The model may call a tool, wait for the result, process that result, decide to call another tool, and keep going before it gives a final answer. The loop runs as many times as needed until the model decides the task is complete or the user stops it. This brings us to tools - what they are and how they actually work.

Tool Calling

Tools are external capabilities that the agent can use.

Tools (also called functions) let an AI agent do things beyond generating text. They can be used to take actions or get information.

Examples of tools:

The agent chooses a tool when needed. The tool has a name, a description, and input parameters. The model decides which tool to call and what arguments to pass.

Tool descriptions are important. If a tool description is unclear, the model may call it at the wrong time or pass the wrong input. We should describe tools in simple language and make their inputs strict.

Here is an important detail: the model does not run the tool itself. When it decides to use a tool, it outputs a structured request with the tool name and the arguments it wants to pass. Your code intercepts this, runs the actual function, and passes the result back to the model. The model then reads the result and decides what to do next. This back-and-forth between the model and your code is what makes the agent loop so powerful.
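One common way to organize the "your code intercepts this" step is a registry that maps tool names to Python functions. The sketch below assumes the model's request has already been reduced to a tool name and a JSON string of arguments; TOOL_REGISTRY and dispatch_tool_call() are hypothetical names, and get_weather() returns canned data.

```python
import json


def get_weather(location):
    """Hypothetical local tool; returns canned data for illustration."""
    return {"location": location, "condition": "Sunny"}


# Map tool names the model may request to the Python functions that implement them.
TOOL_REGISTRY = {"get_weather": get_weather}


def dispatch_tool_call(name, arguments_json):
    """Run the requested tool and return its result as a JSON string."""
    if name not in TOOL_REGISTRY:
        return json.dumps({"error": f"Unknown tool: {name}"})
    arguments = json.loads(arguments_json)  # the model sends arguments as JSON text
    result = TOOL_REGISTRY[name](**arguments)
    return json.dumps(result)


print(dispatch_tool_call("get_weather", '{"location": "Goa"}'))
```

Returning an error as data for unknown tools, rather than raising, lets the model see what went wrong and choose a different action.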

Let’s see an example of tool calling in action:

import json
import os
from dotenv import load_dotenv

load_dotenv()


def get_weather(location):
    return {
        "location": location,
        "temperature": "24 C",
        "condition": "Sunny",
        "humidity": "52%",
        "wind": "11 km/h",
    }


if __name__ == '__main__':
    from openai import OpenAI

    client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

    tools = [
        {
            "type": "function",
            "name": "get_weather",
            "description": "Get the current weather for a destination.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city or destination, e.g. Paris or Tokyo",
                    }
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        }
    ]

    input_list = [
        {
            "role": "system",
            "content": "You are Safar, a travel planning AI agent",
        },
        {
            "role": "user",
            "content": input("Ask you travel questions: "),
        },
    ]

    response = client.responses.create(
        model="gpt-5.4-mini",
        input=input_list,
        tools=tools,
        tool_choice="required",
    )

    print("The model responded with:")
    print(response.output)

    input_list += response.output

    for item in response.output:
        if item.type != "function_call":
            continue

        if item.name == "get_weather":
            args = json.loads(item.arguments)
            print(f"The model wants to call get_weather with: {args}")

            weather = get_weather(args["location"])
            print(f"The local Python function returned: {weather}")

            input_list.append({
                "type": "function_call_output",
                "call_id": item.call_id,
                "output": json.dumps(weather),
            })

    print("Sending the tool result back to the model")
    final_response = client.responses.create(
        model="gpt-5.4-mini",
        input=input_list,
        tools=tools,
    )

    print("Final answer:")
    print(f"Model response: {final_response.output_text}")

In the tools list, we have defined a function named get_weather according to the OpenAI function calling guidelines and have specified the parameters that the function accepts using the parameters key. This definition follows the JSON Schema specification.

Since we pass this tools list when making calls to OpenAI, the model knows that it has access to a weather tool and can request a function call when needed.

In the script, when we receive a response from the model, we check whether each output item is a function call (item.type != "function_call" skips everything else). If the item is a request to call the get_weather tool, we call the get_weather() Python function and send its result back to the model:

weather = get_weather(args["location"])
input_list.append({
    "type": "function_call_output",
    "call_id": item.call_id,
    "output": json.dumps(weather),
})

Let’s run the script and ask the agent a question that would require a weather tool call:

Ask you travel questions: Sunscreen needed in Goa?

The model responded with:
[ResponseFunctionToolCall(arguments='{"location":"Goa"}', call_id='call_X9OBZhGwT3yhfmTAOclefWE8', name='get_weather', type='function_call', id='fc_05ba95ec38f46f7f006a17ce9e3bb0819a9a0b430001f7bd91', namespace=None, status='completed')]

The model wants to call get_weather with: {'location': 'Goa'}
The local Python function returned: {'location': 'Goa', 'temperature': '24 C', 'condition': 'Sunny', 'humidity': '52%', 'wind': '11 km/h'}

Sending the tool result back to the model

Final answer:
Model response: Yes — sunscreen is a good idea in Goa. It’s sunny there right now, so UV exposure can be strong even if it feels pleasant.

Quick tips:
- Use broad-spectrum SPF 30+ (SPF 50 if you’ll be at the beach a lot)
- Reapply every 2 hours, and after swimming/sweating
- Don’t forget ears, neck, hands, and feet
- A hat and sunglasses help too

If you want, I can also suggest a Goa beach-day packing list.

For our query, the model initially responds with a ResponseFunctionToolCall item. This asks for our get_weather function to be called with the location argument set to Goa.

In response, our script executes the function and sends the result back to the model to get the final response. The function always returns a temperature of 24 degrees Celsius and a sunny condition. Trusting this data, the model produces its final response, advising the user to wear sunscreen.

The weather function in the above script is not very useful: it returns hardcoded weather data for every request. In a practical scenario, the function would call a real weather API to fetch data.

The above script illustrates the concept of an agent loop. Even though the example involves just one user request and model response, the agent takes intermediary steps (tool calls) before returning the final response.

Now let’s move to a real world example involving tools. We will provide web search capability to our agent by defining a custom SerpApi web search tool.

Providers usually have their own built-in tools for web search. However, these tools can be slow or unreliable at times. To get live search data from search engines reliably, we can write a custom tool/function using the SerpApi Python SDK.

import json
import os


def google_search(query):
    import serpapi

    client = serpapi.Client(api_key=os.environ["SERPAPI_KEY"])

    results = client.search({
        "engine": "google",
        "q": query,
    })

    return [
        {
            "title": result.get("title"),
            "link": result.get("link"),
            "snippet": result.get("snippet"),
        }
        for result in results.get("organic_results", [])[:5]
    ]


if __name__ == '__main__':
    from openai import OpenAI

    client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

    tools = [
        {
            "type": "function",
            "name": "google_search",
            "description": "Search Google with SerpApi and return web search results.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "The Google search query to run",
                    }
                },
                "required": ["query"],
                "additionalProperties": False,
            },
            "strict": True,
        }
    ]

    input_list = [
        {
            "role": "system",
            "content": "You are Safar, a travel planner. Use Google search when current destination information would improve your answer.",
        },
        {
            "role": "user",
            "content": input("What travel question should I research? "),
        },
    ]

    response = client.responses.create(
        model="gpt-5.4-mini",
        input=input_list,
        tools=tools,
        tool_choice="required",
    )

    print("The model responded with:")
    print(response.output)

    input_list += response.output

    for item in response.output:
        if item.type != "function_call":
            continue

        if item.name == "google_search":
            args = json.loads(item.arguments)
            print(f"The model wants to call google_search with: {args}")

            search_results = google_search(args["query"])
            print(f"Step 7: SerpApi returned {len(search_results)} search results")

            input_list.append({
                "type": "function_call_output",
                "call_id": item.call_id,
                "output": json.dumps(search_results),
            })

    final_response = client.responses.create(
        model="gpt-5.4-mini",
        input=input_list,
        tools=tools,
    )

    print(f"Model response: {final_response.output_text}")

Here, we define a google_search() function that accepts a query and runs a Google search for it using the SerpApi Python SDK. The function returns the first five organic results, each reduced to its title, link, and snippet.

Let’s see the results in action:

What travel question should I research? When is the Tomato festival - La Tomatina happening this year?

The model responded with:
[ResponseFunctionToolCall(arguments='{"query":"La Tomatina 2026 date official"}', call_id='call_mk2KL4xnvR0mexyt2lXFTHgE', name='google_search', type='function_call', id='fc_01af4c5fc07e8479006a192316ab20819bb10273439c89fb9a', namespace=None, status='completed')]

The model wants to call google_search with: {'query': 'La Tomatina 2026 date official'}
Step 7: SerpApi returned 5 search results

Model response: La Tomatina is happening on **Wednesday, August 26, 2026** in **Buñol, Spain**.

If you want, I can also help with:
- tickets
- how to get there from Valencia
- where to stay nearby

This is the core idea behind tool calling. The model does not directly browse the web or fetch data by itself. Instead, it identifies when a tool is needed, asks for that tool to be called, and then uses the returned result to continue the conversation. This separation is useful because the model can focus on reasoning, while tools provide access to external systems and real-time information.

Without the google_search tool, the model would not be able to answer questions that require live data. It would likely respond with something like: “I don’t have access to real-time information.” By defining the tool, we give the model a safe and structured way to request the information it needs.

MCP

As you build more agents with more tools, a new problem emerges: every tool integration is custom-built and cannot easily be reused elsewhere. If you build a GitHub integration for one agent, you would have to rebuild it from scratch for another. That is where MCP comes in.

Model Context Protocol (MCP) is like USB-C for AI integrations. It is a standard protocol that lets models connect to external tools and data sources in a consistent, reusable way. Instead of building a custom integration for every tool, you write an MCP server once, and any model that supports MCP can use it.

Examples include:

With MCP, the model can discover supported functionality and call tools when needed. This makes integrations reusable across different models, clients, and applications. For a small agent, normal tool calling may be enough. For larger systems with many integrations, MCP can make the architecture cleaner.
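Under the hood, MCP messages are JSON-RPC 2.0 requests: a client discovers tools with the tools/list method and invokes one with tools/call. The sketch below only builds those two messages and does not open a connection or perform the initialization handshake; the tool name "search" and its arguments are illustrative assumptions, and jsonrpc_request() is a hypothetical helper.

```python
import itertools
import json

_request_ids = itertools.count(1)


def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request, the message format MCP is based on."""
    message = {"jsonrpc": "2.0", "id": next(_request_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


# Discovery: ask the server which tools it offers.
list_tools = jsonrpc_request("tools/list")

# Invocation: call one of the discovered tools by name.
# The tool name "search" and its arguments are illustrative assumptions.
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "search", "arguments": {"q": "Moto Razr Ultra", "engine": "google_shopping"}},
)

print(json.dumps(list_tools))
print(json.dumps(call_tool))
```

Because every MCP server answers the same two methods, a client written once can work with any server, which is where the reusability comes from.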

Let’s see an example of MCP usage in practice. The script below uses the SerpApi MCP server - using this, the agent will be able to call all the SerpApi supported engines like google, google_shopping, amazon, etc.

import os

if __name__ == '__main__':
    from openai import OpenAI

    client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

    serpapi_mcp_url = f"https://mcp.serpapi.com/{os.environ['SERPAPI_KEY']}/mcp"

    response = client.responses.create(
        model="gpt-5.4",
        tools=[
            {
                "type": "mcp",
                "server_label": "serpapi",
                "server_description": "SerpApi MCP server",
                "server_url": serpapi_mcp_url,
                "require_approval": "never",
            }
        ],
        input=[
            {
                "role": "system",
                "content": "You are Cartwise, a shopping assistant. Help users compare products, prices, reviews, and buying options.",
            },
            {
                "role": "user",
                "content": input("What do you want to shop for? "),
            },
        ],
    )

    print("Full model response (includes MCP operations): ")
    print(response.output)

    print(f"Model response: {response.output_text}")

SerpApi exposes the MCP server via the URL https://mcp.serpapi.com/. Users can supply the API Key via the URL path as seen in the example: https://mcp.serpapi.com/{os.environ['SERPAPI_KEY']}/mcp.

The code here is simpler than the tool calling example. We just need to provide the MCP server info via the tools argument:

tools=[
    {
        "type": "mcp",
        "server_label": "serpapi",
        "server_description": "SerpApi MCP server",
        "server_url": serpapi_mcp_url,
        "require_approval": "never",
    }
]

From this definition alone, the model can discover supported MCP functionalities and it will be able to autonomously call the MCP server tools based on the user request.

Let’s ask the agent a shopping query. Here, I am asking it to find the price of a mobile device:

What do you want to shop for? Find best price for Moto Razr Ultra phone

Full model response (includes MCP operations): 
[ 
  McpListTools(id='mcpl_06176a7178fb9f5c006a17f6c23578819ab2c977e7bc2b0bc7', server_label='serpapi', ..., 

McpCall(id='mcp_06176a7178fb9f5c006a17f6c40930819aac6136e6a0f0ced8', arguments='{"params":{"q":"Moto Razr Ultra phone price","engine":"google_shopping","num":10},"mode":"compact"}', name='search', server_label='serpapi', type='mcp_call', approval_request_id=None, error=None, output='{"shopping_results": [{"position": 1, "title": "Motorola Razr Ultra 2025", "product_id": "14521999409488109662", "product_link": ...]}]}', status='completed'), 
  ResponseOutputMessage(id='msg_06176a7178fb9f5c006a17f6cc3e68819aa688012defa9cf78', content=[ResponseOutputText(annotations=[], text='Best price I found for a **new Moto Razr Ultra** is:...'
]

Model response: Best price I found for a **new Moto Razr Ultra** is:

- **$699.99 at Best Buy** — Motorola Razr Ultra 2025  
  - was **$1,300**
  - rating: **4.0/5** from **520 reviews**
  - free delivery by Sat

Also matching:
- **$699.99 at Motorola US** — Motorola Razr 2025
- **$764.00 at Etoren** — Motorola Razr 50 Ultra
- **$1,049.99+** for some Razr 60 Ultra / 2026 variants

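The model response includes a series of operations, visible in the output above:

  1. McpListTools: the tools available on the SerpApi MCP server are listed.
  2. McpCall: the model calls the server's search tool with the google_shopping engine.
  3. ResponseOutputMessage: the model writes the final answer from the returned shopping results.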

If we omitted the SerpApi MCP definition from the above script, the model would likely have responded with something like: “I cannot access real-time prices.” This is because the model itself does not have live data access unless we explicitly connect it to external tools or systems. MCP is one way to provide that connection in a standard way.

Now that we have seen how MCP connects agents to external capabilities, let’s look at another way to extend agent behavior: skills.

Skills

While tools handle actions, skills handle behavior. A skill is a reusable set of instructions or a workflow that tells an agent how to perform a specific type of task well.

We have seen tools and MCP, both of which are code-heavy: tools are code that the model asks us to call, whereas MCP requires a server implementation that follows the Model Context Protocol spec. Skills are simpler and can be just a plain-text markdown file.

A skill can include:

Skills are useful for repeated tasks. Examples include writing reports, analyzing PDFs, creating slides, debugging code, or handling customer support. Skills make agents more specialized.

Instead of putting every instruction into the system prompt, we can use skills where the model receives just the skill metadata in the context and will be able to load and use the full skill when the current task needs it.

A skill file is just a markdown file with the below format:

---
name: skill-name
description: A description of what this skill does and when to use it.
---
Skill contents in markdown
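To illustrate how the metadata can be separated from the skill body, here is a minimal sketch of a parser for this format. parse_skill() is a hypothetical helper that handles only simple single-line key: value pairs, not full YAML, so it is an illustration of the idea and not how any particular SDK loads skills.

```python
def parse_skill(text):
    """Split a skill file into front-matter metadata and the markdown body.

    Handles only simple single-line `key: value` pairs, not full YAML.
    """
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("Skill file must start with a '---' line")
    try:
        end = lines.index("---", 1)  # closing delimiter of the front matter
    except ValueError:
        raise ValueError("Front matter is missing its closing '---' line") from None
    metadata = {}
    for line in lines[1:end]:
        key, _, value = line.partition(":")
        metadata[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return metadata, body


SKILL_TEXT = """---
name: skill-name
description: A description of what this skill does and when to use it.
---
Skill contents in markdown
"""

metadata, body = parse_skill(SKILL_TEXT)
# Only the metadata needs to sit in the context up front; the body is loaded on demand.
print(metadata["name"], "->", metadata["description"])
```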

Let’s see a real-world example: the SerpApi Search Skill provides instructions for the agent to interact with SerpApi's real-time search APIs. You can see the skill.md file, which instructs the model on how to invoke the various SerpApi API calls.

You can see a usage example below, where we use SerpApi skill to build a travel planning agent.

import os
import subprocess
from pathlib import Path

from openai import OpenAI


MODEL = os.getenv("OPENAI_MODEL", "gpt-5.4-mini")
SKILL_PATH = Path(__file__).resolve().parent / "skills" / "serpapi-web-search"


def run_shell_call(shell_call):
    print(f"\nModel requested shell call: {shell_call.call_id}")
    print(f"Commands: {shell_call.action.commands}")

    command_outputs = []
    for command in shell_call.action.commands:
        print(f"\n[script] Running command: {command}")

        result = subprocess.run(
            command,
            shell=True,
            executable="/bin/zsh",
            capture_output=True,
            text=True,
            check=False,
        )

        print(f"[script] Exit code: {result.returncode}")
        if result.stdout:
            print(f"[script] stdout:\n{result.stdout[:1500]}")
        if result.stderr:
            print(f"[script] stderr:\n{result.stderr[:1500]}")

        command_outputs.append({
            "stdout": result.stdout,
            "stderr": result.stderr,
            "outcome": {
                "type": "exit",
                "exit_code": result.returncode,
            },
        })

    output_item = {
        "type": "shell_call_output",
        "call_id": shell_call.call_id,
        "output": command_outputs,
    }

    if shell_call.action.max_output_length is not None:
        output_item["max_output_length"] = shell_call.action.max_output_length

    return output_item


if __name__ == "__main__":
    client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

    input_list = [
        {
            "role": "system",
            "content": "You are Safar, a travel planning assistant.",
        }
    ]

    tools = [
        {
            "type": "shell",
            "environment": {
                "type": "local",
                "skills": [
                    {
                        "name": "serpapi-web-search",
                        "description": "Search current travel information with the SerpApi CLI.",
                        "path": str(SKILL_PATH),
                    }
                ],
            },
        }
    ]

    print("Type 'exit' or 'quit' to stop.\n")

    waiting_for_user = True

    while True:

        if waiting_for_user:
            user_query = input("You: ")

            if user_query.lower() in ["exit", "quit"]:
                break

            input_list.append({
                "role": "user",
                "content": user_query,
            })
            waiting_for_user = False

        print("\n[script] Sending request to the model.")
        response = client.responses.create(
            model=MODEL,
            input=input_list,
            tools=tools,
        )

        input_list += response.output

        shell_calls = [item for item in response.output if item.type == "shell_call"]
        print(f"[script] Shell calls requested: {len(shell_calls)}")

        if not shell_calls:
            print(f"Model response: {response.output_text}\n")
            waiting_for_user = True
            continue

        for shell_call in shell_calls:
            input_list.append(run_shell_call(shell_call))

        print("\n[script] Sending shell output back to the model.")

The script uses the local skills capability of the OpenAI SDK. The skill files live in the skills/serpapi-web-search folder, inside the directory that contains the script.

Skills can be specified using the below format:

tools = [
    {
        "type": "shell",
        "environment": {
            "type": "local",
            "skills": [
                {
                    "name": "serpapi-web-search",
                    "description": "Search current travel information with the SerpApi CLI.",
                    "path": str(SKILL_PATH),
                }
            ],
        },
    }
]

We provide the skill name, description, and path to the agent. When skills are used, the model's responses contain shell calls that our script must run in the terminal. This is how the agent lists and reads the full skill files stored locally. We defined a run_shell_call() function for this: whenever the model requests a shell call, we run this function and pass the shell results back to the model.

Since this example lets the model request shell commands, only run it in a trusted, sandboxed environment. Do not give shell access to untrusted prompts, repositories, or skill files without review.
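One extra layer of caution is to check each requested command against an allowlist before running it. The sketch below is a conservative illustration and an assumption of this rewrite, not part of the original script: ALLOWED_PROGRAMS and is_command_allowed() are hypothetical names, and the check reduces risk without replacing a real sandbox, since even allowed programs such as cat can read sensitive files.

```python
import shlex

# Programs the agent may run; an illustrative assumption, adjust to your needs.
ALLOWED_PROGRAMS = {"cd", "cat", "sed", "ls", "serpapi"}
# Shell features this sketch refuses outright, since they can hide other commands.
FORBIDDEN_TOKENS = (";", "|", "`", "$(", ">", "<", "\n")


def is_command_allowed(command):
    """Return True only if every `&&`-joined part starts with an allowed program."""
    if any(token in command for token in FORBIDDEN_TOKENS):
        return False
    # A lone '&' (background execution) is also refused.
    if "&" in command.replace("&&", ""):
        return False
    for part in command.split("&&"):
        try:
            words = shlex.split(part)
        except ValueError:  # unbalanced quotes and similar
            return False
        if not words or words[0] not in ALLOWED_PROGRAMS:
            return False
    return True


print(is_command_allowed("cd skills/serpapi-web-search && cat SKILL.md"))
print(is_command_allowed("cat SKILL.md; rm -rf /"))
```

A function like this could be called at the top of the loop in run_shell_call(), returning an error result to the model for any command that is refused.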

Now let’s run the agent and ask it a travel planning question. We will ask the model about hotel prices in Goa, India:

Type 'exit' or 'quit' to stop.
You: Find Goa hotel prices for a vacation: two nights from 10 June 26

[script] Sending request to the model.
[script] Shell calls requested: 1

Model requested shell call: call_Cgs0D4tNOZFPZ30GNVJzJNHZ
Commands: ['cd .../skills/serpapi-web-search && cat SKILL.md']

[script] Running command: cd .../skills/serpapi-web-search && cat SKILL.md
[script] Exit code: 0
[script] stdout:
---
name: serpapi-web-search
description: >-
  Search the web using SerpApi's 100+ search engines. Use this skill whenever
  the user needs current or web-sourced information: ...

[script] Sending shell output back to the model.

[script] Sending request to the model.
[script] Shell calls requested: 1

Model requested shell call: call_Y2PGcBkpSEkzIikqu1H30uRW
Commands: ["cd .../skills/serpapi-web-search && sed -n '1,220p' rules/ENGINES.md"]

[script] Running command: cd .../skills/serpapi-web-search && sed -n '1,220p' rules/ENGINES.md
[script] Exit code: 0
[script] stdout:
# SerpApi Search Engines Catalog

Complete list of 107 SerpApi search engines. Use the `engine` parameter to select the desired search engine.
...

[script] Sending shell output back to the model.

[script] Sending request to the model.
[script] Shell calls requested: 1

Model requested shell call: call_oncTqSWJWQoDbqJS06aJP1OD
Commands: ['cd .../skills/serpapi-web-search && serpapi search engine=google_hotels q="Goa" check_in_date=2026-06-10 check_out_date=2026-06-12 --fields \'properties[0:20]\'']

[script] Running command: cd .../skills/serpapi-web-search && serpapi search engine=google_hotels q="Goa" check_in_date=2026-06-10 check_out_date=2026-06-12 --fields 'properties[0:20]'
[script] Exit code: 0
[script] stdout:
{
  "properties": [
    {
      "type": "hotel",
      "name": "Vivanta Goa, Miramar",
      "description": "Swanky hotel across from the beach featuring a rooftop pool, a stylish restaurant & a 24-hour gym.",
      ...


[script] Sending shell output back to the model.

[script] Sending request to the model.
[script] Shell calls requested: 0

Model response: I found live Goa hotel prices for **2 nights: 10 Jun 2026 to 12 Jun 2026**.

Sample prices from Google Hotels:
- **Vivanta Goa, Miramar** — **$82/night**, **$164 total**
- **Aloha Holiday Resort** — **$17/night**, **$33 total**
- Other properties in the Goa search were showing a wide range, from budget stays to luxury resorts.

A few notes:
- These are **current live rates** and can change quickly.
- The prices shown are from hotel search results and may be **before taxes/fees**.
- I searched broadly for **Goa**; if you want, I can narrow it down by:
  - **North Goa / South Goa**
  - **Budget / mid-range / luxury**
  - **Beachfront**
  - **2 adults vs family**

If you want, I can make a short list of the **best 10 Goa hotels under a budget you choose**.

As seen from the output, the model first requested a shell call that runs cat SKILL.md to read the skill contents.

With the skill contents obtained, the model proceeds with another shell call, sed -n '1,220p' rules/ENGINES.md, which prints the catalog of SerpApi search engines. With this data, the model can see the supported engines and choose the best one for the task.

Next, the model requests the command serpapi search engine=google_hotels q="Goa" check_in_date=2026-06-10 check_out_date=2026-06-12 --fields 'properties[0:20]', which uses the SerpApi CLI to get results from Google Hotels. We run this shell command on our end and pass the output, which includes the JSON results from the Google Hotels API, back to the model.

With this data, the model was able to generate its final response and suggest hotels to book in Goa, along with their prices.

Now that we have seen prompts, memory, tools, MCP, and skills, we can put these pieces into one simple stack.

Agent Capability Stack

An agent can be understood as a stack of capabilities. We have seen the core building blocks of an agent: system prompts, tools, MCP, and skills. Now, let’s compare how they fit together in the agent capability stack.

At the bottom, we have the system prompt. This defines global behavior and constraints.

Then we have skills. Skills provide packaged procedures for specific task types.

Then we have tools. Tools let the agent do things in the world.

Then we have MCP. MCP gives us a standard way to connect models to tools, files, APIs, databases, IDEs, browsers, and other systems.

We can think about the stack like this:

Layer         | Purpose                          | Use when
--------------|----------------------------------|-------------------------------------------------------------------------
System prompt | Global behavior and constraints  | You want rules that apply every turn
Skills        | Reusable workflows               | You want the model to follow a repeatable process
Tools         | External actions and information | You want the model to call APIs, read files, run code, or fetch live state
MCP           | Standard integration layer       | You want reusable integrations across models and clients

Use a system prompt for safety boundaries, tone, refusal style, and stable rules.

Use a skill when you want the model to follow a repeatable workflow or use scripts and templates.

Use tools when the model must call external services, fetch live state, create side effects, or interact with the environment.

Use MCP when you want to expose tools and resources through a standard protocol.

Summary and Next Steps

In this tutorial, we started out with the components of an AI agent and built a few simple agents for use cases such as shopping and travel. We provided capabilities to agents using tool calling, MCP, and Skill files.

To explore on your own, you can find the code snippets used in the tutorial in this GitHub repo.

If you are looking for a different SDK or tool to start with like the Claude agent SDK or n8n, we have you covered:


Even though we covered the basics for building simple agents, some important next steps to learn more about are:

A multi-agent system has multiple agents, where each agent can be specialized for a specific goal. These agents can communicate with each other. We can also have verifier models that check the output from other models.

Similar to building a backend application, we need observability and error handling for agents. The model can hallucinate, choose the wrong tool, pass bad arguments, or get stuck in a loop. We need a way to monitor this behavior and improve the system over time.
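As a small illustration of error handling and observability around tools, the sketch below wraps a tool call so that failures are logged and returned to the model as data. safe_tool_call() and the logger name are hypothetical, and get_weather() is a canned stand-in; a production system would likely send these logs to a proper tracing or monitoring service.

```python
import json
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("agent.tools")


def safe_tool_call(tool, arguments_json):
    """Run a tool, log what happened, and never raise into the agent loop.

    Failures are returned as JSON so the model can read the error and retry.
    """
    started = time.perf_counter()
    try:
        arguments = json.loads(arguments_json)
        result = {"ok": True, "result": tool(**arguments)}
    except Exception as error:  # broad on purpose: any tool failure becomes data
        result = {"ok": False, "error": f"{type(error).__name__}: {error}"}
    elapsed_ms = (time.perf_counter() - started) * 1000
    logger.info("tool=%s ok=%s elapsed_ms=%.1f", tool.__name__, result["ok"], elapsed_ms)
    return json.dumps(result)


def get_weather(location):
    """Hypothetical tool used to exercise the wrapper."""
    return {"location": location, "condition": "Sunny"}


print(safe_tool_call(get_weather, '{"location": "Goa"}'))
print(safe_tool_call(get_weather, '{"city": "Goa"}'))  # wrong argument name
```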

Permissions are also important. An agent that can read files is useful. An agent that can delete files or send emails should be more carefully controlled. We should decide which actions require user approval.
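A simple way to express such rules is a policy table consulted before every tool call. The sketch below is an assumption for illustration: the tool names in TOOL_POLICY and the is_permitted() helper are hypothetical, and the approve callback stands in for however your application asks the user.

```python
# Hypothetical policy: which tools may run freely and which need a human "yes".
TOOL_POLICY = {
    "read_file": "auto",
    "delete_file": "ask",
    "send_email": "ask",
}


def is_permitted(tool_name, approve):
    """Decide whether a tool call may proceed.

    `approve` is a callable that asks the user and returns True or False.
    Tools missing from the policy are denied by default.
    """
    policy = TOOL_POLICY.get(tool_name, "deny")
    if policy == "auto":
        return True
    if policy == "ask":
        return bool(approve(tool_name))
    return False


# Example: a user who declines every request.
print(is_permitted("read_file", lambda name: False))
print(is_permitted("delete_file", lambda name: False))
```

In a command-line agent, approve could wrap input() and return True only when the user types "y". Denying unknown tools by default is the safer choice when new tools are added later.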

Context compaction is another important idea. As the conversation grows, the agent cannot keep everything forever. It needs to summarize old information and keep only what is useful for the next step.
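A minimal sketch of compaction, assuming the history is a list of plain role/content dictionaries like the conversation_history used earlier: older messages are replaced by one summary message, while system messages and the most recent turns are kept. compact_history() is a hypothetical helper, and the default summarizer only truncates and joins text, where a real agent would likely ask a model to write the summary.

```python
def _naive_summary(old_messages):
    """Placeholder summarizer: join the first 60 characters of each message."""
    return " | ".join(message["content"][:60] for message in old_messages)


def compact_history(messages, keep_recent=4, summarize=_naive_summary):
    """Replace older messages with one summary message.

    System messages and the `keep_recent` newest messages are kept verbatim.
    `summarize` is a callable that turns a list of messages into text.
    """
    if keep_recent < 1:
        raise ValueError("keep_recent must be at least 1")

    system = [m for m in messages if m["role"] == "system"]
    dialogue = [m for m in messages if m["role"] != "system"]
    if len(dialogue) <= keep_recent:
        return messages  # nothing to compact yet

    old, recent = dialogue[:-keep_recent], dialogue[-keep_recent:]
    summary = {
        "role": "system",
        "content": "Summary of earlier conversation: " + summarize(old),
    }
    return system + [summary] + recent
```

Calling compact_history() before each request, once the history passes a size threshold, keeps the context bounded at the cost of losing detail from older turns.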

Evaluation helps us understand whether the agent is actually doing a good job. We can test the agent on sample tasks, check if it used the right tools, verify whether the final answer is correct, and compare outputs across different prompts or models. Without evaluation, it is hard to know if the agent is improving or just producing confident-looking answers.
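A small evaluation harness might look like the sketch below. Everything here is an assumption for illustration: evaluate() expects an agent callable that returns the tools it used and its final answer, stub_agent() is a fixed stand-in so the harness runs without an API key, and the two test cases are made up.

```python
def evaluate(agent, cases):
    """Run the agent on each case and score tool choice and answer content.

    `agent` takes a prompt and returns (tools_used, answer).
    Each case has a prompt, the tool expected to be called, and a substring
    the final answer must contain.
    """
    results = []
    for case in cases:
        tools_used, answer = agent(case["prompt"])
        results.append({
            "prompt": case["prompt"],
            "right_tool": case["expected_tool"] in tools_used,
            "right_answer": case["must_contain"].lower() in answer.lower(),
        })
    passed = sum(r["right_tool"] and r["right_answer"] for r in results)
    return {"passed": passed, "total": len(results), "results": results}


def stub_agent(prompt):
    """Stand-in agent with fixed behavior, so the harness runs without an API key."""
    if "weather" in prompt.lower():
        return ["get_weather"], "It is sunny in Goa."
    return [], "I don't have access to real-time information."


CASES = [
    {"prompt": "What is the weather in Goa?", "expected_tool": "get_weather", "must_contain": "sunny"},
    {"prompt": "When is La Tomatina this year?", "expected_tool": "google_search", "must_contain": "August"},
]

report = evaluate(stub_agent, CASES)
print(f"{report['passed']}/{report['total']} cases passed")
```

Substring checks are a crude measure of answer quality; they are used here only to keep the sketch short.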

The best way to understand agents is to build small ones, give them real tasks, inspect their tool calls, and evaluate their outputs. Start with a simple loop, add tools carefully, introduce memory only when needed, and add observability before trusting the agent with important actions. And if your agent needs real-time data access, you can explore SerpApi APIs to extend its capabilities.

June 04, 2026 06:50 AM UTC


Core Dispatch

Core Dispatch #5

Welcome back to Core Dispatch! This edition covers May 18 through June 4, 2026. As promised, Python 3.15.0 beta 2 landed on June 2. Two more milestones are close behind: 3.13.14 and 3.14.6 on June 9, followed by 3.15.0 beta 3 on June 23.

There's also a healthy batch of changes landing for 3.15: an O(n^2) blowup in unicodedata.normalize() was fixed, the XML parser gained support for multi-byte encodings, and a round of deprecation warnings went in for the ast module and abc's abstractclassmethod/abstractstaticmethod/abstractproperty.

On the project side, the Python Security Response Team (PSRT) landed an initial Python security policy in the Devguide, giving the vulnerability reporting and response process a documented home. And dev builds of 3.15+ now report a version like 3.15.0b2+dev instead of the old bare-plus 3.15.0b2+, which wasn't PEP 440-compliant.

Looking ahead, the EuroPython 2026 Language Summit topics are out, with a lineup spanning a Rust-for-CPython roadmap, the future of free-threading, garbage collection, and the buffer protocol.

If you're interested in CPython internals, Victor Stinner has a great writeup on free threading internals and reference counting that's well worth your time.

As always, if you maintain a package or just like living on the edge, give the latest 3.15 beta a spin and file any issues you find.

Upcoming Releases

Official News

PEP Updates

Merged PRs

Discussion

Core Dev Musings

Upcoming CFPs & Conferences

Community

One More Thing

""TBC" is "to be confirmed" for Pablo's [Language Summit talk]?"

— Gregory Smith

"The Banana Council 🍌"

— Donghee Na

Credits

June 04, 2026 12:00 AM UTC


PyCon Ireland

CFP Deadline Moved to 31 July 2026

We’ve moved the deadline for the PyCon Ireland 2026 Call for Proposals forward to 31 July 2026. It was previously set to 30 August. The submission page on Sessionize already reflects the new date.

If you were planning to submit, please get your proposal in by 31 July 2026.

Why We Brought the Deadline Forward

There are two reasons behind this change, and both are about giving people the time they need to do things well.

Giving the programme committee room to review properly

Building a great schedule takes careful work. With dozens of proposals to read, discuss, and compare, the programme committee needs enough time to give every submission a fair and thorough review. Closing the CFP at the end of August left a tight window between the deadline and the point where we have to lock in the schedule. By moving to 31 July, we give the committee the breathing room to evaluate each proposal on its merits, balance the tracks, and make thoughtful decisions rather than rushed ones.

Giving speakers time to plan their trip to Dublin

PyCon Ireland brings speakers from across Ireland and beyond. Travelling to Dublin means booking flights, arranging accommodation, sorting out time off, and sometimes applying for visas or financial aid. The sooner we can confirm accepted talks, the sooner speakers can start planning, and the less stressful and less expensive that planning tends to be. An earlier deadline means earlier notifications, which is better for everyone making the journey.

What This Means for You

Ready to Submit?

Head over to our proposal submission page and tell us what you’d like to talk about. If you have any questions, reach out at contact@python.ie.

We can’t wait to read your proposals, and we’re looking forward to seeing you in Dublin on 17 October 2026.

June 04, 2026 12:00 AM UTC


Bob Belderbos

"Rust Is for People Who Want to Be Punished." Now Jochen Trusts It More Than Python.

Jochen Deister is a lawyer who codes for fun. He has years of Python behind him and no intention of ever being hired to program.

Three months ago, Rust was just a name to him, the language for "the big shots" with a notoriously steep learning curve. Then he built a JSON parser from scratch in Rust, and it ran faster than the equivalent in Python on every dataset he tested, up to 3.5x faster on some. "Holy F" he reacted when he saw the results.

Six weeks of work produced:

Here's how it happened.

The gap

Jochen learned to code on a Commodore VIC-20 with six kilobytes of RAM, then a C64, then a stint in assembly and Turbo Pascal when the bottleneck moved from memory to speed.

Then life took him into law and academia, and he forgot all of it until he picked Python back up years ago.

Python suited him, but it hid the machine. "Python abstracts a lot of these concepts away" he said. "It hides the mechanics".

He'd heard Rust had a notoriously steep learning curve, and he was doing this for fun. "Rust is for people who want to be punished in their life" he figured, and left it there.

The trigger that changed it was small: the last Pybites podcast episode, a $49 lifetime offer on our Rust practice platform, and a remote cabin on the Danish coast where his only job was to keep his kids fed during exam season.

He finished all 61 platform exercises, third on the leaderboard, then shortly after signed up for the cohort for a deeper challenge.

The platform taught him the vocabulary. What it couldn't give him was a real project with a coach reading his code in detail. That's what our cohort is about: six weeks building a JSON parser, one PR review a week with Jim Hodapp, expert Rust coach.

The constraints stopped feeling like constraints

Most people describe their first weeks of Rust as a fight with the borrow checker, the compiler rule that tracks who owns each value and won't let two parts of your code modify the same data at once. Jochen didn't feel it that way at all.

"I never had the feeling that I was fighting the borrow checker. The error messages were my friends right out of the gate. They had a good explanation of the error, but also a hint about what you could do differently."

What hooked him was aesthetics. Run the formatter on a chain of iterator steps and each transformation lands on its own line, readable top to bottom.

"Rust is a beautiful language. It's an aesthetic language. It looks good, and working toward more beautiful code was really something I liked."

That pulled him toward idiomatic Rust on its own. He stopped wanting code that merely worked, the bar he'd accepted in Python, and started wanting code that was safe, performant and idiomatic.

He broke his own code on purpose

Week five, PyO3, was the real step up. PyO3 is the bridge that lets you call a Rust module straight from Python, the same layer Pydantic and Polars are built on. It was the first concept the practice platform hadn't prepared him for, so he leaned on the implementation steps and went slowly.

The clearest sign of how his thinking changed came in the final week. Three of his four benchmark datasets were already beating Python; one wasn't. He suspected the parser was copying the entire input onto the heap instead of borrowing it. So he changed the entry point to take a borrowed string with an explicit lifetime (a lifetime is Rust's way of letting you reference data without copying it, while proving the reference can't outlive the data) and ran cargo check.

It reported 78 errors.

"Those 78 errors were my path of what I needed to fix to get to the results. You change something up the chain and 78 reduces to 50, and so on down the line. It is your implementation guide."

He'd deliberately broken the code, then followed the compiler error by error back to a working, faster version. It's like having a 200% test suite for free; you feel confident making changes.

The rewrite turned a parser that collected every token into a list up front into one that reads tokens on demand in a single pass. Jim's note on the PR: "This is such a clean functional style API for your tokenizer, it's evolved and matured nicely".
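For Python readers, the shape of that change can be sketched with a generator. This is a loose analogy in Python, not Jochen's Rust code; the TOKEN pattern is a simplified assumption (no string escapes, no exponents) and the function names are made up for illustration:

```python
import re
from collections.abc import Iterator

# Simplified JSON-ish token pattern, for illustration only.
TOKEN = re.compile(r'\s*([{}\[\]:,]|"[^"]*"|-?\d+(?:\.\d+)?|true|false|null)')


def tokenize_lazy(text: str) -> Iterator[str]:
    """Yield tokens one at a time, scanning only as far as the caller asks."""
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            if text[pos:].strip() == "":
                return  # only trailing whitespace left
            raise ValueError(f"unexpected character at offset {pos}")
        yield match.group(1)
        pos = match.end()


def tokenize_eager(text: str) -> list[str]:
    """Collect every token up front, as the earlier design did."""
    return list(tokenize_lazy(text))


tokens = tokenize_lazy('{"a": [1, true]}')
print(next(tokens))  # only the first token has been scanned so far
print(list(tokens))  # the rest, consumed in the same single pass
```

The eager version holds the whole token list in memory before parsing starts, while the lazy one hands each token to the parser as soon as it is read.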

The profiler told him where he was wrong

Speed in Rust isn't automatic, and Jochen learned that the hard way. He'd swapped a list for a double-ended queue, proud of it.

"Two days later, when I looked at the profiler, that very line that I was so proud of was now by far the biggest offender."

A profiler measures where a program actually spends its time, so you optimize the real bottleneck instead of a guessed one. His showed the standard-library hash map dominating. He read the docs, realized that map carries protection against denial-of-service attacks he'd never need in a local command-line tool, and replaced it with a stripped-down one. Data-driven, one commit at a time, until the last dataset crossed the line.

Through all of it he kept AI out of the code on purpose. He used it to make himself learn faster, NotebookLM turning Rust docs into podcasts and flashcards, never to write a solution. "Only I write the code" is the rule he gives his AI mentor.

What changed

Ask him how confident he feels starting a new Rust project and he says a 3 out of 10, and means it as a compliment to the language.

"I'm not a total noob anymore. I have a rough understanding of the key concepts, but I also know there's a heck of a lot to learn."

The transfer is in the habits. Rust is now his default for new projects, he caught himself skipping the Python newsletters to read about Rust instead, and the deliberate, idiomatic thinking followed him back into his Python. After years in the Python community, his loyalty quietly shifted:

"I've always liked Python. But it's changed in a way that I think I like Rust more, because of its honesty and because it forces you to think stricter."

His favorite piece of the language is pattern matching, the construct that lets you branch on the shape of a value and pull data out of it in one move. He went deep enough that he used a binding trick his coach hadn't seen before, matching and naming a value in the same arm. Jim's reply on the PR:

"You taught me something I didn't realize Rust has. It's a nice match-and-bind pattern that saves boilerplate code."

The reason he loves it is the same one running through everything he said:

"Computer languages need to be beautiful."

Next up for Jochen: porting a coding agent from Python to Rust, and a privacy tool that strips personal data out of text before it reaches an LLM.

For someone who started three months ago thinking Rust was punishment, that's a real shift. (For more on how Rust rewires the way you write Python, see Learning Rust Made Me a Better Python Developer.)

Here is our full conversation with Jochen about his cohort experience, the parser he built, and the performance work he did:

Watch on YouTube


If you're a Python developer wanting to reach a new level in your career, Rust is a strong contender. Book me in for a call and we'll discuss this further.

June 04, 2026 12:00 AM UTC

June 03, 2026


Kay Hayen

Nuitka Release 4.1

This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler, “download now”.

This release adds many new features and corrections, with a focus on async code compatibility, missing generics features, Python 3.14 compatibility, and, yet again, Python compilation scalability.

Bug Fixes

Package Support

New Features

Optimization

Anti-Bloat

Organizational

Tests

Cleanups

Summary

This release builds on the scalability improvements established in 4.0, with enhanced Python 3.14 support, expanded package compatibility, and significant optimization work.

The --project option seems usable now.

Python 3.14 support remains experimental. It only barely missed the cut and will probably get there in hotfixes. Some of the corrections came in so late before the release that it was just not possible to feel good about declaring it fully supported just yet.

June 03, 2026 10:00 PM UTC


Real Python

How to Use GitHub Copilot Code Review in Pull Requests

GitHub offers several AI tools under the Copilot umbrella that cover your entire development workflow. Copilot can provide an AI-powered code review shortly after you open a pull request on GitHub. Rather than waiting for a teammate, you can add Copilot as a reviewer to receive context-aware feedback. With access to your entire codebase, it delivers actionable suggestions that you can apply in just a few clicks:

Pull requests are the standard collaborative workflow provided by GitHub and similar services like GitLab to facilitate code review for projects managed with Git. A pull request, or a PR for short, is a formal request to merge code from one branch—or fork—into another, and it’s where code review typically happens.

In practice, code review isn’t always timely or consistent. Some reviewers approve pull requests immediately without much scrutiny, while others leave long lists of minor nitpicks. It can also be difficult to find someone with the right level of experience or enough context about a specific part of the codebase. These issues are common in open-source projects as well, where reviews depend on the limited time of volunteer maintainers.

In this tutorial, you’ll learn how to leverage GitHub Copilot for AI-assisted code review in pull requests and how to integrate it into your workflow to get faster, more structured feedback. Whether you’re working on a commercial project or contributing to an open-source one, Copilot can help you catch issues early and improve your code before it’s merged.

Think of Copilot’s review as a fast first pass. It can reliably flag correctness mistakes and regressions to documented behavior, often before a human reviewer has even opened the PR.

Prerequisites

Before you get started with AI-assisted code reviews, make sure you have the following in place:

Depending on how you use GitHub, you may already have access to GitHub Copilot through your organization. Sometimes, you may qualify for Copilot under special conditions.

For example, if you’re a student or a teacher, or if you regularly contribute to a popular open-source project, then you might be eligible for free access to GitHub Copilot Pro. Check out GitHub Education to learn more. Keep in mind that GitHub reassesses whether you qualify for free access on a monthly basis.

But even on the free plan, you can still try out Copilot’s code review feature for 30 days at no cost. Just subscribe to GitHub Copilot Pro and cancel before the first billing cycle begins. The trial period is a one-time offer per account, so you won’t be able to start another one after the first one ends.

Note: At the time of writing, GitHub has temporarily paused new paid subscriptions for Copilot due to exceptionally high demand and the associated infrastructure costs. You can read the official announcement on GitHub’s blog to learn more.

To follow along with this tutorial, you’ll also need a GitHub repository where you can freely create branches and pull requests. Although you can create a new repository from scratch or import one from another Git-based hosting service, the quickest option is to download the provided supporting materials. They include a small, hands-on project you’ll be working on:

Get Your Code: Click here to download the free sample code you’ll use to practice AI-assisted code review on a sample FastAPI pull request with GitHub Copilot.

Take the Quiz: Test your knowledge with our interactive “How to Use GitHub Copilot Code Review in Pull Requests” quiz. You’ll receive a score upon completion to help you track your learning progress:


Interactive Quiz

How to Use GitHub Copilot Code Review in Pull Requests

Test your knowledge of GitHub Copilot code review in pull requests, including custom instructions and automatic reviews.

The sample project is a real-time quiz application inspired by Kahoot! and Mentimeter, featuring a FastAPI backend and a mobile-first JavaScript, HTML, and CSS frontend. It allows you to make your own quizzes from scratch—and store them in the human-readable YAML format—or generate a random quiz on the fly using ChatGPT’s API:

Each player is assigned a randomly generated name with an emoji, such as 🐯 Grumpy Tiger, 🩹 Gentle Skunk, or 🐼 Lazy Cow, to keep things light and fun. You can start the server on a local network and have your friends or family connect from their mobile devices using a QR code or a PIN.

Are you ready to dive in?

Step 1: Request a Code Review From GitHub Copilot

If you haven’t already, go ahead and grab the supporting materials. The sample Git repository includes a feature branch with intentional code issues that GitHub Copilot can catch when you request a review. For reference, you’ll also find another branch with the completed code to explore at your own pace:

Get Your Code: Click here to download the free sample code you’ll use to practice AI-assisted code review on a sample FastAPI pull request with GitHub Copilot.

After downloading the materials, upload the local pop-quiz repository—including all branches—to your GitHub account. This will create a remote copy of the repository for your own experimentation. There are several ways to accomplish this. Although you can handle most tasks through the GitHub web interface, the GitHub CLI is often faster and more convenient.

One straightforward approach is to use the GitHub CLI (gh) alongside standard git commands. This allows you to create the repository and push all branches in just two steps once you’re in the downloaded pop-quiz/ directory:

Read the full article at https://realpython.com/github-copilot-code-review/ »


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

June 03, 2026 02:00 PM UTC

Quiz: How to Use GitHub Copilot Code Review in Pull Requests

In this quiz, you’ll test your understanding of How to Use GitHub Copilot Code Review in Pull Requests.

By working through this quiz, you’ll revisit how to request a review from Copilot on your pull requests, apply or push back on its suggestions, configure automatic reviews, and use custom instructions to make Copilot’s feedback follow your team’s conventions.


June 03, 2026 12:00 PM UTC


Django Weblog

Django security releases issued: 6.0.6 and 5.2.15

In accordance with our security release policy, the Django team is issuing releases for Django 6.0.6 and Django 5.2.15. These releases address the security issues detailed below. We encourage all users of Django to upgrade as soon as possible.

get_signed_cookie() derived the signing salt by concatenating the cookie name (key) and salt arguments. When distinct name and salt pairs produced the same concatenation, cookies could be accepted in a context different from the one where they were signed.

Cookies are now signed with an unambiguous salt derivation. For backwards compatibility, cookies signed by older Django versions are accepted until Django 7.0.
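The ambiguity is easy to reproduce outside Django. The sketch below is an illustration of the general problem and of one possible fix (length-prefixing); it is not Django's actual salt derivation, and SECRET, naive_salt, unambiguous_salt, and sign are hypothetical names:

```python
import hashlib
import hmac

SECRET = b"demo-secret"  # hypothetical key, for illustration only


def naive_salt(name: str, salt: str) -> str:
    # Plain concatenation loses the boundary between name and salt.
    return name + salt


def unambiguous_salt(name: str, salt: str) -> str:
    # Length-prefixing the name keeps distinct pairs distinct.
    return f"{len(name)}:{name}{salt}"


def sign(value: str, derived_salt: str) -> str:
    key = hashlib.sha256(derived_salt.encode() + SECRET).digest()
    return hmac.new(key, value.encode(), hashlib.sha256).hexdigest()


pair_a = ("session", "admin")
pair_b = ("sessiona", "dmin")

# Same signature for two different (name, salt) contexts.
print(sign("42", naive_salt(*pair_a)) == sign("42", naive_salt(*pair_b)))
# Different signatures once the derivation is unambiguous.
print(sign("42", unambiguous_salt(*pair_a)) == sign("42", unambiguous_salt(*pair_b)))
```

With concatenation, a value signed for one cookie name and salt verifies under another pair that happens to join to the same string.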

This issue has severity "low" according to the Django security policy.

Thanks to Peng Zhou for the report.

CVE-2026-7666: Potential unencrypted email transmission via STARTTLS in the SMTP backend

When using EMAIL_USE_TLS, a failed STARTTLS handshake could leave a partially-initialized connection that would subsequently be reused for sending email without encryption. This can occur with fail_silently=True, as used by send_mail() and BrokenLinkEmailsMiddleware, among others. Connections configured with EMAIL_USE_SSL are not affected.

This issue has severity "low" according to the Django security policy.

Thanks to Kasper Dupont for the report.

CVE-2026-8404: Potential exposure of private data via case-sensitive Cache-Control directives in UpdateCacheMiddleware

django.middleware.cache.UpdateCacheMiddleware and django.views.decorators.cache.cache_page decorator incorrectly cached responses marked with private Cache-Control directives when using mixed or uppercase values (e.g. Private).

The django.views.decorators.cache.cache_control decorator and django.utils.cache.patch_cache_control() function were not affected, since they normalize directives to lowercase. This issue only affects responses where Cache-Control is set manually.
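As a rough illustration of this class of bug, here is a simplified directive check written from scratch. It is not Django's source; parse_directives and both is_private helpers are hypothetical:

```python
def parse_directives(header: str) -> set[str]:
    """Split a Cache-Control header into directive names, preserving case."""
    return {
        part.split("=", 1)[0].strip()
        for part in header.split(",")
        if part.strip()
    }


def is_private_case_sensitive(header: str) -> bool:
    # Misses "Private" or "PRIVATE" because the comparison is exact.
    return "private" in parse_directives(header)


def is_private(header: str) -> bool:
    # Directive names are case-insensitive, so normalize before comparing.
    return "private" in {name.lower() for name in parse_directives(header)}


header = "Private, max-age=60"
print(is_private_case_sensitive(header), is_private(header))
```

A manually set header with mixed case slips past the exact comparison, which is the situation the advisory describes.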

This issue has severity "low" according to the Django security policy.

Thanks to Ahmed Badawe for the report.

CVE-2026-35193: Potential exposure of private data via missing Vary: Authorization in UpdateCacheMiddleware

django.middleware.cache.UpdateCacheMiddleware and django.views.decorators.cache.cache_page decorator allowed responses to requests bearing an Authorization header (and without Cache-Control: public) to be cached. To conform with the existing mechanism for constructing cache keys, responses to these requests will now vary on Authorization.

This issue has severity "low" according to the Django security policy.

Thanks to Shai Berger for the report.

CVE-2026-48587: Potential exposure of private data via whitespace padding in Vary header

django.middleware.cache.UpdateCacheMiddleware incorrectly cached responses whose Vary header values contained leading or trailing whitespace. Because has_vary_header() failed to strip that whitespace, a response with a Vary: * header (note the trailing space) was not recognized as containing the wildcard, causing it to be stored and potentially served from the cache when it should not have been.
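The effect of the missing strip can be shown with a simplified stand-in for the check. This is a sketch under the assumption that the header is split on commas with surrounding whitespace; it is not Django's implementation, and both function names are hypothetical:

```python
import re

# Splits on commas and swallows whitespace around them, but not at the
# very start or end of the header value.
DELIMITER = re.compile(r"\s*,\s*")


def has_vary_header_naive(vary: str, name: str) -> bool:
    existing = {item.lower() for item in DELIMITER.split(vary)}
    return name.lower() in existing


def has_vary_header_stripped(vary: str, name: str) -> bool:
    existing = {item.lower() for item in DELIMITER.split(vary.strip())}
    return name.lower() in existing


padded = "* "  # wildcard followed by a trailing space
print(has_vary_header_naive(padded, "*"), has_vary_header_stripped(padded, "*"))
```

Without the strip, the padded wildcard is compared as "* " and never equals "*", so the response looks cacheable.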

This issue has severity "low" according to the Django security policy.

Thanks to Navid Rezazadeh for the report.

Affected supported versions

Resolution

Patches to resolve the issue have been applied to Django's main, 6.1 (currently at alpha status), 6.0, and 5.2 branches. The patches may be obtained from the following changesets.

CVE-2026-7666: Potential unencrypted email transmission via STARTTLS in the SMTP backend

CVE-2026-8404: Potential exposure of private data via case-sensitive Cache-Control directives in UpdateCacheMiddleware

CVE-2026-35193: Potential exposure of private data via missing Vary: Authorization in UpdateCacheMiddleware

CVE-2026-48587: Potential exposure of private data via whitespace padding in Vary header

The following releases have been issued

The PGP key ID used for this release is Natalia Bidart: 2EE82A8D9470983E

General notes regarding security reporting

As always, we ask that potential security issues be reported via private email to security@djangoproject.com, and not via Django's Trac instance, nor via the Django Forum. Please see our security policies for further information.

June 03, 2026 11:00 AM UTC


Python GUIs

Authentication and Authorization with PyQt6 or PySide6 — Secure your desktop applications with login flows, token-based auth, and role-based access control

How can I add authentication and authorization to a PyQt6 application? Is there something built into Qt to make this easier?

When you build a desktop application with PyQt6 or PySide6, sooner or later you'll need to control who can use it and what they can do. Maybe your app connects to a cloud service. Maybe certain features should only be available to administrators. Either way, you need authentication (verifying who the user is) and authorization (deciding what they're allowed to do).

Qt doesn't provide a built-in authentication framework. But that's fine. You can combine Qt's capabilities with Python's networking and security tools to build a solid auth flow for your application.

In this tutorial, we'll walk through the full process: creating a login dialog, authenticating against a remote server, handling tokens, and enabling or disabling parts of your UI based on a user's role.

Approaches to Authentication in Desktop Apps

Before writing any code, it helps to understand the options available when securing a desktop application. The right approach depends on how much security you need and what infrastructure you have.

  1. Simple login check: Your app sends credentials to a remote server at startup. If authentication fails, you disable the UI (partially or entirely). This deters casual users, but a determined hacker could modify the client to bypass the check.
  2. Token-based unlock: After a successful login, the server returns a token or key that unlocks functionality in the app. Without the token, the app can't perform certain operations. This is more secure, since the app is genuinely non-functional without a valid token, though once data is decoded into memory, it's theoretically still accessible.
  3. Server-side execution: After authentication, the app sends work to the server, which performs the actual operations. The sensitive logic never runs on the client at all. This is the most secure approach, but it requires server infrastructure to handle the workload.

In the server-side execution model, the work done on the server doesn't necessarily need to be complex. Transforming or pre-processing some data from one format to another will be enough to deter most attempts at circumvention. However, it's common to use this technique to hide the algorithmic "secret sauce" completely.

For most applications, the middle ground — authenticating against a remote API and using the returned token to gate access — provides a good balance of security and simplicity. That's what we'll build here.

Your app shouldn't care about the database directly. Instead, it should talk to an API (Application Programming Interface) on your server. The API handles user lookups, password verification, and token generation. Your desktop app just sends HTTP requests and processes the responses.

Setting Up a Simple Auth Server (For Testing)

To test our client application, we need something to authenticate against. We'll create a minimal Flask server that accepts login requests and returns a JSON Web Token (JWT). In a real project, this would be your existing backend, but having a self-contained example makes it easier to experiment.
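If you haven't worked with JWTs before, it can help to see what an HS256 token is made of. The following is a bare-bones sketch using only the standard library, meant for understanding rather than production use; encode_token and decode_token are hypothetical helpers, and a maintained library such as PyJWT is the better choice for real code:

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def encode_token(payload: dict, secret: str) -> str:
    """Build header.payload.signature, signed with HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    signature = hmac.new(
        secret.encode(), signing_input.encode(), hashlib.sha256
    ).digest()
    return f"{signing_input}.{_b64url(signature)}"


def decode_token(token: str, secret: str) -> dict:
    """Check the signature and expiry, then return the payload."""
    signing_input, _, signature = token.rpartition(".")
    expected = hmac.new(
        secret.encode(), signing_input.encode(), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(_b64url(expected).encode(), signature.encode()):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(signing_input.split(".")[1]))
    if payload.get("exp", float("inf")) < time.time():
        raise ValueError("token expired")
    return payload


token = encode_token({"username": "admin", "exp": time.time() + 3600}, "demo")
print(decode_token(token, "demo")["username"])
```

The payload is only encoded, not encrypted, so anyone holding the token can read it. The signature just proves that the server issued it and that it hasn't been altered.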

Install the dependencies for the server:

sh
pip install flask pyjwt

Here's a minimal auth server:

python
import datetime

import jwt
from flask import Flask, jsonify, request

app = Flask(__name__)
SECRET_KEY = "your-secret-key-change-this"

# In production, use a real database with hashed passwords.
USERS = {
    "admin": {"password": "admin123", "role": "admin"},
    "viewer": {"password": "viewer123", "role": "viewer"},
}


@app.route("/auth/login", methods=["POST"])
def login():
    data = request.get_json()
    username = data.get("username", "")
    password = data.get("password", "")

    user = USERS.get(username)
    if user and user["password"] == password:
        token = jwt.encode(
            {
                "username": username,
                "role": user["role"],
                # utcnow() is deprecated since Python 3.12; use an aware datetime.
                "exp": datetime.datetime.now(datetime.timezone.utc)
                + datetime.timedelta(hours=1),
            },
            SECRET_KEY,
            algorithm="HS256",
        )
        return jsonify(
            {"token": token, "role": user["role"], "username": username}
        )

    return jsonify({"error": "Invalid credentials"}), 401


@app.route("/auth/verify", methods=["GET"])
def verify():
    auth_header = request.headers.get("Authorization", "")
    if not auth_header.startswith("Bearer "):
        return jsonify({"error": "Missing token"}), 401

    token = auth_header.split(" ", 1)[1]
    try:
        payload = jwt.decode(token, SECRET_KEY, algorithms=["HS256"])
        return jsonify(
            {"username": payload["username"], "role": payload["role"]}
        )
    except jwt.ExpiredSignatureError:
        return jsonify({"error": "Token expired"}), 401
    except jwt.InvalidTokenError:
        return jsonify({"error": "Invalid token"}), 401


if __name__ == "__main__":
    app.run(port=5000, debug=True)

Save this as auth_server.py and run it in a separate terminal:

sh
python auth_server.py

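The server exposes two endpoints: POST /auth/login, which checks the submitted credentials and returns a signed JWT together with the username and role, and GET /auth/verify, which validates a Bearer token from the Authorization header and returns the username and role it carries.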

This server stores passwords in plain text and uses a hardcoded secret key. In production, you'd hash passwords (using bcrypt or similar) and store the secret key securely. This is purely for demonstration.
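As a rough idea of what hashed password storage could look like, here is a sketch that uses PBKDF2 from the standard library as a stand-in for bcrypt. The hash_password and verify_password helpers and the iteration count are assumptions for illustration, so tune them for your own setup:

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # illustrative value; tune for your hardware


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return a random salt and the derived hash to store for the user."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the hash and compare it in constant time."""
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(digest, expected)


salt, stored = hash_password("admin123")
print(verify_password("admin123", salt, stored))
print(verify_password("wrong-password", salt, stored))
```

With something like this in place, the USERS dictionary would hold the salt and hash instead of the plain password, and login() would call verify_password() rather than comparing strings.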

Building the Login Dialog

Now let's build the PyQt6 side. We'll start with a login dialog — a modal window where the user enters their credentials. If you're new to dialogs in Qt, see our tutorial on creating dialogs in PyQt6 for a thorough introduction.

Install the client dependencies:

sh
pip install PyQt6 requests

If you're using PySide6, replace from PyQt6.QtWidgets import ... with from PySide6.QtWidgets import ... (and similarly for other Qt modules). The rest of the code is identical.

python
from PyQt6.QtCore import Qt
from PyQt6.QtWidgets import (
    QDialog,
    QFormLayout,
    QLabel,
    QLineEdit,
    QPushButton,
    QVBoxLayout,
)


class LoginDialog(QDialog):
    def __init__(self, parent=None):
        super().__init__(parent)
        self.setWindowTitle("Login")
        self.setFixedSize(350, 200)

        layout = QVBoxLayout()

        self.form_layout = QFormLayout()

        self.username_input = QLineEdit()
        self.username_input.setPlaceholderText("Enter your username")
        self.form_layout.addRow("Username:", self.username_input)

        self.password_input = QLineEdit()
        self.password_input.setPlaceholderText("Enter your password")
        # PyQt6 requires fully qualified enum names.
        self.password_input.setEchoMode(QLineEdit.EchoMode.Password)
        self.form_layout.addRow("Password:", self.password_input)

        layout.addLayout(self.form_layout)

        self.login_button = QPushButton("Login")
        self.login_button.clicked.connect(self.accept)
        layout.addWidget(self.login_button)

        self.status_label = QLabel("")
        self.status_label.setAlignment(Qt.AlignmentFlag.AlignCenter)
        self.status_label.setStyleSheet("color: red;")
        layout.addWidget(self.status_label)

        self.setLayout(layout)

        # Allow pressing Enter to submit.
        self.password_input.returnPressed.connect(self.login_button.click)
        self.username_input.returnPressed.connect(
            self.password_input.setFocus
        )

    def get_credentials(self):
        return (
            self.username_input.text().strip(),
            self.password_input.text(),
        )

    def set_status(self, message):
        self.status_label.setText(message)

This dialog inherits from QDialog, which gives us the modal behavior we need: when shown with .exec(), it blocks interaction with the rest of the application until the user either logs in or closes the dialog.

The get_credentials method returns the entered username and password as a tuple. The set_status method lets us display error messages (like "Invalid credentials") directly in the dialog.

Creating an Auth Manager

Rather than scattering authentication logic throughout the application, we'll encapsulate it in a dedicated class. This AuthManager handles login requests, stores the token, and provides the user's role.

python
import requests


class AuthManager:
    def __init__(self, base_url="http://localhost:5000"):
        self.base_url = base_url
        self.token = None
        self.username = None
        self.role = None

    def login(self, username, password):
        """
        Attempt to log in. Returns True on success, False on failure.
        Raises an exception on network errors.
        """
        response = requests.post(
            f"{self.base_url}/auth/login",
            json={"username": username, "password": password},
            timeout=10,
        )

        if response.status_code == 200:
            data = response.json()
            self.token = data["token"]
            self.username = data["username"]
            self.role = data["role"]
            return True

        return False

    def is_authenticated(self):
        return self.token is not None

    def get_auth_header(self):
        """Return headers dict with the Bearer token for API requests."""
        if self.token:
            return {"Authorization": f"Bearer {self.token}"}
        return {}

    def has_role(self, role):
        return self.role == role

    def logout(self):
        self.token = None
        self.username = None
        self.role = None

The get_auth_header method is especially useful. Once a user has logged in, you can include this header in any subsequent API call to prove that the request is coming from an authenticated user:

python
response = requests.get(
    "http://localhost:5000/some/protected/endpoint",
    headers=auth_manager.get_auth_header(),
    timeout=10,
)

Wiring Up the Login Flow

Now we connect the login dialog to the auth manager. The pattern is: show the dialog, grab the credentials, try to authenticate, and either proceed to the main window or show an error.

python
import sys

import requests
from PyQt6.QtWidgets import QApplication, QDialog, QMessageBox


def attempt_login(auth_manager):
    """
    Show the login dialog repeatedly until the user either
    successfully authenticates or cancels.
    Returns True on successful login, False if cancelled.
    """
    dialog = LoginDialog()

    while True:
        result = dialog.exec()

        if result != QDialog.DialogCode.Accepted:
            # User closed the dialog or pressed Cancel.
            return False

        username, password = dialog.get_credentials()

        if not username or not password:
            dialog.set_status("Please enter both fields.")
            continue

        try:
            if auth_manager.login(username, password):
                return True
            else:
                dialog.set_status("Invalid username or password.")
        except requests.exceptions.ConnectionError:
            dialog.set_status("Cannot connect to server.")
        except requests.exceptions.Timeout:
            dialog.set_status("Connection timed out.")
        except requests.exceptions.RequestException as e:
            dialog.set_status(f"Error: {e}")

This function keeps showing the login dialog until either the login succeeds or the user dismisses it. Network errors are caught and displayed in the dialog, so the user gets useful feedback without the app crashing.

Building the Main Window with Role-Based Access

The main window of our application will show different features depending on the user's role. Admin users see everything; viewers have a restricted experience. We'll use actions, toolbars, and menus to structure the interface.

python
from PyQt6.QtGui import QAction  # QAction lives in QtGui in Qt 6, not QtWidgets.
from PyQt6.QtWidgets import (
    QMainWindow,
    QMenu,
    QMenuBar,
    QMessageBox,
    QStatusBar,
    QTextEdit,
    QToolBar,
)


class MainWindow(QMainWindow):
    def __init__(self, auth_manager):
        super().__init__()
        self.auth_manager = auth_manager

        self.setWindowTitle("My Application")
        self.setMinimumSize(600, 400)

        # Central widget.
        self.text_edit = QTextEdit()
        self.setCentralWidget(self.text_edit)

        # Menu bar.
        menu_bar = self.menuBar()

        file_menu = menu_bar.addMenu("&File")

        self.save_action = QAction("&Save", self)
        self.save_action.triggered.connect(self.save_document)
        file_menu.addAction(self.save_action)

        file_menu.addSeparator()

        logout_action = QAction("&Logout", self)
        logout_action.triggered.connect(self.handle_logout)
        file_menu.addAction(logout_action)

        quit_action = QAction("&Quit", self)
        quit_action.triggered.connect(self.close)
        file_menu.addAction(quit_action)

        # Admin-only menu.
        self.admin_menu = menu_bar.addMenu("&Admin")

        manage_users_action = QAction("&Manage Users", self)
        manage_users_action.triggered.connect(self.manage_users)
        self.admin_menu.addAction(manage_users_action)

        server_settings_action = QAction("&Server Settings", self)
        server_settings_action.triggered.connect(self.server_settings)
        self.admin_menu.addAction(server_settings_action)

        # Status bar.
        self.status_bar = QStatusBar()
        self.setStatusBar(self.status_bar)

        # Apply role-based restrictions.
        self.apply_permissions()

    def apply_permissions(self):
        """Enable or disable UI elements based on the user's role."""
        role = self.auth_manager.role
        username = self.auth_manager.username

        self.status_bar.showMessage(
            f"Logged in as {username} ({role})"
        )

        if role == "admin":
            # Admins get full access.
            self.admin_menu.setEnabled(True)
            self.save_action.setEnabled(True)
            self.text_edit.setReadOnly(False)
        elif role == "viewer":
            # Viewers can see content but not edit or access admin.
            self.admin_menu.setEnabled(False)
            self.save_action.setEnabled(False)
            self.text_edit.setReadOnly(True)
            self.text_edit.setPlaceholderText(
                "You have read-only access."
            )
        else:
            # Unknown role: disable everything as a safe default.
            self.admin_menu.setEnabled(False)
            self.save_action.setEnabled(False)
            self.text_edit.setReadOnly(True)

    def save_document(self):
        QMessageBox.information(
            self, "Save", "Document saved (placeholder)."
        )

    def manage_users(self):
        QMessageBox.information(
            self, "Admin", "User management (placeholder)."
        )

    def server_settings(self):
        QMessageBox.information(
            self, "Admin", "Server settings (placeholder)."
        )

    def handle_logout(self):
        self.auth_manager.logout()
        self.close()

The apply_permissions method is where authorization happens. After a successful login, we check the user's role and adjust the UI accordingly. Disabled menu items are grayed out and non-clickable, and the text editor is set to read-only for viewers.

This approach — enabling and disabling widgets based on roles — is the standard pattern for authorization in desktop apps. You can extend it as far as you need: hide entire toolbar sections, show different pages in a stacked widget, or restrict access to specific actions.
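As the number of roles grows, the if/elif chain in apply_permissions could be replaced by a lookup table. The following is a minimal sketch rather than part of the application above: Permissions, ROLE_PERMISSIONS, and permissions_for are hypothetical names, while the two roles and the deny-by-default fallback mirror apply_permissions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Permissions:
    """UI capabilities granted to a role (hypothetical helper)."""

    can_edit: bool = False
    can_save: bool = False
    can_admin: bool = False


# Roles used by the example server; any other role gets no access.
ROLE_PERMISSIONS = {
    "admin": Permissions(can_edit=True, can_save=True, can_admin=True),
    "viewer": Permissions(),
}


def permissions_for(role):
    """Return the permissions for a role, denying everything by default."""
    return ROLE_PERMISSIONS.get(role, Permissions())
```

Inside apply_permissions, the widgets could then be driven from a single object, for example self.admin_menu.setEnabled(perms.can_admin) and self.text_edit.setReadOnly(not perms.can_edit), where perms = permissions_for(self.auth_manager.role).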

Making Authenticated API Requests

Once a user is logged in, you'll often need to make further API calls — fetching data, submitting forms, etc. Each of these requests should include the authentication token so the server can verify the user. For long-running API calls, consider using multithreading with QThreadPool to keep the UI responsive while waiting for server responses.

Here's how you might fetch some protected data:

python
def fetch_protected_data(auth_manager):
    """Example of making an authenticated API request."""
    try:
        response = requests.get(
            f"{auth_manager.base_url}/auth/verify",
            headers=auth_manager.get_auth_header(),
            timeout=10,
        )

        if response.status_code == 200:
            return response.json()
        elif response.status_code == 401:
            # Token expired or invalid — user needs to log in again.
            return None
    except requests.exceptions.RequestException:
        return None

If the server responds with a 401 Unauthorized, that means the token has expired or been revoked. You should handle this gracefully — for example, by showing the login dialog again.

Handling Token Expiration

Tokens expire. When they do, your app needs to respond appropriately rather than silently failing. A common approach is to wrap your API calls in a method that checks for 401 responses and triggers a re-login:

python
def authenticated_request(auth_manager, method, url, **kwargs):
    """
    Make an HTTP request with authentication.
    Returns the response, or None if re-authentication fails.
    """
    # Copy the headers so the caller's dict is not mutated.
    headers = dict(kwargs.get("headers", {}))
    headers.update(auth_manager.get_auth_header())
    kwargs["headers"] = headers
    kwargs.setdefault("timeout", 10)

    try:
        response = requests.request(method, url, **kwargs)

        if response.status_code == 401:
            # Token expired — try to re-authenticate.
            if attempt_login(auth_manager):
                kwargs["headers"].update(
                    auth_manager.get_auth_header()
                )
                response = requests.request(method, url, **kwargs)
            else:
                return None

        return response

    except requests.exceptions.RequestException:
        return None

This function automatically retries the request with a new token if the first attempt gets a 401. The user sees the login dialog, re-enters their credentials, and the request proceeds as if nothing happened.

To try it out:

  1. Start the auth server in one terminal: python auth_server.py
  2. Run the client application in another terminal: python app.py
  3. Log in as admin / admin123 to see full access, or viewer / viewer123 to see restricted access.

Try logging in with the wrong password — the dialog stays open and shows an error. Close the dialog without logging in and the app exits cleanly.

Security Considerations

A few things to keep in mind when implementing auth in a desktop application:

Never store passwords in the client. Your app should only ever send credentials to the server and receive a token back. The token is what you store (in memory, or securely on disk if you want "remember me" functionality).

Use HTTPS in production. Our example uses plain HTTP because it's running locally. In a real deployment, all communication between the client and server should be encrypted with TLS. The requests library handles HTTPS transparently — just change the URL to https://.

Tokens are temporary. JWTs (and most authentication tokens) have an expiration time. Design your app to handle expired tokens gracefully, as shown in the token expiration section above.

Client-side checks are not enough. Disabling a button in the UI doesn't prevent a technically savvy user from calling the underlying function. Any action that matters should be validated on the server side too. The client-side restrictions are a UX convenience, not a security boundary.

Store tokens securely. If you implement a "remember me" feature that persists the token between sessions, use your platform's secure storage — keyring is a good cross-platform Python library for this. Don't write tokens to plain text files. You can also use QSettings to persist non-sensitive user preferences like the last-used username, but avoid storing tokens or credentials there since QSettings does not provide encryption.
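To act on the "tokens are temporary" point before the server returns a 401, the client can read the expiry time from the token itself. This is a minimal sketch, assuming the server issues standard JWTs that carry an exp claim (the token format is an assumption here). token_expiry and is_expired are hypothetical helper names. The payload is decoded without verifying the signature, which is acceptable for deciding when to show the login dialog again but must never be used to trust the token's contents.

```python
import base64
import json
import time


def token_expiry(token):
    """Return the `exp` claim (seconds since epoch) of a JWT, or None."""
    try:
        payload_b64 = token.split(".")[1]
        # JWT segments are base64url-encoded with the padding stripped.
        padded = payload_b64 + "=" * (-len(payload_b64) % 4)
        payload = json.loads(base64.urlsafe_b64decode(padded))
        return payload.get("exp")
    except (AttributeError, IndexError, ValueError):
        # Not a string, not three dot-separated parts, or not valid JSON.
        return None


def is_expired(token, leeway=30, now=None):
    """True if the token is unreadable or expires within `leeway` seconds."""
    exp = token_expiry(token)
    if not isinstance(exp, (int, float)):
        return True
    current = time.time() if now is None else now
    return current >= exp - leeway
```

A client could call is_expired(auth_manager.token) before each request and show the login dialog early. The server remains the authority, so the 401 handling is still needed.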

For an in-depth guide to building Python GUIs with PySide6 see my book, Create GUI Applications with Python & Qt6.

June 03, 2026 06:00 AM UTC


Bob Belderbos

How to Tell if Your Python Mock Is Actually Working

A test can pass for the wrong reason. When you're mocking a third-party API call, the test might look green because the real API happened to return an error, not because your mock did anything at all.

This came up in a recent session in our agentic AI cohort where we were looking at a test to verify that converting to an invalid currency raised an exception. The test passed. But something felt off.

The test that passed for the wrong reason

The code under test calls the ExchangeRate API and raises CurrencyConversionError when the response signals failure:

def convert_currency(amount: Decimal, from_currency: str, to_currency: str) -> Decimal:
    if not EXCHANGE_RATE_API_KEY:
        raise UndefinedValueError("EXCHANGE_RATE_API_KEY must be set")
    if from_currency == to_currency:
        return amount
    response = requests.get(
        f"https://v6.exchangerate-api.com/v6/{EXCHANGE_RATE_API_KEY}/pair/{from_currency}/{to_currency}"
    )
    data = response.json()
    if data["result"] != "success":
        raise CurrencyConversionError(f"{data['error-type']}")
    return Decimal(data["conversion_rate"]) * amount

The test set up a mock_response, patched requests.get to return it (mock_get.return_value = mock_response), but configured it as a successful response:

mock_response.json.return_value = {
    "result": "success",   # <-- this will never raise CurrencyConversionError
    "conversion_rate": 1.5,
}

If the mock was intercepting, the function would return normally and pytest.raises would fail. But the test was passing. That meant either the mock wasn't intercepting at all and the real API was returning an error for "CTM", or the test was broken in a non-obvious way.

Proving the mock actually intercepted

My instinct was to add print("calling external api") before requests.get. That proves the code reached that line. It does not prove whether the mock intercepted the call or the real network was hit.

At this point you can put a breakpoint() in the actual requests.get code in your venv, but there is a better way: mock_get.assert_called_once():

with pytest.raises(CurrencyConversionError):
    convert_currency(
        amount=Decimal("1.00"),
        from_currency="CAD",
        to_currency="CTM",  # Canadian Tire Money — not a real currency
    )
mock_get.assert_called_once()

If the mock was never called, this assertion fails and tells you directly: your patch didn't intercept the request. If the mock was called, the assertion passes and you know for sure that the test is relying on the mock, not the real API.
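The behaviour of assert_called_once() can be seen in isolation with a bare Mock from the standard library. This is a small sketch, not the test from the session: mock_get here is a free-standing mock rather than a patched requests.get, and the URL is a placeholder.

```python
from unittest.mock import Mock

mock_get = Mock()

# Nothing has called the mock yet, so the assertion fails loudly.
try:
    mock_get.assert_called_once()
    untouched_mock_detected = False
except AssertionError:
    untouched_mock_detected = True

# After exactly one call, the same assertion passes silently.
mock_get("https://example.invalid/pair/CAD/CTM")
mock_get.assert_called_once()
```

The first branch is what you would see if the patch pointed at the wrong target: the code under test runs and the mock stays untouched.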

Running this revealed the mock was intercepting. But now pytest.raises failed with DID NOT RAISE. The mock response still signaled success, so nothing raised. Fixing it to signal an error made the test pass for the right reason:

mock_get.return_value.json.return_value = {
    "result": "error",
    "error-type": "unknown-code",
}

Two things to get right when patching

1. The patch target must match where the name is used, not where it's defined.

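The currency module does import requests and then calls requests.get(...), so the test patches expenses_ai_agent.utils.currency.requests.get. With a plain import requests, that path and requests.get both resolve to the same attribute on the same module object, so either target intercepts the call. The rule starts to bite when a module does from requests import get: the name is then bound inside the importing module, and patching requests.get leaves that binding untouched, so the real API is called. Patching the wrong location is a common reason a mock silently fails to intercept.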

2. Module-level variables need patching too.

EXCHANGE_RATE_API_KEY is loaded at import time:

EXCHANGE_RATE_API_KEY = config("EXCHANGE_RATE_API_KEY", default="")

The function checks if not EXCHANGE_RATE_API_KEY: before making any request. If a real key is in the environment, this check passes and you never get to verify the mock. Patch the module-level variable alongside requests.get:

mocker.patch("expenses_ai_agent.utils.currency.EXCHANGE_RATE_API_KEY", "test-key")

Or use pytest's monkeypatch fixture to override the environment variable before import:

monkeypatch.setenv("EXCHANGE_RATE_API_KEY", "test-key")

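This overrides the environment variable for the duration of the test, so when the module is imported and reads it, it gets "test-key" instead of the real key. It only helps if the module has not been imported yet: once EXCHANGE_RATE_API_KEY has been read at import time, changing the environment no longer affects it, which makes patching the module attribute the more reliable option.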

As a side note, state defined at module scope is a real source of side effects and makes code harder to maintain; see: Two Interesting Scoping Bugs That Made Me Reflect on Object Lifetimes.
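The "patch where the name is used" rule is easiest to see when a module binds a name with from ... import. The sketch below builds two throwaway modules in memory to show this; demo_provider and demo_consumer are hypothetical stand-ins, not part of the expenses project.

```python
import sys
import types
from unittest.mock import patch

# Hypothetical stand-in for a third-party library.
provider = types.ModuleType("demo_provider")
exec("def get():\n    return 'real'", provider.__dict__)
sys.modules["demo_provider"] = provider

# Hypothetical consumer that binds the name `get` at import time.
consumer = types.ModuleType("demo_consumer")
sys.modules["demo_consumer"] = consumer
exec(
    "from demo_provider import get\n\ndef fetch():\n    return get()",
    consumer.__dict__,
)

# Patching where the function is defined leaves the consumer's binding alone.
with patch("demo_provider.get", return_value="mocked"):
    wrong_target = consumer.fetch()

# Patching where the name is looked up intercepts the call.
with patch("demo_consumer.get", return_value="mocked"):
    right_target = consumer.fetch()
```

Here wrong_target is still the real return value, while right_target comes from the mock.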

The cleaned-up test with pytest-mock

Once the mock response was correct and interception was verified, the test got two more improvements. First, the intermediate mock_response variable is unnecessary — you can chain directly off mock_get.return_value:

mock_get.return_value.json.return_value = {
    "result": "error",
    "error-type": "unknown-code",
}

Second, pytest-mock (added with uv add --dev pytest-mock) replaces the nested with patch(...) context managers with a mocker fixture. The result is flatter and easier to scan. Annotated:

def test_bad_currency_conversion_raises(self, mocker):
    """Converting to a non-existing currency should raise an exception."""
    # Replace the module-level EXCHANGE_RATE_API_KEY so the guard
    # (if not EXCHANGE_RATE_API_KEY) doesn't abort before we reach requests.get
    mocker.patch("expenses_ai_agent.utils.currency.EXCHANGE_RATE_API_KEY", "test-key")
    # Patch requests.get *as imported inside the currency module* so no
    # real HTTP call is made; patch target must match where the name is used
    mock_get = mocker.patch("expenses_ai_agent.utils.currency.requests.get")
    # Simulate the API response for an unrecognised currency code
    mock_get.return_value.json.return_value = {
        "result": "error",
        "error-type": "unknown-code",
    }

    with pytest.raises(CurrencyConversionError):
        convert_currency(
            amount=Decimal("1.00"),
            from_currency="CAD",
            to_currency="CTM",
        )
    # Confirm the mock intercepted the call; if this fails, the real API was hit
    mock_get.assert_called_once()

mocker also handles teardown automatically via the fixture lifecycle, so you don't need with to ensure cleanup.

Another reason to mock: forcing a collision

So far the mock has stood in for a network call. That's not the only reason to reach for one. Here's a test from my simple CRM that stores contacts as files on disk:

def create_contact(
    name: str, email: str = "", company: str = "", product: str = ""
) -> str:
    contacts_dir().mkdir(parents=True, exist_ok=True)
    code = next_code(name)
    path = contact_path(code)
    if path.exists():
        raise FileExistsError(f"Contact {code} already exists")
    path.write_text(...)
    return code

next_code generates a unique code from the name. To test that creating two contacts with the same code raises FileExistsError, you need both calls to produce the same code. That's nondeterministic by design, so you patch next_code to pin it:

@patch("crm.data.next_code")
def test_cannot_create_contact_with_same_code(mock_next_code):
    mock_next_code.return_value = "jd1"
    data.create_contact("Jane Doe")
    with pytest.raises(FileExistsError):
        data.create_contact("Jane Doe")

Note the patch target again: crm.data.next_code, where the function is used. Same rule as before. And note that's the only mock here.

Isolation matters as much as the mock, but it doesn't belong in this test. An autouse fixture already points the data dir at a fresh tmp_path:

@pytest.fixture(autouse=True)
def crm_data(tmp_path, monkeypatch):
    monkeypatch.setenv("CRM_DATA", str(tmp_path))
    (tmp_path / "contacts").mkdir()
    return tmp_path

create_contact calls path.write_text(...), so the first call writes a real jd1 file. Because every test runs against a fresh tmp_path, that file lives only for the test: the collision can only come from the second call, nothing leaks between runs, and the test fails solely when the duplicate guard fires. Without that isolation, a leftover jd1 from a previous run makes the first call raise, pytest.raises still passes, and you've tested nothing.

Update: I later dropped this mock for dependency injection. Instead of patching next_code, I gave create_contact an optional code parameter (keyword-only, so it can't be passed by accident):

def create_contact(name: str, *, email: str = "", company: str = "",
                    product: str = "", code: str | None = None) -> str:
    ...
    code = code if code is not None else next_code(name)

The test pins the code through the public surface, no patching:

def test_cannot_create_contact_with_same_code():
    data.create_contact("Jane Doe", code="jd1")
    with pytest.raises(FileExistsError):
        data.create_contact("Jane Doe", code="jd1")

The trade-off is worth being honest about: I added a production parameter partly to make the test simpler. That's exactly the "test-induced design damage" that critics of mocking also warn about: a seam that exists only to serve tests. I think it's justified here because code doubles as a real feature: an explicit-code escape hatch for imports or restoring from backup. The test just happens to use it. If the parameter had been added only for the test, I'd consider keeping the mock.

Unit vs integration: where does this test belong?

All this then led to a related question:

How should you organize tests that hit real external services?

The convention that holds up in practice:

tests/
├── unit/        # fast, fully mocked, no network, no secrets
└── integration/ # slower, hits real DB / LLM / API endpoints

The currency test above belongs in unit/: it mocks requests.get and needs no real API key. A test that actually calls the ExchangeRate API to verify end-to-end behavior belongs in integration/.

A @pytest.mark.integration marker is a lighter-weight way to get the same split without moving files. Register it in pyproject.toml, then skip those tests in CI with pytest -m 'not integration'.
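Besides the pyproject.toml route mentioned above, pytest also lets you register a marker from conftest.py through the pytest_configure hook. This is a sketch of that alternative; the marker description text is an assumption.

```python
# conftest.py
def pytest_configure(config):
    """Register the custom marker so pytest does not warn about it."""
    config.addinivalue_line(
        "markers",
        "integration: test talks to a real external service",
    )
```

Either way, tests decorated with @pytest.mark.integration can then be deselected with pytest -m 'not integration'.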

Both work, but the directory structure makes the distinction obvious at a glance. Explicit is better than implicit.

The practical rule: if your test needs an environment variable or some external service to do its real work, it's an integration test. Mock that dependency out and it becomes a unit test. Or put it at the boundary so you can inject a fake in unit tests and the real thing in integration tests (if still needed).

For a practical example of test organization, see this video: Python Unit vs. Functional Testing: Understanding the Difference + Practical Example.

When mocks are the wrong tool

There's a broader point underneath all this. Every time you patch requests.get you're writing a test that's tightly coupled to one import path. Change import requests to from requests import get and every patch breaks. The tests test implementation, not behavior.

I highly recommend watching Harry Percival's PyCon talk "Stop Using Mocks". He makes the case for alternatives: build an adapter class that owns the external call, write a fake in-memory implementation of it, and use dependency injection to pass it in. The repository pattern is the same idea: your test passes in a fake, your production code passes in the real thing, and neither needs patching.
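The adapter-plus-fake approach can be sketched for the currency example as follows. This is an illustration under assumptions, not the project's code: RateProvider and FakeRateProvider are hypothetical names, and convert_currency takes the provider as an extra argument that the real function does not have.

```python
from decimal import Decimal
from typing import Protocol


class CurrencyConversionError(Exception):
    """Raised when a rate cannot be found (name mirrors the article)."""


class RateProvider(Protocol):
    """Interface the production adapter and the fake both satisfy."""

    def get_rate(self, from_currency: str, to_currency: str) -> Decimal: ...


class FakeRateProvider:
    """In-memory stand-in for the adapter that would call the real API."""

    def __init__(self, rates):
        self._rates = rates

    def get_rate(self, from_currency, to_currency):
        try:
            return self._rates[(from_currency, to_currency)]
        except KeyError:
            raise CurrencyConversionError("unknown-code") from None


def convert_currency(amount, from_currency, to_currency, rates: RateProvider):
    """Convert an amount using whichever provider is injected."""
    if from_currency == to_currency:
        return amount
    return rates.get_rate(from_currency, to_currency) * amount


fake = FakeRateProvider({("CAD", "USD"): Decimal("0.75")})
converted = convert_currency(Decimal("2.00"), "CAD", "USD", fake)
```

A test passes in the fake directly, so nothing depends on an import path; the production adapter would implement the same get_rate method around the HTTP call.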

Mocks are still the right choice here: we want to test one small unit whose only external dependency is well contained.


June 03, 2026 12:00 AM UTC

June 02, 2026


PyCoder’s Weekly

Issue #737: Polars 1.41, Email, Great Docs, and More (2026-06-02)

#737 – JUNE 2, 2026


Announcing Polars 1.41

Polars 1.41 is out and this post covers the new features it includes. Learn about faster parquet metadata decoding, nested subplan elimination, and more.
POLA.RS

Sending Emails With Python

Learn how to send emails with Python using SMTP, attach files, format HTML messages, and personalize bulk emails for your contact list.
REAL PYTHON

Quiz: Sending Emails With Python

Use Python’s standard library to send email through secure SMTP connections, attach files, include HTML content, and route replies.
REAL PYTHON

Your Coding Agent Gets Dumber the Longer It Runs. Here’s the Fix.

Coding agents degrade as context grows. The fix: a multi-role loop where the planner, builder, and reviewer each get isolated context — no stale assumptions, no compounding noise. A practical breakdown from someone who built it. Read the full breakdown
DEPOT sponsor

Great Docs

Talk Python interviews Rich Iannone and Michael Chow from Posit and they talk about a new Python documentation tool called Great Docs.
TALK PYTHON podcast

PyPy v7.3.23 Released

PYPY.ORG

Articles & Tutorials

Improving Python Through PEPs and Protocols

Have you ever been confused by the naming of modules you’re importing from a package? Is there a standard way to organize and name your Python virtual environments? This week on the show, Brett Cannon returns to discuss the Python Enhancement Proposals (PEPs) he’s been working on recently.
REAL PYTHON podcast

Tame Your Pesky Little Scripts

Over time it is common to accumulate little helper scripts, whether they’re shell scripts, aliases, or custom functions. They are typically tiny things that can become unwieldy to manage. This post shares a few ideas that might help you take back control.
JUHA-MATTI SANTALA

5-Day Live OOP Workshop (Final Chance to Enroll)

The Object-Oriented Python live cohort begins June 8. Five 2-hour sessions Mon to Fri build one growing application end to end, with OOP features introduced as the code starts needing them: classes, the data model, inheritance vs composition, properties, dataclasses.
REAL PYTHON sponsor

Free-Threading vs the GIL in mod_wsgi 6.0.0

Free-threading in mod_wsgi 6.0.0 lets a single process spread Python work across multiple cores. This post is a metrics based comparison between the GIL being enabled and disabled.
GRAHAM DUMPLETON

Notes About Python Email Packages

Chris recently upgraded his personal mail program from Python 2 to Python 3 and this post talks about what needed to change and notes how the newer code works.
CHRIS SIEBENMANN

Learning Path: Perfect Your Python Development Setup

Set up a Python development environment with VS Code, PyCharm, virtual environments, Git, pyenv, Docker, and AI coding tools like Claude Code and Cursor.
REAL PYTHON

Top 7 Python Libraries for Large-Scale Data Processing

This article covers Python libraries that make large-scale data processing faster, more scalable, and easier to manage across modern data workflows.
BALA PRIYA C

Connecting LLMs to Your Data With Python MCP Servers

Build an MCP server in Python that exposes tools, resources, and prompts so AI agents like Cursor can interact with your data.
REAL PYTHON course

How to Make a Scatter Plot in Python With plt.scatter()

Learn how to make scatter plots in Python with plt.scatter() and customize markers by size, color, shape, and transparency.
REAL PYTHON

Quiz: How to Make a Scatter Plot in Python With plt.scatter()

REAL PYTHON

Two Python Scoping Bugs: A Lesson in Object Lifetimes

Two Python bugs with opposite symptoms but the same root cause: picking the wrong scope for a stateful object.
BOB BELDERBOS

Sentinel Built-In

A quick post about Python 3.15’s new sentinel built-in.
RODRIGO GIRÃO SERRÃO

Projects & Code

dj-lite-tenant: Multi-Tenant SQLite Databases for Django

GITHUB.COM/ADAMGHILL

Lifeguard: Detect Lazy Imports Incompatibilities

GITHUB.COM/FACEBOOK

nbpipe: Run Sequences of Jupyter Notebooks as a Workflow

GITHUB.COM/NGAFAR

httpx2: A Next Generation HTTP Client for Python

GITHUB.COM/PYDANTIC

mkdocs-marimo: Mkdocs Plugin for Marimo

GITHUB.COM/MARIMO-TEAM

Events

Weekly Real Python Office Hours Q&A (Virtual)

June 3, 2026
REALPYTHON.COM

Canberra Python Meetup

June 4, 2026
MEETUP.COM

Sydney Python User Group (SyPy)

June 4, 2026
SYPY.ORG

GeoPython 2026

June 8 to June 11, 2026
GEOPYTHON.NET

PiterPy Meetup

June 9, 2026
PITERPY.COM

SciPy 2026, Minneapolis, MN

July 13-19, 2026
SCIPY.ORG • Shared by SciPy Organizers


Happy Pythoning!
This was PyCoder’s Weekly Issue #737.


June 02, 2026 07:30 PM UTC


Real Python

Structuring Your Python Script

You may have begun your Python journey interactively, exploring ideas within Jupyter Notebooks or through the Python REPL. While that’s great for quick experimentation and immediate feedback, you’ll likely find yourself saving code into .py files. However, as your codebase grows, knowing where things should go in your script becomes increasingly important.

Transitioning from interactive environments to structured scripts helps promote readability, enabling better collaboration and more robust development practices. This video course shows you the foundations of organizing a Python script: where the runnable bits go, how to arrange your imports, and how to refactor with constants and a fixed entry point.

By the end of this video course, you’ll know how to:

Without further ado, it’s time to start working through a concrete script and progressively shape it into well-organized, shareable code.



June 02, 2026 02:00 PM UTC


PyCharm

Top Agentic Frameworks for Building Applications 2026

In 2026, the world of AI is changing at a serious pace. The days of AI systems dealing solely in single-prompt interactions are coming to an end. Instead, these models are evolving into agentic systems – long-running, goal-driven software enabled by agentic frameworks that are becoming a critical layer in modern application architecture.

This rapid shift means that Python developers building autonomous systems are increasingly relying on agentic frameworks to manage reasoning, memory, tools, and collaboration among multiple agents.

You’ve probably already heard of some of the most popular frameworks. LangChain and AutoGen have risen to prominence, but there are dozens more, many of them open-source and only one to two years old. With so many frameworks promising different agentic capabilities, the real challenge is knowing which ones are best suited for the kind of application you want to build.

Let’s take a closer look at some of the most important agentic frameworks on the market in 2026, comparing what each does best and rating them based on our key comparison criteria to help you discover which is best for your projects.

What are AI agents?

An AI agent is a piece of software capable of autonomously reasoning, setting goals, and performing tasks on behalf of a user or another system. As the name suggests, AI agents have a level of agency to learn, adapt, and make decisions independently. This means they can improve their behavior and, over time, choose their own actions to achieve specific goals or outcomes.

AI agents work by following a perceive, reason, act, reflect (PRAR) cycle, which allows them to:

AI agents rely on the natural language processing capabilities of large language models, but unlike traditional LLMs and AI chatbots, they don’t require continuous user input to perform tasks. Agents are proactive, working autonomously to achieve a goal based on a specified set of rules and parameters.

What is an agentic framework?

An agentic framework provides the infrastructure needed to build, run, and control AI agents at scale. Most modern frameworks offer three core capabilities:

While it’s possible to build an agent without a framework, frameworks are vital in ensuring agents are reliable, scalable, and safe.

Agentic frameworks help turn experimental agent builds into maintainable software by facilitating:

Core orchestration paradigms

Before comparing individual frameworks, it’s important to understand how they operate. Let’s look at the three most commonly used orchestration models in 2026.

Graph-based orchestration

Graph-based orchestration provides maximum control by organizing agents and tools as nodes in a directed graph. Instead of letting an agent freely decide what to do next, the flow that agents are allowed to follow is clearly defined.
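The idea can be sketched in plain Python without any framework. In this illustrative example, the node functions, the NODES and EDGES tables, and run_graph are all hypothetical names, and the "agents" are ordinary functions standing in for model calls.

```python
END = "END"


def plan(state):
    state["plan"] = f"outline for {state['task']}"
    return state


def build(state):
    state["draft"] = f"draft based on {state['plan']}"
    state["attempts"] = state.get("attempts", 0) + 1
    return state


def review(state):
    # Stand-in for a reviewer agent: approve on the second attempt.
    state["approved"] = state["attempts"] >= 2
    return state


NODES = {"plan": plan, "build": build, "review": review}

# Each edge function inspects the state and names the next node.
EDGES = {
    "plan": lambda state: "build",
    "build": lambda state: "review",
    "review": lambda state: END if state["approved"] else "build",
}


def run_graph(start, state, max_steps=20):
    """Walk the graph from `start` until END, recording visited nodes."""
    node, visited = start, []
    for _ in range(max_steps):
        if node == END:
            return state, visited
        visited.append(node)
        state = NODES[node](state)
        node = EDGES[node](state)
    raise RuntimeError("graph did not reach END within max_steps")


final_state, path = run_graph("plan", {"task": "report"})
```

Because every transition is listed in EDGES, the only paths an agent can take are the ones written down, which is the control this model trades flexibility for.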

Strengths

Limitations

Role-based orchestration

Role-based orchestration is most effective when simplicity is a priority. Agents are assigned specific roles, such as “Planner”, “Researcher”, or “Builder”, and collaborate by sending messages to one another.
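A framework-free sketch of this message-passing style is shown below. The role functions, ROLES, and run_team are hypothetical, and each role is a plain function standing in for a model-backed agent.

```python
from collections import deque


def planner(message):
    return [("researcher", f"find sources for: {message}")]


def researcher(message):
    return [("builder", f"notes on ({message})")]


def builder(message):
    return []  # Final role: produces no further messages.


ROLES = {"planner": planner, "researcher": researcher, "builder": builder}


def run_team(first_role, first_message):
    """Deliver messages between roles until no messages remain."""
    inbox = deque([(first_role, first_message)])
    transcript = []
    while inbox:
        role, message = inbox.popleft()
        transcript.append((role, message))
        inbox.extend(ROLES[role](message))
    return transcript


transcript = run_team("planner", "agentic frameworks")
```

There is no central flow definition here: the order of work emerges from who each role chooses to message next.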

Strengths

Limitations

Chain-based orchestration

Chain-based orchestration, also known as adaptive orchestration, arguably offers the greatest flexibility. Agents in this model operate in dynamic chains or loops, deciding the next step autonomously.
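As a rough sketch of such a loop, the example below lets a policy function pick each step until it decides to finish. choose_next_step is a deterministic stand-in for what would normally be a model call, and all names are hypothetical.

```python
def choose_next_step(state):
    """Stand-in for a model call that picks the next action."""
    if "data" not in state:
        return "search"
    if "summary" not in state:
        return "summarize"
    return "finish"


TOOLS = {
    "search": lambda state: {**state, "data": ["a", "b"]},
    "summarize": lambda state: {
        **state,
        "summary": f"{len(state['data'])} items",
    },
}


def run_agent(state, max_steps=10):
    """Let the policy choose actions until it finishes or runs out of steps."""
    steps = []
    for _ in range(max_steps):
        action = choose_next_step(state)
        if action == "finish":
            return state, steps
        steps.append(action)
        state = TOOLS[action](state)
    raise RuntimeError("step budget exhausted")


result, steps = run_agent({"goal": "summarise findings"})
```

The step budget is the only hard limit on the loop, which shows both the flexibility of this model and why it is harder to predict.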

Strengths

Limitations

Best agentic frameworks for your projects

Now that we’re familiar with the key orchestration paradigms of agentic frameworks, it’s time to compare some of the most popular frameworks on the market in 2026. Below, we evaluate each framework’s performance against our key comparison criteria:

Framework | Orchestration model | Multi-agent support | Memory capabilities | HITL support | Best used for
LangChain | Chain-based | Partial | Moderate | Limited to moderate | Rapid LLM app development
LangGraph | Graph-based | Yes | Strong | Strong | Production-grade agent workflows
LlamaIndex | Retrieval-centric | Limited | Strong | Moderate | Knowledge-heavy agents
Haystack | Pipeline-based/modular | Moderate | Strong | Moderate | Production RAG and context-heavy AI systems
AutoGen | Role-based | Strong | Moderate | Limited | Conversational multi-agent systems
CrewAI | Role-based | Strong | Light | Limited | Task-oriented agent teams
Semantic Kernel | Planner-based | Moderate | Moderate | Strong | Enterprise AI
smolagents | Minimalist | Limited | Light | Minimal | Lightweight experiments
OpenAI Agents SDK | Graph-based | Yes | Managed | Strong | Hosted agent applications
Phidata | Agent-centric | Limited to moderate | Strong | Moderate | Data and tool-heavy agents

Let’s take a closer look at the strengths and weaknesses of each framework, along with the applications they’re most suited to.

LangChain

Launched in 2022, LangChain is one of the most widely adopted frameworks due to its broad ecosystem of integrations. It serves as an accessible interface for nearly any LLM and is an ideal starting point for enthusiasts or startups looking to explore agentic AI. While not strictly “agent-first”, it provides the building blocks for agentic behavior.

LangChain provides less control than other frameworks, but it’s still a fantastic entry point into agentic systems, especially for projects where speed and creativity take precedence over enforcing strict workflows.

Strengths

Limitations

Best applications

If you want to go beyond the basics, read our LangChain Python Tutorial: A Complete Guide for 2026. It takes a deeper look at what LangChain offers and walks through real-world use cases for building AI agents in Python.

LangGraph

LangGraph has emerged as the leading standard for production-grade agent systems. Built on top of LangChain, it replaces implicit chains with explicit graphs, providing strict control over workflows and excellent HITL support via interrupts.

While the graph structure itself can actually make debugging easier by clearly mapping how agents and tools interact, LangGraph does come with a learning curve. Much of this complexity comes from designing the graph and managing explicit state between nodes. Once you understand these concepts, the framework becomes a powerful option for building predictable and controllable agent systems.

Strengths

Limitations

Best applications

LlamaIndex

LlamaIndex is a Python framework designed to help AI systems understand, store, and retrieve information from large amounts of documents and data.

Rather than starting with agents and adding data later, LlamaIndex takes the opposite approach – it starts with data and then builds agent behavior around it. This is why it is often described as data-first or retrieval-centric.

Because it operates in this way, LlamaIndex excels at indexing, memory, and retrieval, making it ideal for building agents whose intelligence depends on accessing the right information rather than executing complex actions.

Strengths

Limitations

Best applications

Haystack

Haystack is an open-source AI orchestration framework created by deepset for building production-ready AI agents, retrieval-augmented generation (RAG) systems, and multimodal applications.

Instead of focusing purely on agent behavior, Haystack structures applications as explicit pipelines composed of retrievers, routers, memory layers, tools, evaluators, and generators. This modular architecture gives you control over how information flows through a system, allowing each component to be tested and improved independently.

Haystack is particularly strong in applications where the quality of retrieved information determines the quality of the model’s output. Its design also makes it well-suited for enterprise environments that require transparency and reliability in production systems.

Strengths 

Limitations 

Best applications

AutoGen

AutoGen, an open-source Microsoft framework, popularized the idea of agents collaborating through structured conversation, organizing systems as teams of agents, each with its own specific role. Unlike in other frameworks, there’s no central controller enforcing a strict execution path – the collaboration itself drives progress.

This approach makes AutoGen ideal for exploratory, creative, and research-driven multi-agent systems, at the cost of predictability, HITL, and strict execution control.

Strengths 

Limitations 

Best applications

CrewAI

CrewAI is centered around building simple, structured multi-agent systems. It is similar to AutoGen, modeling AI agents as members of a “crew” where each agent has a clearly defined role. The goal is to make multi-agent systems approachable, even if you are new to agentic AI.

CrewAI prioritizes simplicity and speed over deep memory and production controls, making it easy to learn and a strong option for prototypes and small teams. However, its limited toolset for observability, HITL, and error handling at scale makes it less suited for larger systems.

Strengths

Limitations

Best applications

Semantic Kernel

Semantic Kernel is another open-source Microsoft framework, designed for building AI-powered applications that integrate with existing enterprise systems.

It was created with production concerns in mind from the start, emphasizing governance, safety, observability, and human oversight. Rather than maximizing agent autonomy, it focuses on making AI predictable, controllable, and auditable.

By combining structured workflows with LLM reasoning, it trades flexibility and emergent behavior for trust, safety, and operational reliability.

Strengths

Limitations

Best applications

smolagents

smolagents is a bare-bones framework designed to make agentic AI as straightforward and transparent as possible. It prioritizes simple, readable code that makes it easy to understand how an agent works without needing to learn a large framework.

smolagents aims to make agent behavior accessible and easy to experiment with by keeping abstractions minimal and logic transparent. It offers first-class support for code-based and tool-calling agents, broad model and tool compatibility, and lightweight CLI utilities, while intentionally trading large-scale orchestration and production features for simplicity and clarity.

Strengths

Limitations

Best applications

OpenAI Agents SDK

Thanks to ChatGPT’s explosion in popularity, we’ve all heard of OpenAI. The Agents SDK is the company’s effort to provide a managed platform for building and running agents without having to maintain your own orchestration infrastructure.

Rather than assembling agents from scratch, you define agent behavior and workflows, while OpenAI provides orchestration, memory management, monitoring, and safety controls. This makes the Agents SDK particularly attractive for teams that want production-ready agents quickly.

Strengths

Limitations

Best applications

Phidata

Phidata is designed for building practical, tool-driven AI agents that operate on real-world data.

Rather than focusing on abstract orchestration patterns, Phidata centers the agent around direct interaction with systems such as APIs, databases, and internal services.

Its design reflects the fact that many agents spend most of their time fetching, transforming, and acting on data.

Strengths

Limitations

Best applications

Choosing the right framework

Now that you’re familiar with many of the most popular frameworks in 2026, it’s time to choose the right one for your project. Let’s take a look at some of the key use cases, along with the frameworks that fit them best.

Orchestration model | Where to use | Recommended frameworks
Graph-based | Projects involving complex branching logic and requiring high levels of reliability, auditability, and control. | LangGraph, OpenAI Agents SDK
Role-based | Projects involving rapid development and intuitive design that benefit from emergent collaboration between agents. | AutoGen, CrewAI
Chain-based | Projects requiring maximum flexibility, where agents need to adapt dynamically and determine next steps autonomously. | LangChain
Retrieval-based | Projects where deep, reliable access to knowledge matters more than high levels of autonomy. | LlamaIndex, Haystack
Enterprise-oriented | Projects where strong governance and human-in-the-loop processes are non-negotiable requirements. | Semantic Kernel
Lightweight | Rapid prototyping, educational use, and simple local agents where transparency and control matter more than orchestration complexity. | smolagents
Tool-centric | Building production agents that primarily interact with APIs, databases, and external systems rather than complex multi-step orchestration. | Phidata

In 2026, agentic frameworks have evolved from experimental tools into foundational infrastructure for many applications. The key decision is no longer whether to use agents, but how much control, autonomy, and governance your systems require.

June 02, 2026 12:12 PM UTC


Real Python

Quiz: Python's Format Mini-Language for Tidy Strings

In this quiz, you’ll test your understanding of Python’s Format Mini-Language for Tidy Strings.

By working through this quiz, you’ll revisit how format specifiers work inside f-strings and str.format(), including alignment and width fields, decimal precision, type representations, thousand separators, sign handling, dynamic specifiers, and percentage formatting.


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

June 02, 2026 12:00 PM UTC

Quiz: Structuring Your Python Script

In this quiz, you’ll test your understanding of the video course Structuring Your Python Script.

By working through this quiz, you’ll revisit how to make a Python script executable with a shebang, organize your imports per PEP 8, automatically sort imports with ruff, and define a clear entry point using if __name__ == "__main__".

These habits help you transition from quick experiments in the REPL to writing Python scripts that are easy to read, share, and grow.



June 02, 2026 12:00 PM UTC


Python Software Foundation

No Starch Press Humble Bundle: Grab a Deal and Support the PSF!

Curious about leveling up your Python skills, or just getting your feet wet? Pick up a whole set of solid Python books at a great price and support the Python Software Foundation (PSF) at the same time!

No Starch Press, an indie tech-book publisher and longtime supporter of the PSF, just announced a new Python-themed Humble Bundle. Grab ‘Python: The Good Stuff by No Starch’ and pay what you want for all-Python DRM-free ebook titles for Python beginners to pros. A share of the proceeds from the bundle goes to the PSF! This bundle runs now through June 18th, 2026, so make sure to grab it and share the link with your friends.

‘Python: The Good Stuff by No Starch’ includes 15 titles for $36 USD ($583 value 🫨), including Automate the Boring Stuff with Python, 3rd Edition (Al Sweigart), Python Crash Course, 3rd Edition (Eric Matthes), and Practical Deep Learning (Ronald T. Kneusel).

Humble Bundle Pro Tips: 


Make sure to grab this awesome bundle of Python books for yourself (or a friend!), and help support the PSF. Thank you, No Starch and Humble Bundle, for making Python education more accessible and supporting the PSF. Happy reading, everyone!

About the Python Software Foundation

The Python Software Foundation is a US non-profit whose mission is to promote, protect, and advance the Python programming language, and to support and facilitate the growth of a diverse and international community of Python programmers. The PSF supports the Python community using corporate sponsorships, grants, and donations. Are you interested in sponsoring or donating to the PSF so we can continue supporting Python and its community? Check out our sponsorship program, donate directly, or contact our team at sponsors@python.org!

June 02, 2026 07:21 AM UTC


Tryton News

Tryton News June 2026

In the last month we focused on fixing bugs, improving behaviour, and addressing performance issues, building on the changes from our last release. We also added some new features which we would like to introduce to you in this newsletter.

For an in depth overview of the Tryton issues please take a look at our issue tracker or see the issues and merge requests filtered by label.

Changes for the User

Accounting, Invoicing and Payments

We now add an optional journal column on the invoice list view.

Now we add a relate from the period and fiscal year to the invoice model, so that invoices can be exported or printed per period.

We add a delay to the PEPPOL e-document rendering and processing for each service. This allows payments recorded after posting an invoice to be included when the UBL invoice is rendered.

We now raise a generic user error message when failing to parse an imported AEB43 account statement.

Stock, Production and Shipments

Now we can manage products directly in the category form. We think it is better not to have dedicated views at all, and instead to make sure that such large Many2Many fields can be managed (also with #14782 (closed)).

Now we let Tryton calculate average lead time for product suppliers based on the effective date of incoming stock moves and the purchase date of the last year.

Parties

Now we make Tryton try to guess the type of contact mechanism when changing value for the standardised types like email, phone, mobile and URL.

User Interface

We now use the search dialogue popup window for deleting records in One2Many or removing records from Many2Many widgets. The remove (delete) button shows a search popup when no records are selected or when more than 20 records are selected. The same records are preselected in the search popup. Users can refine the search using the filter and the sort order of the popup. Once the popup is validated, the selected records are removed (deleted) from the X2Many field.

We now display the number of records being deleted in the confirmation message. We think it helps the user to realise that they are deleting many records.

Now we allow users to mark notifications as read.

System Data and Configuration

Now we support the country organization (like EU, ASEAN, …) as a criterion for tax rules.

New Releases

We released bug fixes for the currently maintained long term support series 8.0 and 7.0, and for the penultimate series 7.8.

There are no new releases for the 6.0 and 7.6 series, as they have entered their end of life period.

Changes for the System Administrator

We now remove the dependencies to pytz and backports.entry-points-selectable.

Now we update the version of Stripe to 2026-04-22.dahlia.

Changes for Implementers and Developers

We now add support for the age-functionality to SQLite. The age-function returns a time interval instead of an integer (of days) when calculating duration between dates.

Authors: @pokoli @udono


June 02, 2026 06:00 AM UTC


Python Insider

Python 3.15.0 beta 2 is here!

The antepenultimate 3.15 beta is out!

June 02, 2026 12:00 AM UTC

June 01, 2026


The No Title® Tech Blog

Just updated - both Optimize Images and Optimize Images X

This release represents a significant milestone for both Optimize Images and Optimize Images X, marking a coordinated step forward in modernization, dependency cleanup, and internal architecture improvements across the ecosystem.

June 01, 2026 09:40 PM UTC


death and gravity

DynamoDB crash course: part 3 – design patterns

Previously

This is the last part of a series covering core DynamoDB concepts. The goal is to help you understand idiomatic usage and trade-offs in under an hour.

In the first part, I summarized DynamoDB's main proposition to its users like so:

data modeling complexity is always preferable to complexity coming from infrastructure maintenance, availability, and scalability

Today, we're looking at the design patterns that help manage this complexity, making the most of its data model and features and working around its limits.

Contents

Composite keys

Composite (aka synthetic) keys underpin most other patterns.

The idea is simple: keys don't have to be natural attributes of your data, they can be composed of other attributes that enable specific access patterns. This works both with table and index keys.

How do you compose keys? By string concatenation, of course! Careful with numbers though, they need padding to be useful in sort keys.
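The padding caveat is easy to see in a small sketch. It assumes a hypothetical track number as the second key component and an arbitrary width of three digits; neither comes from the examples in this article. Sort keys compare as strings, so unpadded numbers end up in lexicographic rather than numeric order:

```python
def track_sort_key(album: str, track_number: int, width: int = 3) -> str:
    """Compose a sort key like 'Leaving Home#002' (hypothetical format)."""
    return f"{album}#{track_number:0{width}d}"


unpadded = sorted(f"Leaving Home#{n}" for n in (1, 2, 10))
padded = sorted(track_sort_key("Leaving Home", n) for n in (1, 2, 10))

print(unpadded)  # string order puts track 10 before track 2
print(padded)    # zero-padded keys sort in numeric order
```

The width has to be chosen up front: once items are written with three digits, a track number above 999 would break the ordering again.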

Example

To sort lexicographically by more than one attribute, you group them in a sort key, e.g. {Album}#{Song}.

Or, in single table design, you distinguish between item types by prefixing keys with the type, e.g. album#{Album}.

Or, in partition key sharding, you spread the load on a GSI partition by splitting one partition key into multiple ones, e.g. {Genre}#{shard}.

But denormalization has its trade-offs. For sort key {Album}#{Song}, should Album and Song also be separate attributes? If yes, you need to ensure they never change, but you can use them in indexes (e.g. a GSI with Album as primary key). If no, items can't become inconsistent, but you need to parse the key to get them.

This was inconvenient enough that DynamoDB finally added multi-attribute keys support to GSIs in 2025 (although not inconvenient enough to also add it to tables).

See also

Single table design

The AWS guidance is to use as few tables as possible:

As a general rule, you should maintain as few tables as possible in a DynamoDB application. [...] A single table with inverted indexes can usually enable simple queries to create and retrieve the complex hierarchical data structures required by your application.

This culminates in single table design, where you put all entities in the same table, and tell them apart based on the key format, usually using a prefix. With this pattern, one DynamoDB table corresponds to a whole relational database.

The easiest way is to put items related to a top-level entity on the same partition. The main benefit is that joins with the top-level entity become trivial. A second one is that you can sometimes get different entity types in a single query, which can be both faster and cheaper (fewer queries; small items pack into fewer capacity units).

Example

You can group items related to an Artist on the same partition, with sort keys like artist, album#{Album}, and song#{Album}#{Song}.

# table Music (partition key: Artist, sort key: sk)
Solar Fields: !btree
  'album#Leaving Home': { Genre: Electronic }
  'artist': { Variations: [ Solarfields ] }
  'song#Leaving Home#Air Song': { Duration: 741 }
  'song#Leaving Home#Monogram': { Duration: 944 }

Besides getting items of a single type, you can also get artist details and albums in a single query (sk BETWEEN "album#" AND "artist").
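To see why that range condition returns exactly the artist details and the albums, here is a rough in-memory sketch. It only imitates how items on one partition are ordered by sort key; `query_between` is a made-up helper, not a DynamoDB call:

```python
from bisect import bisect_left, bisect_right

# Items on the 'Solar Fields' partition, kept sorted by sort key,
# mirroring the example table above.
partition = sorted([
    ("album#Leaving Home", {"Genre": "Electronic"}),
    ("artist", {"Variations": ["Solarfields"]}),
    ("song#Leaving Home#Air Song", {"Duration": 741}),
    ("song#Leaving Home#Monogram", {"Duration": 944}),
])
sort_keys = [sk for sk, _ in partition]


def query_between(low: str, high: str) -> list:
    """Return items with low <= sk <= high (BETWEEN is inclusive)."""
    start = bisect_left(sort_keys, low)
    end = bisect_right(sort_keys, high)
    return partition[start:end]


artist_and_albums = query_between("album#", "artist")
print([sk for sk, _ in artist_and_albums])
```

The trick works because every `album#...` key sorts before `artist`, and `artist` sorts before every `song#...` key, so the two item types are adjacent in the partition.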

But choose wisely – queries can have only one sort key condition, so you can't also get album details and songs in a single query with this schema; sort keys {Album} and {Album}#{Song} would do it, at the expense of the first query.

Sometimes, it can be useful to put some sub-entities on dedicated partitions, accepting that joins will have to be done in code.

Example

In the example above, a popular artist with lots of songs can lead to:

Perhaps it's better to put the songs in each album on separate partitions:

# table Music (partition key: pk, sort key: sk)
'artist#Solar Fields': !btree
  'album#Leaving Home': { Genre: Electronic }
  'artist': { Variations: [ Solarfields ] }
'song#Solar Fields#Leaving Home': !btree
  'Air Song': { Duration: 741 }
  'Monogram': { Duration: 944 }

This spreads the load onto multiple partitions, which should fix throttling.

The downside is that list songs for artist is now a two-step operation: first one query for the albums, then one query per album for the songs. The upside is that the per-album queries can be done in parallel, which wasn't possible before.

A consequence of this design is that you need a GSI to list items of a specific type (otherwise, you have to do a full table scan). Of note, exceeding the GSI partition throughput limit will cause write throttling on the base table; in the absence of a natural high-cardinality GSI partition key, sharding or some other composite key can help.

A final benefit of using a single table is better utilization with provisioned mode: usage gets averaged across entities and tends to be smoother, and spikes can share the same spare capacity.

See also

GSI overloading

GSI overloading is just single table design for indexes – you put different values in the GSI key attributes, depending on item type. This way you can index more attributes than the 20 GSIs per table quota, and it can be cheaper too, since, like with tables, fewer indexes make better use of spare provisioned capacity.

Example

For a table that contains both artist and album items, a single GSI can be used for entirely different purposes:

# table Music (partition key: Artist, sort key: sk)
2 Bit Pie: !btree
  'album#2 Pie Island': { gsi1pk: 'album#Electronic' }
  'artist': { gsi1pk: 'artist#United Kingdom' }
Ishome: !btree
  'album#Confession': { gsi1pk: 'album#Electronic' }
  'artist': { gsi1pk: 'artist#Russia' }
# GSI GSI1 (partition key: gsi1pk, sort key: Artist)
'artist#United Kingdom': !btree
  2 Bit Pie: { sk: 'artist' }
'artist#Russia': !btree
  Ishome: { sk: 'artist' }
'album#Electronic': !btree
  2 Bit Pie: { sk: 'album#2 Pie Island' }
  Ishome: { sk: 'album#Confession' }

See also

Partition key sharding

Sometimes, a partition key composed of multiple natural attributes is not enough to spread the load evenly across partitions; you can deal with this by putting items with the same natural attributes on multiple partitions.

So, what partition key should you use? One option is to use a random suffix from a known range; this allows you to list items for a natural attribute value by doing multiple queries, one for each suffix.

Example

For a table of songs, using Album as the partition key won't work, since not all songs are released on an album; Artist always has a value, but some artists have hundreds or even thousands of songs, which can lead to throttling.

Instead, we can use {Artist}#{randrange(10)} as partition key, which allows ten times as many items before we reach throughput limits. To list an artist's songs:

def list_songs(artist):
    # one query per shard suffix (dynamodb.query is pseudocode, not a real client call)
    for shard in range(10):
        yield from dynamodb.query(f"{artist}#{shard}")

A downside of random suffixes is that you can't get a specific item, because you don't know what its suffix is. A better option is to calculate the suffix from an attribute that you do know, for example using its hash modulo N.

Example

With primary key {Artist}#{hash(Song) % 10}, we can get a song like this:

from hashlib import sha256

def hash(s):
    # stable across processes, unlike the built-in hash() for strings
    return int.from_bytes(sha256(s.encode()).digest(), "big")

shard = hash(song_title) % 10
dynamodb.get_item(f"{artist}#{shard}", song_title)

A lot of times you need to list items by a low-cardinality attribute, so sharding may be even more important for GSIs.

Example

Assuming dedicated album items, you can list all the albums by putting them in a single GSI partition key called albums, but this will definitely cause throttling.

To avoid it, you can use GSI partition key album#{hash(Album) % 100} if you don't care about the order, or something like album#{Album[:2].lower()} if you do (but likely more sophistication is needed – th will be a very common album title prefix, and some album titles don't contain letters at all).

Even if throttling is not an issue (e.g. single infrequent reader), sharding allows you to query multiple partitions in parallel, which can speed up getting the entire result set.
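As a minimal sketch of the parallel variant, the shard queries can be fanned out with a thread pool. Everything here is a stand-in: `FAKE_TABLE` and `query_partition` are hypothetical placeholders for a real table and a real Query call, and the shard count of 10 is simply carried over from the examples above:

```python
from concurrent.futures import ThreadPoolExecutor

NUM_SHARDS = 10  # assumed shard count, matching the examples above

# Stand-in for a table: partition key -> items on that partition.
FAKE_TABLE = {
    "Solar Fields#3": ["Air Song"],
    "Solar Fields#7": ["Monogram"],
}


def query_partition(partition_key: str) -> list:
    """Hypothetical stand-in for one Query call against a real table."""
    return FAKE_TABLE.get(partition_key, [])


def list_songs(artist: str) -> list:
    keys = [f"{artist}#{shard}" for shard in range(NUM_SHARDS)]
    with ThreadPoolExecutor(max_workers=NUM_SHARDS) as pool:
        # map() yields results in shard order even though the calls overlap
        results = pool.map(query_partition, keys)
        return [item for items in results for item in items]


print(list_songs("Solar Fields"))
```

With a real client, each call would also need to follow pagination, so the per-shard function is the natural place to put that loop.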


So, how many shards should you have? That depends on the number and size of the items and on how often you access them, and it is also a trade-off – too many shards means additional queries and latency, too few shards means you still overload the partitions sometimes.

Importantly, increasing the number of shards is non-trivial. For tables, you usually need to rebalance the items in place. For indexes, it's cleaner to move to a new index, or if you just need to list items by type, you can put all new items on new shards.

Regardless, you have to support it in code, do a backfill, and orchestrate the migration, which all become more complex if downtime and inconsistencies are not acceptable (e.g. if you expose a pagination token based on LastEvaluatedKey, you may want to support both versions during the switch).

See also

Sparse indexes

An item with missing index partition/sort key attributes won't appear in the index, and you won't pay for it. This can be used deliberately to query a subset of the items in the table, like those of a specific type or in a specific state.
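The effect can be imitated in a few lines of plain Python. This is only a sketch of the rule, not of how DynamoDB maintains indexes; the `AlbumTitle` attribute and the item layout are assumptions made up for the illustration:

```python
items = [
    {"pk": "artist#Solar Fields", "sk": "artist"},
    {"pk": "artist#Solar Fields", "sk": "album#Leaving Home", "AlbumTitle": "Leaving Home"},
    {"pk": "song#Solar Fields#Leaving Home", "sk": "Air Song"},
]


def project_into_index(table_items, index_key):
    """Mimic index population: items lacking the key attribute are skipped."""
    return [item for item in table_items if index_key in item]


album_index = project_into_index(items, "AlbumTitle")
print(len(album_index))  # only the album item carries AlbumTitle
```

Removing the attribute from an item is then enough to drop it from the index, which is what makes the pattern useful for state-based subsets.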

Example

Assuming dedicated album items, an alternative way to list all the albums is to have a GSI with {Album} as partition key, and just scan the entire index (the primary key has to be a dedicated attribute that only albums have, so that only album items appear in the index).

Or, you can use a dedicated GSI with CoverOf as primary key to list cover songs.

See also

Base table indexes

In some cases, GSIs won't cut it – maybe you need a strongly consistent index, or need to model a many-to-one relationship (indexes map one item in the base table to one item in the index).

Instead, you can maintain an index in the base table by having additional index items associated with the main item; to guarantee atomic updates, use transactions. You then go from the main item to the index items via a main item attribute, and from the index items to the main item via their partition key.

Example

Songs have different identifiers in external systems, such as ISRC, ISWC, or MBID. To query songs by multiple external ids, you'd structure your database like this:

(Alternatively, you could have one sparse index per external id type, but then you lose strong consistency, and risk running out of GSIs).

Note that modeling one-to-many relationships isn't this involved, since it fits neatly into the related-items-same-partition variant of single table design.
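One possible shape for the base table index pattern is sketched below. It only builds the request items; the resulting list is in the format that boto3's `transact_write_items` accepts as `TransactItems`. The key layout (`pk`/`sk` names, the `song#` and `extid#` prefixes, the `ExternalIds` and `SongKey` attributes) and the sample ids are assumptions for illustration, not the layout from this article:

```python
def build_song_transaction(table, artist, song, external_ids):
    """Build TransactItems writing a song plus one index item per external id.

    Hypothetical layout; assumes at least one external id (sets can't be empty).
    """
    main_pk = f"song#{artist}#{song}"
    id_keys = sorted(f"extid#{kind}#{value}" for kind, value in external_ids.items())
    main_item = {
        "pk": {"S": main_pk},
        "sk": {"S": "song"},
        # main item -> index items
        "ExternalIds": {"SS": id_keys},
    }
    transact_items = [{"Put": {"TableName": table, "Item": main_item}}]
    for id_key in id_keys:
        index_item = {
            "pk": {"S": id_key},
            "sk": {"S": "extid"},
            # index item -> main item
            "SongKey": {"S": main_pk},
        }
        transact_items.append({
            "Put": {
                "TableName": table,
                "Item": index_item,
                # fail the whole transaction if this id is already taken
                "ConditionExpression": "attribute_not_exists(pk)",
            }
        })
    return transact_items


request = build_song_transaction(
    "Music", "Solar Fields", "Air Song", {"isrc": "X1", "mbid": "Y2"}
)
print(len(request))  # one main item plus two index items
```

Because all puts go into one transaction, either the song and all of its index items are written, or none are.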

See also

Optimistic locking

Optimistic locking is a concurrency control method useful when conflicts are rare, so instead of acquiring a lock to make changes, you check if someone else changed the data right before committing, as part of an atomic operation.

In DynamoDB, that operation is a conditional write; items get an integer version attribute, and every time you want to update an item, you:

  1. read the item, including the version
  2. increment the version and modify the item
  3. update the item, using a condition expression to ensure the version matches
    1. if successful, you're done
    2. else, start over from the beginning
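The steps above can be sketched with an in-memory stand-in for the table. `FakeTable`, `put_if_version`, and the `max_attempts` cap are all hypothetical names introduced for this illustration; a real implementation would use a conditional write instead:

```python
class ConditionFailed(Exception):
    """Raised when the stored version no longer matches the expected one."""


class FakeTable:
    """In-memory stand-in for a table that supports conditional writes."""

    def __init__(self):
        self.items = {}

    def get(self, key):
        return dict(self.items[key])  # copy, like a fresh read

    def put_if_version(self, key, item, expected_version):
        if self.items[key]["version"] != expected_version:
            raise ConditionFailed(key)
        self.items[key] = dict(item)


def update_with_retry(table, key, modify, max_attempts=5):
    for _ in range(max_attempts):
        item = table.get(key)            # 1. read the item, including the version
        expected = item["version"]
        modify(item)                     # 2. modify the item...
        item["version"] = expected + 1   #    ...and increment the version
        try:
            table.put_if_version(key, item, expected)  # 3. conditional write
            return item
        except ConditionFailed:
            continue                     # someone else won; start over
    raise RuntimeError("too many conflicting updates")


table = FakeTable()
table.items["song"] = {"version": 1, "plays": 0}
updated = update_with_retry(
    table, "song", lambda item: item.update(plays=item["plays"] + 1)
)
print(updated)
```

Capping the number of attempts is a design choice of this sketch, so that a persistently contended item surfaces as an error rather than an endless loop.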

You can also do this in transactions to update groups of related items, like in the base table index pattern above, with only the main item needing a version.

The upside of optimistic locking is that it is faster on average, since updates usually succeed on the first try; for fewer conflicts, use strongly consistent reads.

The downside is that it requires explicit support – it must be possible to start over from the beginning, which complicates logic, especially if you need to interact with other systems besides updating the item (e.g. to send a notification).

See also


Anyway, that's it for now.

See also

For more details and examples, check out the official documentation:


June 01, 2026 03:00 PM UTC


Real Python

Python sleep(): How to Add Time Delays to Your Code

Sometimes you need to make Python sleep, wait, or pause before running the next line of code. Whether you’re spacing out API requests, pacing a thread, or adding a delay to terminal output, Python’s time.sleep() function is the standard tool:

Language: Python
from time import sleep
sleep(3)  # Pause execution for 3 seconds

Beyond time.sleep(), Python provides different ways to add time delays depending on the context, including threads, async code, and GUI applications.

By the end of this tutorial, you’ll understand that:

  • time.sleep() suspends execution for a given number of seconds, including fractional values like milliseconds.
  • Retry decorators use time.sleep() to add a delay between failed attempts.
  • Event.wait() is the preferred way to add delays in threads because it can be interrupted cleanly.
  • asyncio.sleep() pauses a single coroutine without blocking the rest of your async code.
  • GUI frameworks like Tkinter provide scheduling methods such as .after() to avoid freezing the event loop.

The following sections cover each of these approaches with working code examples.


Pause Execution With Python sleep()

Python has built-in support for making your program wait. The time module has a sleep() function that you can use to add a delay by suspending execution of the calling thread for the number of seconds you specify:

Language: Python
>>> import time
>>> time.sleep(3)  # Sleep for 3 seconds

Here’s a quick example of time.sleep() in action:

Language: Python Filename: coffee.py
import time

print("Brewing coffee...")
print("This would take like 3 secs...")
time.sleep(3)
print("Done! Your coffee is ready!")

If you run this script, you’ll see a three-second pause between the messages while time.sleep() suspends execution.

You can also pass fractional seconds to time.sleep() for finer-grained durations. Here are some common values:

Language: Python
import time

time.sleep(0.5)  # Wait 500 milliseconds
time.sleep(0.001)  # Wait 1 millisecond
time.sleep(1.5)  # Wait 1.5 seconds
time.sleep(60)  # Wait 1 minute

The time.sleep() function isn’t perfectly precise. The specified value acts as a minimum delay. The actual pause will almost always be slightly longer in practice due to operating system scheduler overhead and current system load.
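A quick way to observe the overshoot yourself is to time a single call with time.perf_counter(). This is a small sketch; the 0.05-second delay is an arbitrary value chosen for the demo, and the exact overshoot depends on your operating system and load:

```python
import time

requested = 0.05  # seconds; an arbitrary small value for the demo

start = time.perf_counter()
time.sleep(requested)
elapsed = time.perf_counter() - start

overshoot_ms = (elapsed - requested) * 1000
print(f"requested {requested:.3f}s, slept {elapsed:.4f}s ({overshoot_ms:+.2f} ms)")
```

Running it a few times should show the measured duration landing at or slightly above the requested value.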

You can test how long the sleep lasts by using Python’s timeit module:

Language: Shell
$ python -m timeit -n 3 "import time; time.sleep(3)"
3 loops, best of 5: 3 sec per loop

Here, you run the timeit module with the -n parameter, which tells timeit how many times to run the statement per repeat. With the default of five repeats, the statement runs 15 times in total (3 × 5). timeit then reports the best time across all repeats, which is three seconds per loop, as expected.

For a more realistic example, say you need to monitor whether a website is up. You want to check its status code periodically, but querying the server too often could overload it or get you rate-limited. You can use time.sleep() to space out the checks:

Language: Python Filename: uptime_bot.py
import time
import urllib.request
import urllib.error

CHECK_INTERVAL = 60  # Seconds between checks

def uptime_bot(url):
    while True:
        try:
            urllib.request.urlopen(url)
        except urllib.error.HTTPError as e:
            # Email admin or log
            print(f"HTTPError: {e.code} for {url}")
        except urllib.error.URLError as e:
            # Email admin or log
            print(f"URLError: {e.reason} for {url}")
        else:
            # Website is up
            print(f"{url} is up")
        time.sleep(CHECK_INTERVAL)

if __name__ == "__main__":
    url = "https://www.google.com/py"
    uptime_bot(url)

Read the full article at https://realpython.com/python-sleep/ »



June 01, 2026 02:00 PM UTC

Quiz: Regular Expressions: Regexes in Python (Part 1)

In this quiz, you’ll test your understanding of Regular Expressions: Regexes in Python (Part 1).

By working through this quiz, you’ll revisit how to use the re module to search for patterns, build character classes and anchors, group and capture substrings, and apply flags like re.IGNORECASE to control matching behavior.



June 01, 2026 12:00 PM UTC