
Planet Python

Last update: March 05, 2026 07:43 PM UTC

March 05, 2026


Real Python

Quiz: How to Use the OpenRouter API to Access Multiple AI Models via Python

In this quiz, you’ll test your understanding of How to Use the OpenRouter API to Access Multiple AI Models via Python.

By completing this quiz, you’ll review how OpenRouter provides a unified routing layer, how to call multiple providers from a single Python script, how to switch models without changing your code, and how to compare outputs.

It also reinforces practical skills for making API requests in Python, handling authentication, and processing responses. For deeper guidance, review the tutorial linked above.


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

March 05, 2026 12:00 PM UTC


scikit-learn

Update on array API adoption in scikit-learn

Author: Lucy Liu

Note: this blog post is a cross-post of a Quansight Labs blog post.

The Consortium for Python Data API Standards developed the Python array API standard to define a consistent interface for array libraries, specifying core operations, data types, and behaviours. This enables ‘array-consuming’ libraries (such as scikit-learn) to write array-agnostic code that can run on any array API compliant backend. Adopting array API support in scikit-learn means that users can pass arrays from any array API compliant library to functions that have been converted to be array-agnostic. This is useful because it allows users to take advantage of array library features, such as hardware acceleration, most notably via GPUs.

Indeed, GPU support in scikit-learn has been of interest for a long time - 11 years ago, we added an entry to our FAQ page explaining that we had no plans to add GPU support in the near future due to the software dependencies and platform specific issues it would introduce. By relying on the array API standard, however, these concerns can now be avoided.

In this blog post, I will provide an update on the array API adoption work in scikit-learn since its initial introduction in version 1.3 two years ago. Thomas Fan’s blog post provides details on the status when array API support was initially added.

Current status

Since the introduction of array API support in version 1.3 of scikit-learn, several key developments have followed.

Vendoring array-api-compat and array-api-extra

Scikit-learn now vendors both array-api-compat and array-api-extra. array-api-compat is a wrapper around common array libraries (e.g., PyTorch, CuPy, JAX) that bridges gaps to ensure compatibility with the standard. It enables adoption of backwards incompatible changes while still allowing array libraries time to adopt the standard slowly. array-api-extra provides array functions not included in the standard but deemed useful for array-consuming libraries.
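The array-agnostic style this enables can be sketched as follows. This is a minimal illustration, not scikit-learn’s actual helper: it looks up the array’s namespace via the standard’s `__array_namespace__` protocol (available on NumPy 2.x arrays, PyTorch tensors, and other compliant types) and falls back to NumPy for arrays predating the standard:

```python
import numpy as np

def get_namespace(x):
    # Arrays implementing the standard expose __array_namespace__;
    # older NumPy arrays fall back to the numpy module itself.
    return getattr(x, "__array_namespace__", lambda: np)()

def standardize(x):
    # Array-agnostic: every operation goes through the namespace object,
    # so the same code runs on any standard-compliant array type.
    xp = get_namespace(x)
    return (x - xp.mean(x)) / xp.std(x)
```

Because only namespace functions are used, the same `standardize` works unchanged whether `x` is a NumPy array on CPU or a GPU-backed array from a compliant library.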

We chose to vendor these libraries, which are now much more mature, to avoid the complexity of conditionally handling optional dependencies throughout the codebase. This approach also follows precedent: SciPy vendors these packages as well.

Array libraries supported

Scikit-learn currently supports CuPy ndarrays, PyTorch tensors (tested against all devices: ‘cpu’, ‘cuda’, ‘mps’ and ‘xpu’) and NumPy arrays. JAX support is also on the horizon; the main focus of that work is addressing in-place mutations in the codebase. Follow PR #29647 for updates.

Beyond these libraries, scikit-learn also tests against array-api-strict, a reference implementation that strictly adheres to the array API specification. The purpose of array-api-strict is to help automate compliance checks for consuming libraries and to enable development and testing of array API functionality without the need for GPU or other specialized hardware. Array libraries that conform to the standard and pass the array-api-tests suite should be accepted by scikit-learn and SciPy, without any additional modifications from maintainers.

Estimators and metrics with array API support

The full list of metrics and estimators that now support the array API can be found in our Array API support documentation page. The majority of high-impact metrics have now been converted to be array API compatible. Many transformers are also now supported, notably LabelBinarizer, which is widely used internally and simplifies other conversions.

Conversion of estimators is much more complicated as it often involves benchmarking different variations of code or consensus gathering on implementation choices. It generally requires many months of work by several maintainers. Nonetheless, support for LogisticRegression, GaussianNB, GaussianMixture, Ridge (and family: RidgeCV, RidgeClassifier, RidgeClassifierCV), Nystroem and PCA has been added. Work on GaussianProcessRegressor is also underway (follow at PR #33096).

Handling mixed array namespaces and devices

scikit-learn takes a unique approach among ‘array-consuming’ libraries by supporting mixed array namespace and device inputs. This design choice enables the framework to handle the practical complexities of end-to-end machine learning pipelines.

String-valued class labels are common in classification tasks and enable users to work with interpretable categories rather than integer codes. NumPy is currently the only array library with string array support, meaning that any workflow involving both GPU-accelerated computation and string labels necessarily involves mixed array type inputs.

Mixed array input support also enables flexible pipeline workflows. Pipelines provide significant value by chaining preprocessing steps and estimators into reusable workflows that prevent data leakage and ensure consistent preprocessing. However, they have an intentional design limitation: pipeline steps can transform feature arrays (X) but cannot modify target arrays (y). Allowing mixed array inputs means a pipeline can include a FunctionTransformer step that moves feature data from CPU to GPU to leverage hardware acceleration, while allowing the target array, which cannot be modified, to remain on CPU.

For example, mixed array inputs enable a pipeline where string classification features are encoded on CPU (as only NumPy supports string arrays), converted to torch CUDA tensors, then passed to the array API-compatible RidgeClassifier for GPU-accelerated computation:

from functools import partial

import torch

from sklearn.linear_model import RidgeClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer, TargetEncoder

pipeline = make_pipeline(
    # Encode string categories with average target values
    TargetEncoder(),
    # Convert feature array `X` to a Torch CUDA tensor
    FunctionTransformer(partial(torch.asarray, dtype=torch.float32, device="cuda")),
    RidgeClassifier(solver="svd"),
)

Work on adding mixed array type inputs for metrics and estimators is underway and expected to progress quickly. This work includes developing a robust testing framework, including for pipelines using mixed array types (follow PR #32755 for details).

Finally, we have also revived our work to support the ability to fit and predict on different namespaces/devices. This allows users to train models on GPU hardware but deploy predictions on CPU hardware, optimizing costs and accommodating different resource availability between training and production environments. Follow PR #33076 for details.

Challenges

The challenges of array API adoption remain largely unchanged from when this work began. These are also common to other array-consuming libraries, with a notable addition: the need to handle array movement between namespaces and devices to support mixed array type inputs.

Array API Standard is a subset of NumPy’s API

The array API standard only includes widely-used functions implemented across most array libraries, meaning many NumPy functions are absent. When such a function is encountered while adding array API support, we have several options: maintain our own array API compatible version, rely on an implementation from another library such as SciPy, or contribute the function to array-api-extra.

The quantile function illustrates this decision-making process. quantile is not included in the standard as it is not widely used (outside of scikit-learn) and while it is implemented in most array libraries, the set of quantile methods supported and their APIs vary. Currently, scikit-learn maintains its own array API compatible version that supports both weights and NaNs, but due to the maintenance burden we decided to investigate alternatives. SciPy has an array API compatible implementation, but it did not support weights. We thus investigated adding quantile to array-api-extra; however, during this effort, SciPy decided to add weight support. Thus, we ultimately decided to transition to the SciPy implementation once our minimum SciPy version allows.

Compiled code

Many performance-critical parts of scikit-learn are written using compiled code extensions in Cython, C or C++. These directly access the underlying memory buffers of NumPy arrays and are thus restricted to CPU.

Metrics and estimators with compiled code handle this in one of two ways: convert arrays to NumPy first, or maintain two parallel branches of code, one for NumPy (compiled) and one for other array types (array API compatible). When performance is less critical or array API conversion provides no gains (e.g., confusion_matrix), we convert to NumPy. When performance gains are significant, we accept the maintenance burden of dual code paths. This was the case for LogisticRegression; the extensive process required for making such implementation decisions can be seen in PR #32644.
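A hypothetical sketch of the dual-path pattern (function names are illustrative, not scikit-learn’s internals): NumPy inputs take the compiled fast path, while everything else goes through the generic array API path.

```python
import numpy as np

def column_sums(x):
    # NumPy arrays take the fast compiled path; any other array type
    # takes the generic, array API compatible path.
    if isinstance(x, np.ndarray):
        return _compiled_fast_path(x)
    xp = x.__array_namespace__()
    return xp.sum(x, axis=0)

def _compiled_fast_path(x):
    # Stand-in for a Cython/C routine operating directly on the
    # NumPy memory buffer.
    return x.sum(axis=0)
```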

Unspecified behaviour in the standard

The array API standard intentionally leaves some function behaviors unspecified, permitting implementation differences across array libraries. For example, the order of unique elements is not specified for the unique_* functions and as of NumPy version 2.3, some unique_* functions no longer return sorted values. This will require code amendments in cases where sorted output was relied upon.

Similarly, NaN handling is also unspecified for sort; however, in this case, all array libraries currently supported by scikit-learn follow NumPy’s NaN semantics, placing NaNs at the end. This consistency eliminates the need for special handling code, though comprehensive testing remains essential when adding support for new array libraries.
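For illustration, here are NumPy’s NaN semantics for sort, which the other currently supported libraries mirror:

```python
import numpy as np

# NumPy places NaNs at the end of a sorted array.
x = np.array([3.0, np.nan, 1.0, 2.0])
print(np.sort(x))  # [ 1.  2.  3. nan]
```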

Device transfer

Mixed array namespace and device inputs necessitate converting arrays between different namespaces and devices. This presented a number of considerations and challenges.

The array API standard adopted DLPack as the recommended data interchange protocol. This protocol is widely implemented in array libraries and offers an efficient, C ABI compatible mechanism for array conversion. While this provided us with an easy way to implement these transfers, there were limitations. Cross-device transfer capability was only introduced in DLPack v1, released in September 2024, which meant that only the latest PyTorch and CuPy versions supported it. Moreover, not all array libraries have adopted support yet. We therefore implemented a ‘manual’ fallback; however, this requires conversion via NumPy when the transfer involves two non-NumPy arrays. Additionally, there are no DLPack tests in array-api-tests, the testing suite used to verify standard compliance, leaving DLPack implementation bugs easier to overlook. Despite these challenges, scikit-learn will benefit from future improvements, such as the addition of a C-level API for DLPack exchange that bypasses Python function calls, offering significant benefit for GPU applications.
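In its simplest same-device form, DLPack interchange is a one-liner. This is a minimal sketch assuming a NumPy recent enough (1.22+) to implement the protocol; `np.from_dlpack` consumes any object exposing `__dlpack__`:

```python
import numpy as np

def as_numpy(array):
    # DLPack interchange: any producer exposing __dlpack__ (PyTorch
    # tensors, CuPy arrays on a compatible device, NumPy arrays, ...)
    # can be consumed here, zero-copy where the producer allows it.
    return np.from_dlpack(array)
```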

Beyond the technical considerations, there were also user interface considerations. How should we inform users that these conversions, which incur memory and performance cost, are occurring? We decided against warnings, which risk being ignored or becoming a nuisance, and to instead clearly document this behaviour. Additionally, different devices have different data type limitations; for example, Apple MPS only supports float32. How best to handle these differences when performing conversions while ensuring users are informed of precision impacts is an ongoing consideration.

A quick benchmark

Array API support for Ridge regression was added in version 1.5, enabling GPU-accelerated linear models in scikit-learn. Combined with support of several transformers, this allows for complete preprocessing and estimation pipelines on GPU.

The following benchmark uses the MaxAbsScaler transformer followed by Ridge regression on randomly generated data with 500,000 samples and 300 features. The benchmarks were run on an AMD Ryzen Threadripper 2970WX CPU, an NVIDIA Quadro RTX 8000 GPU, and an Apple M4 GPU (Metal 3).

The figure below shows the performance speed up on CuPy, Torch CPU and Torch GPU relative to NumPy.

Benchmark of the MaxAbsScaler/Ridge pipeline: performance speedup relative to NumPy across different backends.

The observed speedups are representative of performance gains achievable with sufficiently large datasets on datacenter-grade GPUs for linear algebra-intensive workloads. Mobile GPUs, such as those in laptops, would typically yield more modest improvements.

Note that scikit-learn’s Ridge regressor currently only supports ‘svd’ solver. We selected this solver for initial implementation as it exclusively uses standard-compliant functions available across all backends and is the most stable solver. Support for the ‘cholesky’ solver is also underway (see details in PR #29318).

Looking forward

As of version 1.8, array API support is still in experimental mode and thus not enabled by default. However, we welcome early adopters and interested users to try it and report any issues. See our documentation for details on enabling array API support.

Before removing experimental status, we would like to complete a number of remaining infrastructure and framework improvements.

Alongside these infrastructure and framework improvements, we look forward to adding support for more estimators. These improvements will deliver production-ready GPU support and flexible deployment options to scikit-learn users. We welcome community involvement through testing and feedback throughout this development phase.

Acknowledgements

Work on array API in scikit-learn has been a combined effort from many contributors. This work was partly funded by CZI and NASA Roses.

I would like to thank Olivier Grisel, Tim Head and Evgeni Burovski for helping me with my array API questions.

March 05, 2026 12:00 AM UTC


Armin Ronacher

AI And The Ship of Theseus

As code gets cheaper and cheaper to write, so do re-implementations. I mentioned recently that I had an AI port one of my libraries to another language, and it ended up choosing a different design for that implementation. In many ways, the functionality was the same, but the path it took to get there was different. The way that port worked was by going via the test suite.

Something related, but different, happened with chardet. The current maintainer reimplemented it from scratch by only pointing it to the API and the test suite. The motivation: enabling relicensing from LGPL to MIT. I personally have a horse in the race here because I too wanted chardet to be under a non-GPL license for many years. So consider me a very biased person in that regard.

Unsurprisingly, that new implementation caused a stir. In particular, Mark Pilgrim, the original author of the library, objects to the new implementation and considers it a derived work. The new maintainer, who has maintained the library for the last 12 years, considers it a new work and instructed his coding agent to produce precisely that. According to the author, who validated the result with JPlag, the new implementation is distinct. If you consider how it works, that’s not too surprising: it’s significantly faster than the original implementation, supports multiple cores, and uses a fundamentally different design.

What I think is more interesting about this question is the consequences of where we are. Copyleft code like the GPL heavily depends on copyright and friction to enforce it. But because it’s fundamentally in the open, with or without tests, you can trivially rewrite it these days. I myself have been intending to do this for a little while now with some other GPL libraries. In particular, I started a re-implementation of readline a while ago for similar reasons, because of its GPL license. There is an obvious moral question here, but that isn’t necessarily what I’m interested in. For all the GPL software that might re-emerge as MIT software, so might proprietary abandonware.

For me personally, what is more interesting is that we might not even be able to copyright these creations at all. A court still might rule that all AI-generated code is in the public domain, because there was not enough human input in it. That’s quite possible, though probably not very likely.

But this all causes some interesting new developments we are not necessarily ready for. Vercel, for instance, happily re-implemented bash with Clankers but got visibly upset when someone re-implemented Next.js in the same way.

There are huge consequences to this. When the cost of generating code goes down that much, and we can re-implement it from test suites alone, what does that mean for the future of software? Will we see a lot of software re-emerging under more permissive licenses? Will we see a lot of proprietary software re-emerging as open source? Will we see a lot of software re-emerging as proprietary?

It’s a new world and we have very little idea of how to navigate it. In the interim we will have some fights about copyrights but I have the feeling very few of those will go to court, because everyone involved will actually be somewhat scared of setting a precedent.

In the GPL case, though, I think it warms up some old fights about copyleft vs permissive licenses that we have not seen in a long time. It probably does not feel great to have one’s work rewritten with a Clanker and one’s authorship eradicated. Unlike the Ship of Theseus, though, this seems more clear-cut: if you throw away all code and start from scratch, even if the end result behaves the same, it’s a new ship. It only continues to carry the name. Which may be another argument for why authors should hold on to trademarks rather than rely on licenses and contract law.

I personally think all of this is exciting. I’m a strong supporter of putting things in the open with as little license enforcement as possible. I think society is better off when we share, and I consider the GPL to run against that spirit by restricting what can be done with it. This development plays into my worldview. I understand, though, that not everyone shares that view, and I expect more fights over the emergence of slopforks as a result. After all, it combines two very heated topics, licensing and AI, in the worst possible way.

March 05, 2026 12:00 AM UTC

March 04, 2026


PyCharm

Cursor Joined the ACP Registry and Is Now Live in Your JetBrains IDE

Cursor is now available as an AI agent inside JetBrains IDEs through the Agent Client Protocol. Select it from the agent picker, and it has full access to your project.

If you’ve spent any time in the AI coding space, you already know Cursor. It has been one of the most requested additions to the ACP Registry.

What you get

Cursor is known for its AI-native, agentic workflows. JetBrains IDEs are valued for deep code intelligence – refactoring, debugging, code quality checks, and the tooling professionals rely on at scale. ACP brings the two together.

You can now use Cursor’s agentic capabilities directly inside your JetBrains IDE – within the workflows and features you already use. 

A growing open ecosystem

Cursor joins a growing list of agents available through ACP in JetBrains IDEs. Every new addition to the ACP Registry means you have more choice – while still working inside the IDE you already rely on. You get access to frontier models from major providers, including OpenAI, Anthropic, Google, and now also Cursor.

This is part of our open ecosystem strategy. Plug in the agents you want and work in the IDE you love – without getting locked into a single solution.

Cursor is focused on building the best way to build software with AI. By integrating Cursor with JetBrains IDEs, we’re excited to provide teams with powerful agentic capabilities in the environments where they’re already working.

– Jordan Topoleski, COO at Cursor

Get started

You need version 2025.3.2 or later of your JetBrains IDE with the AI Assistant plugin enabled. From there, open the agent selector, select Install from ACP Registry, install Cursor, and start working. You don’t need a JetBrains AI subscription to use Cursor as an AI agent.

The ACP Registry keeps growing, and many agents have already joined it – with more on the way. Try it today with Cursor and experience agent-driven development inside your JetBrains IDE. For more information about the Agent Client Protocol, see our original announcement and the blog post on the ACP Agent Registry support.

March 04, 2026 04:41 PM UTC


Python Morsels

Invent your own comprehensions in Python

Python doesn't have tuple, frozenset, or Counter comprehensions, but you can invent your own by passing a generator expression to any iterable-accepting callable.

Table of contents

  1. Generator expressions pair nicely with iterable-accepting callables
  2. Tuple comprehensions
  3. frozenset comprehensions
  4. Counter comprehensions
  5. Aggregate with reducer functions
  6. Invent your own comprehensions with generator expressions

Generator expressions pair nicely with iterable-accepting callables

Generator expressions work really nicely with Python's any and all functions:

>>> numbers = [2, 1, 3, 4, 7, 11, 18]
>>> any(n > 1 for n in numbers)
True
>>> all(n > 1 for n in numbers)
False

In fact, I rarely see any and all used without a generator expression passed to them.

Note that generator expressions are made with parentheses:

>>> (n**2 for n in numbers)
<generator object <genexpr> at 0x74c535589b60>

But when a generator expression is the sole argument passed into a function:

>>> all((n > 1 for n in numbers))
False

The double set of parentheses (one to form the generator expression and one for the function call) can be turned into just a single set of parentheses:

>>> all(n > 1 for n in numbers)
False

This special allowance was added to Python's syntax because it's very common to see generator expressions passed in as the sole argument to specific functions.

Note that passing generator expressions into iterable-accepting functions and classes makes something that looks a bit like a custom comprehension. Every iterable-accepting function/class is a comprehension-like tool waiting to happen.
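For instance, each of the comprehension variants listed in the table of contents is just a generator expression handed to the matching iterable-accepting callable (reusing the numbers list from above):

```python
from collections import Counter

numbers = [2, 1, 3, 4, 7, 11, 18]

# "Tuple comprehension": a generator expression passed to tuple()
squares = tuple(n**2 for n in numbers)

# "frozenset comprehension": the same trick with frozenset()
evens = frozenset(n for n in numbers if n % 2 == 0)

# "Counter comprehension": count the parity of each number
parity = Counter("even" if n % 2 == 0 else "odd" for n in numbers)
```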

Tuple comprehensions

Python does not have tuple …


Read the full article: https://www.pythonmorsels.com/custom-comprehensions/

March 04, 2026 03:30 PM UTC


PyCharm

Cursor Joined the ACP Registry and Is Now Live in Your JetBrains IDE

March 04, 2026 03:28 PM UTC


Real Python

How to Use the OpenRouter API to Access Multiple AI Models via Python

One of the quickest ways to call multiple AI models from a single Python script is to use OpenRouter’s API, which acts as a unified routing layer between your code and multiple AI providers. By the end of this guide, you’ll access models from several providers through one unified API, as shown in the image below:

OpenRouter Unified API Running Multiple AI Models

This convenience matters because the AI ecosystem is highly fragmented: each provider exposes its own API, authentication scheme, rate limits, and model lineup. Working with multiple providers often requires additional setup and integration effort, especially when you want to experiment with different models, compare outputs, or evaluate trade-offs for a specific task.

OpenRouter gives you access to thousands of models from leading providers such as OpenAI, Anthropic, Mistral, Google, and Meta. You switch between them without changing your application code.

Get Your Code: Click here to download the free sample code that shows you how to use the OpenRouter API to access multiple AI models via Python.

Take the Quiz: Test your knowledge with our interactive “How to Use the OpenRouter API to Access Multiple AI Models via Python” quiz. You’ll receive a score upon completion to help you track your learning progress:


Interactive Quiz

How to Use the OpenRouter API to Access Multiple AI Models via Python

Test your Python skills with OpenRouter: learn unified API access, model switching, provider routing, and fallback strategies.

Prerequisites

Before diving into OpenRouter, you should be comfortable with Python fundamentals like importing modules, working with dictionaries, handling exceptions, and using environment variables. If you’re familiar with these basics, the first step is authenticating with OpenRouter’s API.

Step 1: Connect to OpenRouter’s API

Before using OpenRouter, you need to create an account and generate an API key. Some models require prepaid credits for access, but you can start with free access to test the API and confirm that everything is working.

To generate an API key:

  • Create an account at OpenRouter.ai or sign in if you already have an account.
  • Select Keys from the dropdown menu and create an API key.
  • Fill in the name, something like OpenRouter Testing.
  • Leave the remaining defaults and click Create.

Copy the generated key and keep it secure. In a moment, you’ll store it as an environment variable rather than embedding it directly in your code.

To call multiple AI models from a single Python script, you’ll use OpenRouter’s API. You’ll use the requests library to make HTTP calls, which gives you full control over the API interactions without requiring a specific SDK. This approach works with any HTTP client and keeps your code simple and transparent.

First, create a new directory for your project and set up a virtual environment. This isolates your project dependencies from your system Python installation:

Shell
$ mkdir openrouter-project/
$ cd openrouter-project/
$ python -m venv venv/

Now, you can activate the virtual environment:

Windows PowerShell
PS> venv\Scripts\activate
Shell
$ source venv/bin/activate

You should see (venv) in your terminal prompt when it’s active. Now you’re ready to install the requests package for conveniently making HTTP calls:

Shell
(venv) $ python -m pip install requests
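With requests installed, a chat completion call against OpenRouter’s OpenAI-compatible endpoint looks roughly like the sketch below. The model name is an example, not a guaranteed identifier; check OpenRouter’s model list for current names:

```python
import os

import requests

API_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_chat_request(model, prompt):
    # The request body follows the OpenAI-style chat completion schema.
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def ask(model, prompt):
    response = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
        json=build_chat_request(model, prompt),
        timeout=30,
    )
    response.raise_for_status()  # Surface HTTP errors early
    return response.json()["choices"][0]["message"]["content"]
```

Because the schema is provider-agnostic, switching models is just a matter of passing a different model string to `ask()`.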

Read the full article at https://realpython.com/openrouter-api/ »



March 04, 2026 02:00 PM UTC


Glyph Lefkowitz

What Is Code Review For?

Humans Are Bad At Perceiving

Humans are not particularly good at catching bugs. For one thing, we get tired easily. There is some science on this, indicating that humans can’t even maintain enough concentration to review more than about 400 lines of code at a time.

We have existing terms of art, in various fields, for the ways in which the human perceptual system fails to register stimuli. Perception fails when humans are distracted, tired, overloaded, or merely improperly engaged.

Each of these failure modes has implications for the fundamental limitations of code review as an engineering practice.

Never Send A Human To Do A Machine’s Job

When you need to catch a category of error in your code reliably, you need a deterministic tool to evaluate — and, thanks to our old friend “alert fatigue” above, ideally also to remedy — that type of error. These tools relieve humans of making the same repetitive checks over and over. None of them are perfect, but they never get tired.

Don’t blame reviewers for missing these things.

Code review should not be how you catch bugs.

What Is Code Review For, Then?

Code review is for three things.

First, code review is for catching process failures. If a reviewer has noticed a few bugs of the same type in code review, that’s a sign that that type of bug is probably getting through review more often than it’s getting caught. Which means it’s time to figure out a way to deploy a tool or a test into CI that will reliably prevent that class of error, without requiring reviewers to be vigilant to it any more.
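As a sketch of what such a tool can look like, a deterministic check wired into CI can be as small as an AST walk. The rule here, flagging bare `except:` clauses, is purely illustrative:

```python
import ast

def find_bare_excepts(source):
    # Return the line number of every bare `except:` clause, a classic
    # recurring review finding that a machine catches more reliably
    # than a tired reviewer.
    tree = ast.parse(source)
    return [
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.ExceptHandler) and node.type is None
    ]
```

Run against each changed file in CI, a check like this fails the build deterministically, so no reviewer ever has to stay vigilant for that class of error again.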

Second — and this is actually its more important purpose — code review is a tool for acculturation. Even if you already have good tools, good processes, and good documentation, new members of the team won’t necessarily know about those things. Code review is an opportunity for older members of the team to introduce newer ones to existing tools, patterns, or areas of responsibility. If you’re building an observer pattern, you might not realize that the codebase you’re working in already has an existing idiom for doing that, so you wouldn’t even think to search for it, but someone else who has worked more with the code might know about it and help you avoid repetition.

You will notice that I carefully avoided saying “junior” or “senior” in that paragraph. Sometimes the newer team member is actually more senior. But also, the acculturation goes both ways. This is the third thing that code review is for: disrupting your team’s culture and avoiding stagnation. If you have new talent, a fresh perspective can also be an extremely valuable tool for building a healthy culture. If you’re new to a team and trying to build something with an observer pattern, and this codebase has no tools for that, but your last job did, and it used one from an open source library, that is a good thing to point out in a review as well. It’s an opportunity to spot areas for improvement to culture, as much as it is to spot areas for improvement to process.

Thus, code review should be as hierarchically flat as possible. If the goal of code review were to spot bugs, it would make sense to reserve the ability to review code to only the most senior, detail-oriented, rigorous engineers in the organization. But most teams already know that that’s a recipe for brittleness, stagnation and bottlenecks. Thus, even though we know that not everyone on the team will be equally good at spotting bugs, it is very common in most teams to allow anyone past some fairly low minimum seniority bar to do reviews, often as low as “everyone on the team who has finished onboarding”.

Oops, Surprise, This Post Is Actually About LLMs Again

Sigh. I’m as disappointed as you are, but there are no two ways about it: LLM code generators are everywhere now, and we need to talk about how to deal with them. An important corollary of understanding code review as a social activity is that LLMs are not social actors, so you cannot rely on code review to inspect their output.

My own personal preference would be to eschew their use entirely, but in the spirit of harm reduction, if you’re going to use LLMs to generate code, you need to remember the ways in which LLMs are not like human beings.

When you relate to a human colleague, you will expect that:

  1. you can make decisions about what to focus on based on their level of experience and areas of expertise; from a late-career colleague you might be looking for bad habits held over from legacy programming languages, while from an earlier-career colleague you might focus more on logical test-coverage gaps;
  2. they will learn from repeated interactions, so you can gradually focus less on a specific type of problem once you have seen that they’ve learned how to address it.

With an LLM, by contrast, while errors can certainly be biased a bit by the prompt from the engineer and pre-prompts that might exist in the repository, the types of errors that the LLM will make are somewhat more uniformly distributed across the experience range.

You will still find supposedly extremely sophisticated LLMs making extremely common mistakes, specifically because they are common, and thus appear frequently in the training data.

The LLM also can’t really learn. An intuitive response to this problem is to simply continue adding more and more instructions to its pre-prompt, treating that text file as its “memory”, but that just doesn’t work, and probably never will. The problem, “context rot”, is somewhat fundamental to the nature of the technology.

Thus, code-generators must be treated more adversarially than you would treat a human code review partner. When you notice one making errors, you always have to add tests to a mechanical, deterministic harness that evaluates the code, because the LLM cannot meaningfully learn from its mistakes outside a very small context window in the way that a human would, so giving it feedback is unhelpful. Asking it to simply generate the code again still requires you to review it all again, and as we have previously learned, you, a human, cannot review more than 400 lines at once.
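A “mechanical, deterministic harness” can be as simple as a gate script that runs your checks and exits nonzero on the first failure, so an agentic loop halts mechanically instead of being argued with. Here is a minimal sketch (my illustration, with a placeholder check; substitute your real test and lint commands):

```python
import subprocess
import sys

def gate(commands):
    """Run each check command in order; stop at the first failure.

    Returns the failing command's exit code, or 0 if every check passed.
    """
    for cmd in commands:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            return result.returncode
    return 0

# Placeholder check; in practice you would list things like
# [sys.executable, "-m", "pytest", "-q"] here.
exit_code = gate([
    [sys.executable, "-c", "assert 1 + 1 == 2"],
])
print(exit_code)  # prints 0
```

The point is that the gate's verdict is reproducible and binary: the loop either passes all checks or it stops.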

To Sum Up

Code review is a social process, and you should treat it as such. When you’re reviewing code from humans, share knowledge and encouragement as much as you share bugs or unmet technical requirements.

If you must review code from an LLM, strengthen your automated code-quality verification tooling and make sure that its agentic loop will fail on its own, immediately, the next time those quality checks fail. Do not fall into the trap of appealing to its feelings, knowledge, or experience, because it doesn’t have any of those things.

But for both humans and LLMs, do not fall into the trap of thinking that your code review process is catching your bugs. That’s not its job.

Acknowledgments

Thank you to my patrons who are supporting my writing on this blog. If you like what you’ve read here and you’d like to read more of it, or you’d like to support my various open-source endeavors, you can support my work as a sponsor!

March 04, 2026 05:24 AM UTC


Seth Michael Larson

Relative “Dependency Cooldowns” in pip v26.0 with crontab

WARNING: Most of this blog post is a hack, everyone should probably just wait for relative dependency cooldowns to come to a future version of pip.

pip v26.0 added support for the --uploaded-prior-to option. This new option enables implementing “dependency cooldowns”, a technique described by William Woodruff that provides simple but effective protection against the relatively short attack window of malware published to public software repositories. Cooldowns bring the reaction time to malware back within the realm of humans, who sometimes need to execute manual triage processes to take down malware from PyPI.

So to set --uploaded-prior-to to seven days before this post was published, February 25th, you'd run:

python -m pip install \
  --uploaded-prior-to=2026-02-25 \
  urllib3

But this is only an absolute date, and we have to remember to set the option on each call to pip install? That seems like a lot of work!

Dependency cooldowns work best when the policy can be set in a global configuration file to a relative value like “7 days”. The “window” of acceptable packages is then automatically updating over time without needing to set a new absolute value constantly. “Set-and-forget”-style.
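Until pip grows a relative option, one per-invocation stopgap (my sketch, not from the post) is to compute the cutoff date inline in the shell with a Python one-liner:

```shell
# Compute a cutoff date 7 days in the past (adjust the window to taste).
cutoff="$(python3 -c 'import datetime; print(datetime.date.today() - datetime.timedelta(days=7))')"
echo "$cutoff"

# Then pass it along for a one-off install (commented out here; run it yourself):
# python -m pip install --uploaded-prior-to="$cutoff" urllib3
```

This avoids hardcoding a date, but you still have to type it per invocation, which is exactly why a config-file approach is nicer.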

uv allows setting a relative value via --exclude-newer, but pip doesn't support relative ranges yet. I mostly use pip and still wanted to test this feature today, so I created a little hack to update my user pip.conf configuration file on a regular basis instead. Here's what my pip.conf file looks like:

[install]
uploaded-prior-to = 2026-02-25

And below is the entire Python script doing the updating. Quick reminder that I only tested this on my own system, your mileage may vary, do not use in production.

#!/usr/bin/python3
# License: MIT

import datetime
import sys
import os
import re

def main() -> int:
    # Parse the command line options.
    pip_conf = os.path.abspath(os.path.expanduser(sys.argv[1]))
    days = int(sys.argv[2])

    # Load the existing pip.conf file.
    try:
        with open(pip_conf, "r") as f:
            pip_conf_data = f.read()
    except FileNotFoundError:
        print(f"Could not find pip.conf file at: {pip_conf}")
        return 1

    # Update the existing uploaded-prior-to value.
    uploaded_prior_to_re = re.compile(
        r"^uploaded-prior-to\s*=\s*2[0-9]{3}-[0-9]{2}-[0-9]{2}$", re.MULTILINE
    )
    if not uploaded_prior_to_re.search(pip_conf_data):
        print("Could not find uploaded-prior-to option in pip.conf under [install]")
        return 1
    new_uploaded_prior_to = (
        datetime.date.today() - datetime.timedelta(days=days)
    ).strftime("%Y-%m-%d")
    pip_conf_data = uploaded_prior_to_re.sub(
        f"uploaded-prior-to = {new_uploaded_prior_to}", pip_conf_data
    )

    # Write the new uploaded-prior-to
    # value to pip.conf
    with open(pip_conf, "w") as f:
        f.write(pip_conf_data)
    return 0

if __name__ == "__main__":
    sys.exit(main())

The script takes two parameters: the path to the pip.conf file you want to update (typically ~/.config/pip/pip.conf on Linux) and an integer number of days. I used 14 in my cron example below.

Simple, right? I installed and chmod u+x-ed the script in my /usr/local/bin directory and then added it to my crontab using crontab -u (USERNAME) -e:

0 * * * * (/usr/local/bin/pip-dependency-cooldown /home/sethmlarson/.config/pip/pip.conf 14)  2>&1 | logger -t pip-dependency-cooldown

This entry runs the script once per hour, so the value of uploaded-prior-to rolls forward as each new day begins. Now I only receive packages that were published 14 or more days ago by default when running pip install without any other options.

Stay tuned for more about dependency cooldowns for Python installers once pip supports relative values.



Thanks for keeping RSS alive! ♄

March 04, 2026 12:00 AM UTC

March 03, 2026


PyCoder’s Weekly

Issue #724: Unit Testing Performance, Ordering, FastAPI, and More (March 3, 2026)

#724 – MARCH 3, 2026
View in Browser »



Unit Testing: Catching Speed Changes

This second post in a series covers how to use unit testing to ensure the performance of your code, focusing on catching differences in performance after code has changed.
ITAMAR TURNER-TRAURING

Lexicographical Ordering in Python

Python lexicographically orders tuples, strings, and all other sequences, comparing element-by-element. Learn what this means when you compare values or sort.
TREY HUNNER

A Cheaper Heroku? See for Yourself


Is PaaS too expensive for your Django app? We built a comparison calculator that puts the fully-managed hosting options head-to-head →
JUDOSCALE sponsor

Start Building With FastAPI

Learn how to build APIs with FastAPI in Python, including Pydantic models, HTTP methods, CRUD operations, and interactive documentation.
REAL PYTHON course

PEP 743: Add Py_OMIT_LEGACY_API to the Python C API (Rejected)

PYTHON.ORG

DjangoCon US 2026 (Chicago) Call for Proposals Open

DJANGOCON.US

Python Jobs

Python + AI Content Specialist (Anywhere)

Real Python

More Python Jobs >>>

Articles & Tutorials

Serving Private Files With Django and S3

Django’s FileField and ImageField are good at storing files, but on their own they don’t let you control access. Serving files from S3 just makes this more complicated. Learn how to secure a file behind your login wall.
RICHARD TERRY

FastAPI Error Handling: Types, Methods, and Best Practices

FastAPI provides various error-handling mechanisms. With built-in validation models, exceptions, and custom exception handlers, you can build robust and scalable FastAPI applications.
HONEYBADGER.IO ‱ Shared by Addison Curtis

CLI Subcommands With Lazy Imports

Python 3.15 will support lazy imports, meaning modules don’t get pulled in until they are needed. This can be particularly useful with command-line interfaces, where a given subcommand doesn’t need everything imported.
BRETT CANNON

How the Self-Driving Tech Stack Works

A technical guide to how self-driving cars actually work: CAN bus protocols, neural networks, sensor fusion, and control systems, with open source implementations, most of which can be accessed through Python.
CARDOG

Managing Shared Data Science Code With Git Submodules

Learn how to manage shared code across projects using Git submodules. Prevent version drift, maintain reproducible workflows, and support team collaboration with practical examples.
CODECUT.AI ‱ Shared by Khuyen Tran

Datastar: Modern Web Dev, Simplified

Talk Python interviews Delaney Gillilan, Ben Croker, and Chris May about the Datastar framework, a library that combines the concepts of HTMX, Alpine, and more.
TALK PYTHON podcast

Introducing the Zen of DevOps

Inspired by the Zen of Python, Tibo has written a Zen of DevOps, applying similar ideas from your favorite language to the world of servers and deployment.
TIBO BEIJEN

Stop Using Pickle Already. Seriously, Stop It!

Python’s Pickle is insecure by design, so using it in public-facing code is highly problematic. This article explains why and suggests alternatives.
MICHAL NAZAREWICZ

Raw+DC: The ORM Pattern of 2026?

After 25+ years championing ORMs, Michael has switched to raw database queries paired with Python dataclasses. This post explains why.
MICHAEL KENNEDY

Projects & Code

InvenTree: OSS Inventory Management System

GITHUB.COM/INVENTREE

marimo-jupyter-extension: Integrate Marimo Into JupyterLab

GITHUB.COM/MARIMO-TEAM

py2many: Transpiler of Python to Many Other Languages

GITHUB.COM/PY2MANY

ptapplot: Make Pressure Tap Plots

GITHUB.COM/PAULENORMAN

django-bolt: Rust-Powered API Framework for Django

GITHUB.COM/FARHANALIRAZA

Events

Weekly Real Python Office Hours Q&A (Virtual)

March 4, 2026
REALPYTHON.COM

Python Unplugged on PyTV

March 4 to March 5, 2026
JETBRAINS.COM

Canberra Python Meetup

March 5, 2026
MEETUP.COM

Sydney Python User Group (SyPy)

March 5, 2026
SYPY.ORG

PyDelhi User Group Meetup

March 7, 2026
MEETUP.COM

PyConf Hyderabad 2026

March 14 to March 16, 2026
PYCONFHYD.ORG


Happy Pythoning!
This was PyCoder’s Weekly Issue #724.
View in Browser »


[ Subscribe to 🐍 PyCoder’s Weekly 💌 – Get the best Python news, articles, and tutorials delivered to your inbox once a week >> Click here to learn more ]

March 03, 2026 07:30 PM UTC


Rodrigo GirĂŁo SerrĂŁo

TIL #140 – Install Jupyter with uv

Today I learned how to install jupyter properly while using uv to manage tools.

Running a Jupyter notebook server or Jupyter lab

To run a Jupyter notebook server with uv, you can run the command

$ uvx jupyter notebook

Similarly, if you want to run Jupyter lab, you can run

$ uvx jupyter lab

Both work, but uv will kindly present a message explaining how it's actually doing you a favour, because it guessed what you wanted. That's because uvx something usually looks for a package named “something” with a command called “something”.

As it turns out, the command jupyter comes from the package jupyter-core, not from the package jupyter.

Installing Jupyter

If you're running Jupyter notebooks often, you can install the notebook server and Jupyter lab with

$ uv tool install --with jupyter jupyter-core

Why uv tool install jupyter fails

Running uv tool install jupyter fails because the package jupyter doesn't provide any commands by itself.

Why uv tool install jupyter-core doesn't work

The command uv tool install jupyter-core looks like it works because it installs the command jupyter correctly. However, if you use --help you can see that you don't have access to the subcommands you need:

$ uv tool install jupyter-core
...
Installed 3 executables: jupyter, jupyter-migrate, jupyter-troubleshoot
$ jupyter --help
...
Available subcommands: book migrate troubleshoot

That's because the subcommands notebook and lab are from the package jupyter. The solution? Install jupyter-core with the additional dependency jupyter, which is what the command uv tool install --with jupyter jupyter-core does.

Other usages of Jupyter

The uv documentation has a page dedicated exclusively to the usage of uv with Jupyter, so check it out for other use cases of the uv and Jupyter combo!

March 03, 2026 03:16 PM UTC


Django Weblog

Django security releases issued: 6.0.3, 5.2.12, and 4.2.29

In accordance with our security release policy, the Django team is issuing releases for Django 6.0.3, Django 5.2.12, and Django 4.2.29. These releases address the security issues detailed below. We encourage all users of Django to upgrade as soon as possible.

CVE-2026-25673: Potential denial-of-service vulnerability in URLField via Unicode normalization on Windows

The django.forms.URLField form field's to_python() method used urllib.parse.urlsplit() to determine whether to prepend a URL scheme to the submitted value. On Windows, urlsplit() performs NFKC normalization (unicodedata.normalize), which can be disproportionately slow for large inputs containing certain characters.

URLField.to_python() now uses a simplified scheme detection, avoiding Unicode normalization entirely and deferring URL validation to the appropriate layers. As a result, while leading and trailing whitespace is still stripped by default, characters such as newlines, tabs, and other control characters within the value are no longer handled by URLField.to_python(). When using the default URLValidator, these values will continue to raise ValidationError during validation, but if you rely on custom validators, ensure they do not depend on the previous behavior of URLField.to_python().

This issue has severity "moderate" according to the Django Security Policy.

Thanks to Seokchan Yoon for the report.

CVE-2026-25674: Potential incorrect permissions on newly created file system objects

Django's file-system storage and file-based cache backends used the process umask to control permissions when creating directories. In multi-threaded environments, one thread's temporary umask change can affect other threads' file and directory creation, resulting in file system objects being created with unintended permissions.

Django now applies the requested permissions via os.chmod() after os.mkdir(), removing the dependency on the process-wide umask.
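The fix described above can be sketched in plain stdlib terms (an illustration of the pattern, not Django's actual code): create the directory, then explicitly chmod it, so a umask temporarily changed by another thread has no lasting effect.

```python
import os
import shutil
import stat
import tempfile

def mkdir_with_mode(path, mode=0o755):
    # os.mkdir's mode argument is masked by the process umask, so apply
    # the requested permissions explicitly afterwards with os.chmod.
    os.mkdir(path)
    os.chmod(path, mode)

tmp = tempfile.mkdtemp()
target = os.path.join(tmp, "media")
old_umask = os.umask(0o077)  # simulate a restrictive umask set by another thread
try:
    mkdir_with_mode(target, 0o755)
finally:
    os.umask(old_umask)

actual = stat.S_IMODE(os.stat(target).st_mode)
print(oct(actual))  # 0o755 despite the 0o077 umask
shutil.rmtree(tmp)
```

Without the os.chmod() call, the same mkdir under a 0o077 umask would have produced a 0o700 directory.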

This issue has severity "low" according to the Django Security Policy.

Thanks to Tarek Nakkouch for the report.

Affected supported versions

  • Django main
  • Django 6.0
  • Django 5.2
  • Django 4.2

Resolution

Patches to resolve the issue have been applied to Django's main, 6.0, 5.2, and 4.2 branches. The patches may be obtained from the following changesets.

CVE-2026-25673: Potential denial-of-service vulnerability in URLField via Unicode normalization on Windows

CVE-2026-25674: Potential incorrect permissions on newly created file system objects

The following releases have been issued

The PGP key ID used for this release is Natalia Bidart: 2EE82A8D9470983E

General notes regarding security reporting

As always, we ask that potential security issues be reported via private email to security@djangoproject.com, and not via Django's Trac instance, nor via the Django Forum. Please see our security policies for further information.

March 03, 2026 02:00 PM UTC


Real Python

What Does Python's __init__.py Do?

Python’s special __init__.py file marks a directory as a regular Python package and allows you to import its modules. This file runs automatically the first time you import its containing package. You can use it to initialize package-level variables, define functions or classes, and structure the package’s namespace clearly for users.

By the end of this video course, you’ll understand how to effectively use __init__.py to structure your Python packages in a clear, maintainable way, improving usability and namespace management.


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

March 03, 2026 02:00 PM UTC

Quiz: Duck Typing in Python: Writing Flexible and Decoupled Code

In this quiz, you’ll test your understanding of Duck Typing in Python: Writing Flexible and Decoupled Code.

By working through this quiz, you’ll revisit what duck typing is and its pros and cons, how Python uses behavior-based interfaces, how protocols and special methods support it, and what alternatives you can use in Python.


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

March 03, 2026 12:00 PM UTC


PyBites

Why Building a Production RAG Pipeline is Easier Than You Think

Adding AI to legacy code doesn’t have to be a challenge.

Many devs are hearing this right now: “We need to add AI to the app.”

And for many of them, panic ensues.

The assumption is that you have to rip your existing architecture down to its foundation. You start having nightmares about standing up complex microservices, massive AWS bills, and spending six months learning the intricate math behind vector embeddings.

It feels like a monumental risk to your stable, production-ready codebase, right?

Here’s the current reality though: adding AI to an existing application doesn’t actually require a massive rewrite.

If you have solid software design fundamentals, integrating a Retrieval-Augmented Generation (RAG) pipeline is entirely within your reach.

Here’s how you do it without breaking everything you’ve already built.

Get the Python Stack to do the Heavy Lifting

You don’t need to build your AI pipeline from scratch. The Python ecosystem has matured to the point where the hardest parts of a RAG pipeline are already solved for you.

It’s not a huge technical challenge in Python. It is just an orchestration of existing, powerful tools.

Stop Coding and Start Orchestrating

When a developer builds a RAG pipeline and the AI starts hallucinating, their first instinct is to dive into the code. They try to fix the API calls or mess with the vector search logic.

Take a step back. The code usually isn’t the problem.

The system prompt is the conductor of your entire RAG pipeline. It dictates how the LLM interacts with your vector database. If you’re getting bad results, you don’t need to rewrite your Python logic – you need to refine your prompt, through trial and error, toward strict, data-grounded constraints.

Beat Infrastructure Constraints by Offloading

What if your app is hosted on something lightweight, like Heroku, with strict size and memory limits? You might think you need to containerise everything and migrate to a heavier cloud setup.

Nope! You just need to separate your concerns.

Indexing documents and generating embeddings is heavy work. Querying is light. Offload the heavy lifting (like storing and searching vectors) entirely to your vector database service (like Weaviate). This keeps your core app lightweight, so it only acts as the middleman routing the query.

We Broke Down Exactly How This Works With Tim Gallati

We explored the reality of this architecture with Tim Gallati and Pybites AI coach Juanjo on the podcast. Tim had an existing app, Quiet Links, running on Heroku.

In just six weeks with us, he integrated a massive, production-ready RAG pipeline into it, without breaking his existing user experience.

If you want to hear the full breakdown of how they architected this integration, listen to the episode using the player above, or at the following links:

March 03, 2026 09:06 AM UTC

March 02, 2026


Python Morsels

When are classes used in Python?

While you don't often need to make your own classes in Python, they can sometimes make your code reusable and easier to read.

Table of contents

  1. Using classes because a framework requires it
  2. Passing the same data into multiple functions
  3. Creating a custom class
  4. Using a class to convey the purpose of data
  5. Using classes to improve readability

Using classes because a framework requires it

Unlike many programming languages, you can accomplish quite a bit in Python without ever making a class.

So when are classes typically used?

The most common time to use a class in Python is when using a library or framework that requires writing a class.

For example, users of the Django web framework need to make a class to allow Django to manage their database tables:

from django.db import models


class Product(models.Model):
    name = models.CharField(max_length=300)
    price = models.DecimalField(max_digits=10, decimal_places=2)
    description = models.TextField()

    def __str__(self):
        return self.name

With Django, each Model class is used for managing one table in the database. Each instance of a model represents one row in that table:

>>> from products.models import Product
>>> Product.objects.get(id=1)
<Product: rubber duck>

Many Python programmers create their first class because a library requires them to make one.

Passing the same data into multiple functions

But sometimes you may choose 


Read the full article: https://www.pythonmorsels.com/when-are-classes-used/

March 02, 2026 11:30 PM UTC


Brett Cannon

State of WASI support for CPython: March 2026

It's been a while since I posted about WASI support in CPython! 😅 Up until now, most of the work I have been doing around WASI has been making its maintenance easier for me and other core developers. For instance, the cpython-devcontainer repo now provides a WASI dev container so people don't have to install the WASI SDK to be productive (e.g. there's a WASI codespace now so you can work on WASI entirely from your browser without installing anything). All this work around making development easier not only led to having WASI instructions in the devguide and the creation of a CLI app, but also a large expansion of how to use containers to do CPython development.

But the main reason I&aposm blogging today is that PEP 816 was accepted! That PEP defines how WASI compatibility will be handled starting with Python 3.15. The key point is that once the first beta for a Python version is reached, the WASI and WASI SDK versions that will be supported for the life of that Python version will be locked down. That gives the community a target to build packages for since things built with the WASI SDK are not forwards- or backwards-compatible for linking purposes due to wasi-libc not having any compatibility guarantees.

With that PEP out of the way, the next big items on my WASI todo list are (in rough order):

  1. Implement a subcommand to bundle up a WASI build for distribution
  2. Write a PEP to define a platform tag for wheels
  3. Implement a subcommand to build dependencies for CPython (e.g. zlib)
  4. Turn on socket support (which requires WASI 0.3 and threading support to be released as I'm skipping over WASI 0.2)

March 02, 2026 07:28 PM UTC


Rodrigo GirĂŁo SerrĂŁo

TIL #139 – Multiline input in the REPL

Today I learned how to do multiline input in the REPL using an uncommon combination of arguments for the built-in open.

A while ago I learned I could use open(0) to open standard input. This unlocks a neat trick that allows you to do multiline input in the REPL:

>>> msg = open(0).read()
Hello,
world!
^D
>>> msg
'Hello,\nworld!\n'

The cryptic ^D is Ctrl+D, which means EOF on Unix systems. If you're on Windows, use Ctrl+Z followed by Enter.

The problem is that if you try to use open(0).read() again to read more multiline input, you get an exception:

OSError: [Errno 9] Bad file descriptor

That's because, when you finished reading the first time around, Python closed the file descriptor 0, so you can no longer use it.

The fix is to set closefd=False when you use the built-in open. With the parameter closefd set to False, the underlying file descriptor isn't closed and you can reuse it:

>>> msg1 = open(0, closefd=False).read()
Hello,
world!
^D
>>> msg1
'Hello,\nworld!\n'

>>> msg2 = open(0, closefd=False).read()
Goodbye,
world!
^D
>>> msg2
'Goodbye,\nworld!\n'

By using open(0, closefd=False), you can read multiline input in the REPL repeatedly.
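This closefd behaviour is easy to verify outside the REPL too. The sketch below (my own illustration) simulates standard input with a pipe, where closing the writer plays the role of pressing Ctrl+D:

```python
import os

# Simulate stdin with a pipe so the demo is reproducible outside the REPL.
read_fd, write_fd = os.pipe()
os.write(write_fd, b"Hello,\nworld!\n")
os.close(write_fd)  # closing the writer acts like pressing Ctrl+D

f = open(read_fd, closefd=False)
message = f.read()
f.close()  # with closefd=False, this does NOT close read_fd

os.close(read_fd)  # still valid; this would raise OSError if the fd were closed
print(message)
```

With the default closefd=True, the final os.close(read_fd) would fail with errno 9, exactly like the second open(0).read() in the REPL.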

March 02, 2026 02:15 PM UTC


Real Python

Automate Python Data Analysis With YData Profiling

The YData Profiling package generates an exploratory data analysis (EDA) report with a few lines of code. The report provides dataset and column-level analysis, including plots and summary statistics to help you quickly understand your dataset. These reports can be exported to HTML or JSON so you can share them with other stakeholders.

By the end of this tutorial, you’ll understand that:

  • YData Profiling generates interactive reports containing EDA results, including summary statistics, visualizations, correlation matrices, and data quality warnings from DataFrames.
  • ProfileReport creates a profile you can save with .to_file() for HTML or JSON export, or display inline with .to_notebook_iframe().
  • Setting tsmode=True and specifying a date column with sortby enables time series analysis, including stationarity tests and seasonality detection.
  • The .compare() method generates side-by-side reports highlighting distribution shifts and statistical differences between datasets.

To get the most out of this tutorial, you’ll benefit from having knowledge of pandas.

Note: The examples in this tutorial were tested using Python 3.13. Additionally, you may need to install setuptools<81 for backward compatibility.

You can install this package using pip:

Shell
$ python -m pip install ydata-profiling

Once installed, you’re ready to transform any pandas DataFrame into an interactive report. To follow along, download the example dataset you’ll work with by clicking the link below:

Get Your Code: Click here to download the free sample code and start automating Python data analysis with YData Profiling.

The following example generates a profiling report from the 2024 flight delay dataset and saves it to disk:

Python flight_report.py
import pandas as pd
from ydata_profiling import ProfileReport

df = pd.read_csv("flight_data_2024_sample.csv")

profile = ProfileReport(df)
profile.to_file("flight_report.html")

This code generates an HTML file containing interactive visualizations, statistical summaries, and data quality warnings:

Dataset overview displaying statistics and variable types. Statistics include 35 variables, 10,000 observations, and 3.2% missing cells. Variable types: 5 categorical, 23 numeric, 1 DateTime, 6 text.

You can open the file in any browser to explore your data’s characteristics without writing additional analysis code.

There are a number of tools available for high-level dataset exploration, but not all are built for the same purpose. The following table highlights a few common options and when each one is a good fit:

Use case | Pick | Best for
You want to quickly generate an exploratory report | ydata-profiling | Generating exploratory data analysis reports with visualizations
You want an overview of a large dataset | skimpy or df.describe() | Providing fast, lightweight summaries in the console
You want to enforce data quality | pandera | Validating schemas and catching errors in data pipelines

Overall, YData Profiling is best used as an exploratory report creation tool. If you’re looking to generate an overview for a large dataset, using SkimPy or a built-in DataFrame library method may be more efficient. Other tools, like Pandera, are more appropriate for data validation.

If YData Profiling looks like the right choice for your use case, then keep reading to learn about its most important features.

Take the Quiz: Test your knowledge with our interactive “Automate Python Data Analysis With YData Profiling” quiz. You’ll receive a score upon completion to help you track your learning progress:


Interactive Quiz

Automate Python Data Analysis With YData Profiling

Test your knowledge of YData Profiling, including report creation, customization, performance optimization, time series analysis, and comparisons.

Building a Report With YData Profiling

Read the full article at https://realpython.com/ydata-profiling-eda/ »


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

March 02, 2026 02:00 PM UTC


Rodrigo GirĂŁo SerrĂŁo

Remove extra spaces

Learn how to remove extra spaces from a string using regex, string splitting, a fixed point, and itertools.groupby.

In this article you'll learn about four different ways in which you can remove extra spaces from the middle of a string. That is, you'll learn how to go from a string like

string = "This is  a   perfectly    normal     sentence."

to a string like

string = "This is a perfectly normal sentence."

The best solution to remove extra spaces from a string

The best solution for this task, which is both readable and performant, uses the regex module re:

import re

def remove_extra_spaces(string):
    return re.sub(" {2,}", " ", string)

The function sub can be used to substitute a pattern for a replacement you specify. The pattern " {2,}" finds runs of 2 or more consecutive spaces and replaces them with a single space.

String splitting

Using the string method split can also be a good approach:

def remove_extra_spaces(string):
    return " ".join(string.split(" "))

If you're using string splitting, you'll want to provide the space " " as an argument. If you call split with no arguments, you'll be splitting on all whitespace, which is not what you want if you have newlines and other whitespace characters you should preserve.
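To see the difference the " " argument makes, compare both forms on a string containing a newline:

```python
text = "keep\nthe  newline"

# No argument: splits on any whitespace run and drops empties; the newline is lost.
print(text.split())      # ['keep', 'the', 'newline']

# With " ": splits only on single spaces; the newline survives inside a chunk.
print(text.split(" "))   # ['keep\nthe', '', 'newline']
```

The no-argument form also never produces empty strings, which is why the explicit-space version needs the filtering step shown below.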

This solution is great, except it doesn't work:

print(remove_extra_spaces(string))
# 'This is  a   perfectly    normal     sentence.'

The problem is that splitting on the space will produce a list with empty strings:

print(string.split(" "))
# ['This', 'is', '', 'a', '', '', 'perfectly', '', '', '', 'normal', '', '', '', '', 'sentence.']

These empty strings will be joined back together and you'll end up with the same string you started with. For this to work, you'll have to filter the empty strings first:

def remove_extra_spaces(string):
    return " ".join(filter(None, string.split(" ")))

Using filter(None, ...) filters out the Falsy strings, so that the final joining operation only joins the strings that matter.

This solution has a problem, though, in that it will completely remove any leading or trailing whitespace, which may or may not be a problem.

The two solutions presented so far — using regular expressions and string splitting — are pretty reasonable. But they're also boring. You'll now learn about two other solutions.

Replacing spaces until you hit a fixed point

You can think about the task of removing extra spaces as the task of replacing extra spaces by the empty string. And if you think about doing string replacements, you should think about the string method replace.

You can't do something like string.replace(" ", ""), otherwise you'd remove all spaces, so you have to be a bit more careful:

def remove_extra_spaces(string):
    while True:
        new_string = string.replace("  ", " ")
        if new_string == string:
            break
        string = new_string
    return string

You can replace two consecutive spaces by a single space, and you repeat this operation until nothing changes in your string.

The idea of running a function until its output doesn't change is common enough in maths that they call...
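The introduction also mentions an itertools.groupby approach; this excerpt cuts off before reaching it, but the idea can be sketched like this (my sketch, not the author's code):

```python
from itertools import groupby

def remove_extra_spaces(string):
    # groupby clusters consecutive equal characters; emit a single space
    # for each run of spaces, and the full run for every other character.
    return "".join(
        " " if char == " " else "".join(run)
        for char, run in groupby(string)
    )

print(remove_extra_spaces("This is  a   perfectly    normal     sentence."))
# This is a perfectly normal sentence.
```

Like the regex solution, this preserves leading or trailing single spaces and other whitespace such as newlines, collapsing only runs of spaces.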

March 02, 2026 12:39 PM UTC


Real Python

Quiz: Automate Python Data Analysis With YData Profiling

In this quiz, you’ll test your understanding of Automate Python Data Analysis With YData Profiling.

By working through this quiz, you’ll revisit how to generate and display profile reports in a notebook, export reports to files, add column descriptions, and speed up profiling.

This quiz focuses on practical YData Profiling tasks such as rendering reports, comparing datasets, and preparing time series data. If you want a deeper walkthrough, review the tutorial linked above.



March 02, 2026 12:00 PM UTC

Quiz: The pandas DataFrame: Make Working With Data Delightful

In this quiz, you’ll test your understanding of the pandas DataFrame.

By working through this quiz, you’ll review how to create pandas DataFrames, access and modify columns, insert and sort data, extract values as NumPy arrays, and how pandas handles missing data.



March 02, 2026 12:00 PM UTC


Python Bytes

#471 The ORM pattern of 2026?

Topics covered in this episode:

- Raw+DC: The ORM pattern of 2026? (https://mkennedy.codes/posts/raw-dc-the-orm-pattern-of-2026/?featured_on=pythonbytes)
- pytest-check releases (https://github.com/okken/pytest-check/releases?featured_on=pythonbytes)
- Dataclass Wizard (https://dcw.ritviknag.com/en/latest/#)
- SQLiteo (https://github.com/adamghill/sqliteo?featured_on=pythonbytes) - "native macOS SQLite browser built for normal people"
- Extras
- Joke

Watch on YouTube: https://www.youtube.com/watch?v=tZyf7KtTQVU

About the show

Sponsored by us! Support our work through:

- Our courses at Talk Python Training (https://training.talkpython.fm/?featured_on=pythonbytes)
- The Complete pytest Course (https://courses.pythontest.com/p/the-complete-pytest-course?featured_on=pythonbytes)
- Patreon Supporters (https://www.patreon.com/pythonbytes)

Connect with the hosts:

- Michael: @mkennedy@fosstodon.org / @mkennedy.codes (bsky)
- Brian: @brianokken@fosstodon.org / @brianokken.bsky.social
- Show: @pythonbytes@fosstodon.org / @pythonbytes.fm (bsky)

Join us on YouTube at pythonbytes.fm/live to be part of the audience. Usually Monday at 11am PT. Older video versions are available there too. Finally, if you want an artisanal, hand-crafted digest of every week of the show notes in email form, add your name and email to our friends of the show list (https://pythonbytes.fm/friends-of-the-show); we'll never share it.

Michael #1: Raw+DC: The ORM pattern of 2026?

- ORMs/ODMs provide great support and abstractions for developers
- They are not the native language of agentic AI
- Raw queries are trained 100x+ more than standard ORMs
- Using raw queries at the data access layer optimizes for AI coding
- Returning some sort of object mapped to the data optimizes for type safety and devs

Brian #2: pytest-check releases

- 3 merged pull requests
- 8 closed issues
- At one point got to 0 PRs and 1 enhancement request
- Now back to 2 issues and 1 PR, but the activity means it's still alive and being used. So cool
- Check out the changelog (https://github.com/okken/pytest-check/blob/main/changelog.md?featured_on=pythonbytes) for all mods
- A lot of changes around supporting mypy:
  - I've decided to NOT have the examples be fully --strict, as I find it reduces readability (see tox.ini for the explanation)
  - But src is --strict clean now, so user tests can be --strict clean.

Michael #3: Dataclass Wizard (https://dcw.ritviknag.com/en/latest/#)

- Simple, elegant wizarding tools for Python's dataclasses.
- Features:
  - 🚀 Fast: code-generated loaders and dumpers
  - đŸȘ¶ Lightweight: pure Python, minimal dependencies
  - 🧠 Typed: powered by Python type hints
  - 🧙 Flexible: JSON, YAML, TOML, and environment variables
  - đŸ§Ș Reliable: battle-tested with extensive test coverage
- No Inheritance Needed (https://dcw.ritviknag.com/en/latest/#no-inheritance-needed)

Brian #4: SQLiteo (https://github.com/adamghill/sqliteo?featured_on=pythonbytes) - "native macOS SQLite browser built for normal people"

- By Adam Hill
- This is a fun tool, built by someone I trust.
- That trust part is something I'm thinking about a lot in these days of dev+agent built tools
- Some notes on my thoughts when evaluating:
  - I know the Mac rules around installing .dmg files from outside the Apple store are picky, and I like that
  - But I'm OK with the override when something comes from a dev I trust
  - The contributors are all Adam
    - I'm still not sure how I feel about letting agents do commits in repos
  - There's an "AGENTS" folder and markdown files in the project for agents, so Ad

Extras

Michael:

- PyTV Python Unplugged This Week (https://lp.jetbrains.com/python-unplugged/?featured_on=pythonbytes)
- IBM Crashes 11% in 4 Hours: $24 Billion Wiped Out After Anthropic's Claude Code Threatens the Entire COBOL Consulting Industry (https://www.techbuzz.ai/articles/ibm-crashes-11-as-anthropic-threatens-cobol-empire?featured_on=pythonbytes)
- Loving my 40" ultrawide monitor more every day (https://www.amazon.com/dp/B0FJYNVR3R)
- Updatest for updating all the mac things (https://updatest.app?featured_on=pythonbytes)
- Ice has Thawed out (mac menubar app) (https://www.reddit.com/r/macapps/comments/1qwkq38/os_thaw_a_fork_of_ice_menu_bar_manager_for_macos/?featured_on=pythonbytes)

Joke: House is read-only! (https://x.com/pr0grammerhum0r/status/2018852032304566331?s=12&featured_on=pythonbytes)
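
The "Raw+DC" pattern from Michael's first topic pairs raw SQL queries with dataclasses for typed results. A minimal sketch of the idea using the standard library's sqlite3 (the users table and its fields are made up for this illustration):

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class User:
    id: int
    name: str


def get_users(conn: sqlite3.Connection) -> list[User]:
    # Raw SQL for the query, a dataclass for the result: "Raw+DC".
    rows = conn.execute("SELECT id, name FROM users ORDER BY id").fetchall()
    return [User(id=r[0], name=r[1]) for r in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("Ada",), ("Linus",)])
print(get_users(conn))
# [User(id=1, name='Ada'), User(id=2, name='Linus')]
```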

March 02, 2026 08:00 AM UTC

March 01, 2026


Tryton News

Tryton News March 2026

In the last month we focused on fixing bugs, improving behaviour, and resolving performance issues, building on the changes from our last release. We also added some new features which we would like to introduce to you in this newsletter.

For an in depth overview of the Tryton issues please take a look at our issue tracker or see the issues and merge requests filtered by label.

Changes for the User

Sales, Purchases and Projects

Now we add the web shop URL to sales.

We now add a menu entry for party identifiers.

Accounting, Invoicing and Payments

Now we can search for shipments on the invoice line.

In UBL we now set BillingReference to Invoice and Credit Note.

We now improve the layout of the invoice credit form.

Now we enforce the Peppol rule “[BR-27]-The Item net price (BT-146) shall NOT be negative”. So we make sure that the unit price on an invoice line is not negative.

We now add an Update Status button to the Peppol document.

Stock, Production and Shipments

Now we can charge UPS duties to the buyer or the seller on import and export, as defined by the incoterms.

User Interface

Now we allow re-ordering tabs in Sao.

In the favourites menu we now display a message explaining how to use it, instead of showing an empty menu.

We also improve the blank state of notifications by showing a message.

Now we add a button in Sao for closing the search filter.

System Data and Configuration

Now we update the required version of python-stdnum to version 2.22 and introduce new party identifiers.

More (click for more details)

New Releases

We released bug fixes for the currently maintained long term support series
7.0 and 6.0, and for the penultimate series 7.8 and 7.6.

Security

Please update your systems to take care of a security related bug we found last month.

Changes for the System Administrator

Now we also display the create date and create time in the error list.

We now add basic authentication for user applications, because in some cases the consumer of the user application may not be able to use bearer authentication.

Now we allow activating the development mode with WSGI applications by setting the environment variable TRYTOND_DEV.

Writing compatible HTML for email can be very difficult. MJML provides a syntax to ease the creation of such emails. So now we support the MJML email format in Tryton.

Changes for Implementers and Developers

We now add a timestamp field on ModelStorage for last modified.

Now we introduce a new type of field for SQL expressions: Field.sql_column(tables, Model).

We now allow UserError and UserWarning exceptions to be raised on evaluating button inputs.

Now we replace the extension separator with an underscore in report names used as temporary files.

We no longer check for missing parent depends when the One2Many is readonly.

Now we preserve the line numbers when converting doctest files to python files.

Authors: @dave @pokoli @udono

1 post - 1 participant

Read full topic

March 01, 2026 07:00 AM UTC

February 28, 2026


Talk Python to Me

#538: Python in Digital Humanities

Digital humanities sounds niche, until you realize it can mean a searchable archive of U.S. amendment proposals, Irish folklore, or pigment science in ancient art. Today I'm talking with David Flood from Harvard's DARTH team about an unglamorous problem: what happens when the grant ends but the website can't? His answer: static sites, client-side search, and sneaky Python. Let's dive in.

Episode sponsors:

- Sentry Error Monitoring, Code talkpython26 (https://talkpython.fm/sentry)
- Command Book (https://talkpython.fm/commandbookapp)
- Talk Python Courses (https://talkpython.fm/training)

Links from the show:

- Guest, David Flood: https://www.davidaflood.com
- DARTH: https://digitalhumanities.fas.harvard.edu
- Amendments Project: https://digitalhumanities.fas.harvard.edu/projects/amend/
- Fionn Folklore Database: https://fionnfolklore.org/en
- Mapping Color in History: https://iiif.harvard.edu/projects/mapping-color-in-history/
- Apatosaurus: https://apatosaurus.io/
- Criticus: https://github.com/d-flood/criticus
- django-bakery: https://github.com/palewire/django-bakery
- PADS 2026 artifact evaluation: https://sigsim.acm.org/conf/pads/2026/blog/artifact-evaluation/
- Hugo: https://gohugo.io
- Water Stories: https://waterstories.fas.harvard.edu/
- Tsumeb Mine Notebook: https://tmn.fas.harvard.edu/
- Dharma and Punya: https://dharmapunya2019.org/
- Pagefind library: https://pagefind.app
- django_webassembly: https://github.com/m-butterfield/django_webassembly
- Astro Static Site Generator: https://astro.build
- PageFind Python Lib: https://pypi.org/project/pagefind/
- Frozen-Flask: https://frozen-flask.readthedocs.io/en/latest/

Watch this episode on YouTube: https://www.youtube.com/watch?v=ZaI2AxRq_OA
Episode #538 deep-dive: https://talkpython.fm/episodes/show/538/python-in-digital-humanities#takeaways-anchor
Episode transcripts: https://talkpython.fm/episodes/transcript/538/python-in-digital-humanities

Theme Song: Developer Rap, Served in a Flask: https://talkpython.fm/flasksong

Don't be a stranger:

- YouTube: https://talkpython.fm/youtube
- Bluesky: @talkpython.fm
- Mastodon: @talkpython@fosstodon.org
- X.com: @talkpython
- Michael on Bluesky: @mkennedy.codes
- Michael on Mastodon: @mkennedy@fosstodon.org
- Michael on X.com: @mkennedy

February 28, 2026 09:28 PM UTC