Planet Python
Last update: March 05, 2026 07:43 PM UTC
March 05, 2026
Real Python
Quiz: How to Use the OpenRouter API to Access Multiple AI Models via Python
In this quiz, you’ll test your understanding of How to Use the OpenRouter API to Access Multiple AI Models via Python.
By completing this quiz, you’ll review how OpenRouter provides a unified routing layer, how to call multiple providers from a single Python script, how to switch models without changing your code, and how to compare outputs.
It also reinforces practical skills for making API requests in Python, handling authentication, and processing responses. For deeper guidance, review the tutorial linked above.
[ Improve Your Python With đ Python Tricks đ â Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]
scikit-learn
Update on array API adoption in scikit-learn
Note: this blog post is a cross-post of a Quansight Labs blog post.
The Consortium for Python Data API Standards developed the Python array API standard to define a consistent interface for array libraries, specifing core operations, data types, and behaviours. This enables âarray-consumingâ libraries (such as scikit-learn) to write array-agnostic code that can be run on any array API compliant backend. Adopting array API support in scikit-learn means that users can pass arrays from any array API compliant library to functions that have been converted to be array-agnostic. This is useful because it allows users to take advantage of array library features, such as hardware acceleration, most notably via GPUs.
Indeed, GPU support in scikit-learn has been of interest for a long time - 11 years ago, we added an entry to our FAQ page explaining that we had no plans to add GPU support in the near future due to the software dependencies and platform specific issues it would introduce. By relying on the array API standard, however, these concerns can now be avoided.
In this blog post, I will provide an update to the array API adoption work in scikit-learn, since itâs initial introduction in version 1.3 two years ago. Thomas Fanâs blog post provides details on the status when array API support was initially added.
Current status
Since the introduction of array API support in version 1.3 of scikit-learn, several key developments have followed.
Vendoring array-api-compat and array-api-extra
Scikit-learn now vendors both
array-api-compat and
array-api-extra.
array-api-compat is a wrapper around common array libraries (e.g., PyTorch,
CuPy, JAX) that bridges gaps to ensure compatibility with the standard. It
enables adoption of backwards incompatible changes while still allowing array
libraries time to adopt the standard slowly. array-api-extra provides array
functions not included in the standard but deemed useful for array-consuming
libraries.
We chose to vendor these now much more mature libraries in order to avoid the complexity of conditionally handling optional dependencies throughout the codebase. This approach also follows precedent, as SciPy also vendors these packages.
Array libraries supported
Scikit-learn currently supports CuPy ndarrays, PyTorch tensors (testing against all devices: âcpuâ, âcudaâ, âmpsâ and âxpuâ) and NumPy arrays. JAX support is also on the horizon. The main focus of this work is addressing in-place mutations in the codebase. Follow PR #29647 for updates.
Beyond these libraries, scikit-learn also tests against array-api-strict, a
reference implementation that strictly adheres to the array API specification.
The purpose of array-api-strict is to help automate compliance checks for
consuming libraries and to enable development and testing of array
API functionality without the need for GPU or other specialized hardware.
Array libraries that conform to the standard and pass the array-api-tests suite
should be accepted by scikit-learn and SciPy, without any additional modifications
from maintainers.
Estimators and metrics with array API support
The full list of metrics and estimators that now support array API can be
found in our
Array API support
documentation page. The majority of high impact metrics have now been
converted to be array API compatible. Many transformers are also now
supported, notably LabelBinarizer which is widely used internally and
simplifies other conversions.
Conversion of estimators is much more complicated as it often involves
benchmarking different variations of code or consensus gathering on
implementation choices. It generally requires many months of work by several
maintainers. Nonetheless, support for LogisticRegression, GaussianNB,
GaussianMixture, Ridge (and family: RidgeCV, RidgeClassifier,
RidgeClassifierCV), Nystroem and PCA has been added. Work on
GaussianProcessRegressor is also underway (follow at
PR #33096).
Handling mixed array namespaces and devices
scikit-learn takes a unique approach among âarray-consumingâ libraries by supporting mixed array namespace and device inputs. This design choice enables the framework to handle the practical complexities of end-to-end machine learning pipelines.
String-valued class labels are common in classification tasks and enable users to work with interpretable categories rather than integer codes. NumPy is currently the only array library with string array support, meaning that any workflow involving both GPU-accelerated computation and string labels necessarily involves mixed array type inputs.
Mixed array input support also enables flexible pipeline workflows. Pipelines
provide significant value by chaining preprocessing steps and estimators into
reusable workflows that prevent data leakage and ensure consistent
preprocessing. However, they have an intentional design limitation: pipeline
steps can transform feature arrays (X) but cannot modify target arrays
(y). Allowing mixed array inputs means a pipeline can include a
FunctionTransformer step that moves feature data from CPU to GPU to leverage
hardware acceleration, while allowing the target array, which cannot be
modified, to remain on CPU.
For example, mixed array inputs enable a pipeline where string classification
features are encoded on CPU (as only NumPy supports string arrays), converted
to torch CUDA tensors, then passed to the array API-compatible
RidgeClassifier for GPU-accelerated computation:
from functools import partial
from sklearn.linear_model import RidgeClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer, TargetEncoder
pipeline = make_pipeline(
# Encode string categories with average target values
TargetEncoder(),
# Convert feature array `X` to Torch CUDA device
FunctionTransformer(partial(torch.asarray, dtype="float32", device="cuda"))
RidgeClassifier(solver="svd"),
)
Work on adding mixed array type inputs for metrics and estimators is underway and expected to progress quickly. This work includes developing a robust testing framework, including for pipelines using mixed array types (follow PR #32755 for details).
Finally, we have also revived our work to support the ability to fit and predict on different namespaces/devices. This allows users to train models on GPU hardware but deploy predictions on CPU hardware, optimizing costs and accommodating different resource availability between training and production environments. Follow PR #33076 for details.
Challenges
The challenges of array API adoption remain largely unchanged from when this work began. These are also common to other array-consuming libraries, with a notable addition: the need to handle array movement between namespaces and devices to support mixed array type inputs.
Array API Standard is a subset of NumPyâs API
The array API standard only includes widely-used functions implemented across most array libraries, meaning many NumPy functions are absent. When such a function is encountered while adding array API support, we have the following options:
- add the function to
array-api-extra- this allows other array-consuming libraries to benefit and allows sharing of maintenance burden, but is only relevant for more widely used functions - add our own implementation in scikit-learn - these functions live in
sklearn/utils/_array_api.py - check if SciPy implements an array API compatible version of the function
The quantile function illustrates this decision-making process. quantile
is not included in the standard as it is not widely used (outside of
scikit-learn) and while it is implemented in most array libraries, the
set of quantile methods supported and their APIs vary. Currently, scikit-learn maintains its
own array API compatible version that supports both weights and NaNs, but due
to the maintenance burden we decided to investigate alternatives. SciPy has an
array API compatible implementation, but it did not support weights. We thus
investigated adding quantile to array-api-extra; however, during this
effort, SciPy decided to add weight support. Thus, we ultimately decided to
transition to the SciPy implementation once our minimum SciPy version allows.
Compiled code
Many performance-critical parts of scikit-learn are written using compiled code extensions in Cython, C or C++. These directly access the underlying memory buffers of NumPy arrays and are thus restricted to CPU.
Metrics and estimators, with compiled code, handle this in one of two ways:
convert arrays to NumPy first or maintain two parallel branches of code, one
for NumPy (compiled) and one for other array types (array API compatible).
When performance is less critical or array API conversion provides no gains
(e.g., confusion_matrix), we convert to NumPy. When performance gains are
significant, we accept the maintenance burden of dual code paths. This was the case for
LogisticRegression and the extensive process required for making such implementation
decisions can be seen in the
PR #32644.
Unspecified behaviour in the standard
The array API standard intentionally leaves some function behaviors
unspecified, permitting implementation differences across array libraries. For
example, the order of unique elements is not specified for the unique_*
functions and as of NumPy version 2.3, some unique_* functions no longer
return sorted values. This will require code amendments in cases where sorted
output was relied upon.
Similarly, NaN handing is also unspecified for sort; however, in this case, all
array libraries currently supported by scikit-learn follow NumPyâs NaN
semantics, placing NaNs at the end. This consistency eliminates the need for
special handling code, though comprehensive testing remains essential when
adding support for new array libraries.
Device transfer
Mixed array namespace and device inputs necessitates conversion of arrays between different namespaces and devices. This presented a number of considerations and challenges.
The array API standard adopted DLPack as the recommended data interchange protocol. This protocol is widely implemented in array libraries and offers an efficient, C ABI compatible protocol for array conversion. While this provided us with an easy way to implement these transfers, there were limitations. Cross-device transfer capability was only introduced in DLPack v1, released in September 2024. This meant that only the latest PyTorch and CuPy versions have support for DLPack v1. Moreover, not all array libraries have adopted support yet. We therefore implemented a âmanualâ fallback; however, this requires conversion via NumPy when the transfer involves two non-NumPy arrays. Additionally, there are no DLPack tests in array-api-tests, a testing suite to verify standard compliance, leaving DLPack implementation bugs easier to overlook. Despite these challenges, scikit-learn will benefit from future improvements, such as addition of a C-level API for DLPack exchange that bypasses Python function calls, offering significant benefit for GPU applications.
Beyond the technical considerations, there were also user interface considerations. How should we inform users that these conversions, which incur memory and performance cost, are occurring? We decided against warnings, which risk being ignored or becoming a nuisance, and to instead clearly document this behaviour. Additionally, different devices have different data type limitations; for example, Apple MPS only supports float32. How best to handle these differences when performing conversions while ensuring users are informed of precision impacts is an ongoing consideration.
A quick benchmark
Array API support for Ridge regression was added in version 1.5, enabling
GPU-accelerated linear models in scikit-learn. Combined with support of
several transformers, this allows for complete preprocessing and estimation
pipelines on GPU.
The following benchmark shows the use of the MaxAbsScaler transformed
followed by Ridge regression using randomly generated data with 500,000
samples and 300 features. The benchmarks were run on AMD Ryzen Threadripper
2970WX CPU, NVIDIA Quadro RTX 8000 GPU and Apple M4 GPU (Metal 3).
The figure below shows the performance speed up on CuPy, Torch CPU and Torch GPU relative to NumPy.

Performance speedup relative to NumPy across different backends.
The observed speedups are representative of performance gains achievable with sufficiently large datasets on datacenter-grade GPUs for linear algebra-intensive workloads. Mobile GPUs, such as those in laptops, would typically yield more modest improvements.
Note that scikit-learnâs Ridge regressor currently only supports âsvdâ
solver. We selected this solver for initial implementation as it exclusively
uses standard-compliant functions available across all backends and is the
most stable solver. Support for the âcholeskyâ solver is also underway (see
details in PR #29318).
Looking forward
As of version 1.8, array API support is still in experimental mode and thus not enabled by default. However, we welcome early adopters and interested users to try it and report any issues. See our documentation for details on enabling array API support.
Before removing experimental status, we would like to:
- develop a system for automatically documenting functions and classes that support array API, potentially with the ability to add relevant details
- mixed array type input support
- support fit and predict on different hardware by allowing conversion of fitted estimators between namespaces/devices using utility functions
- improved testing, in particular for the new mixed array type functionalities
- improved documentation, including adding an example to our gallery
- decide on the minimal dependency versions required
- get real world user feedback
Alongside these infrastructure and framework improvements, we look forward to adding support for more estimators. These improvements will deliver production-ready GPU support and flexible deployment options to scikit-learn users. We welcome community involvement through testing and feedback throughout this development phase.
Acknowledgements
Work on array API in scikit-learn has been a combined effort from many contributors. This work was partly funded by CZI and NASA Roses.
I would like to thank Olivier Grisel, Tim Head and Evgeni Burovski for helping me with my array API questions.
Armin Ronacher
AI And The Ship of Theseus
Because code gets cheaper and cheaper to write, this includes re-implementations. I mentioned recently that I had an AI port one of my libraries to another language and it ended up choosing a different design for that implementation. In many ways, the functionality was the same, but the path it took to get there was different. The way that port worked was by going via the test suite.
Something related, but different, happened with chardet. The current maintainer reimplemented it from scratch by only pointing it to the API and the test suite. The motivation: enabling relicensing from LGPL to MIT. I personally have a horse in the race here because I too wanted chardet to be under a non-GPL license for many years. So consider me a very biased person in that regard.
Unsurprisingly, that new implementation caused a stir. In particular, Mark Pilgrim, the original author of the library, objects to the new implementation and considers it a derived work. The new maintainer, who has maintained it for the last 12 years, considers it a new work and instructs his coding agent to do precisely that. According to author, validating with JPlag, the new implementation is distinct. If you actually consider how it works, that’s not too surprising. It’s significantly faster than the original implementation, supports multiple cores and uses a fundamentally different design.
What I think is more interesting about this question is the consequences of where we are. Copyleft code like the GPL heavily depends on copyrights and friction to enforce it. But because it’s fundamentally in the open, with or without tests, you can trivially rewrite it these days. I myself have been intending to do this for a little while now with some other GPL libraries. In particular I started a re-implementation of readline a while ago for similar reasons, because of its GPL license. There is an obvious moral question here, but that isn’t necessarily what I’m interested in. For all the GPL software that might re-emerge as MIT software, so might be proprietary abandonware.
For me personally, what is more interesting is that we might not even be able to copyright these creations at all. A court still might rule that all AI-generated code is in the public domain, because there was not enough human input in it. That’s quite possible, though probably not very likely.
But this all causes some interesting new developments we are not necessarily ready for. Vercel, for instance, happily re-implemented bash with Clankers but got visibly upset when someone re-implemented Next.js in the same way.
There are huge consequences to this. When the cost of generating code goes down that much, and we can re-implement it from test suites alone, what does that mean for the future of software? Will we see a lot of software re-emerging under more permissive licenses? Will we see a lot of proprietary software re-emerging as open source? Will we see a lot of software re-emerging as proprietary?
It’s a new world and we have very little idea of how to navigate it. In the interim we will have some fights about copyrights but I have the feeling very few of those will go to court, because everyone involved will actually be somewhat scared of setting a precedent.
In the GPL case, though, I think it warms up some old fights about copyleft vs permissive licenses that we have not seen in a long time. It probably does not feel great to have one’s work rewritten with a Clanker and one’s authorship eradicated. Unlike the Ship of Theseus, though, this seems more clear-cut: if you throw away all code and start from scratch, even if the end result behaves the same, it’s a new ship. It only continues to carry the name. Which may be another argument for why authors should hold on to trademarks rather than rely on licenses and contract law.
I personally think all of this is exciting. I’m a strong supporter of putting things in the open with as little license enforcement as possible. I think society is better off when we share, and I consider the GPL to run against that spirit by restricting what can be done with it. This development plays into my worldview. I understand, though, that not everyone shares that view, and I expect more fights over the emergence of slopforks as a result. After all, it combines two very heated topics, licensing and AI, in the worst possible way.
March 04, 2026
PyCharm
Cursor is now available as an AI agent inside JetBrains IDEs through the Agent Client Protocol. Select it from the agent picker, and it has full access to your project.
If you’ve spent any time in the AI coding space, you already know Cursor. It has been one of the most requested additions to the ACP Registry.
What you get
Cursor is known for its AI-native, agentic workflows. JetBrains IDEs are valued for deep code intelligence â refactoring, debugging, code quality checks, and the tooling professionals rely on at scale. ACP brings the two together.
You can now use Cursor’s agentic capabilities directly inside your JetBrains IDE â within the workflows and features you already use.
A growing open ecosystem
Cursor joins a growing list of agents available through ACP in JetBrains IDEs. Every new addition to the ACP Registry means you have more choice â while still working inside the IDE you already rely on. You get access to frontier models from major providers, including OpenAI, Anthropic, Google, and now also Cursor.
This is part of our open ecosystem strategy. Plug in the agents you want and work in the IDE you love â without getting locked into a single solution.
Cursor is focused on building the best way to build software with AI. By integrating Cursor with JetBrains IDEs, we’re excited to provide teams with powerful agentic capabilities in the environments where they’re already working.
â Jordan Topoleski, COO at Cursor
Get started
You need version 2025.3.2 or later of your JetBrains IDE with the AI Assistant plugin enabled. From there, open the agent selector, select Install from ACP RegistryâŠ, install Cursor, and start working. You donât need a JetBrains AI subscription to use Cursor as an AI agent.
The ACP Registry keeps growing, and many agents have already joined it â with more on the way. Try it today with Cursor and experience agent-driven development inside your JetBrains IDE. For more information about the Agent Client Protocol, see our original announcement and the blog post on the ACP Agent Registry support.
Python Morsels
Invent your own comprehensions in Python
Python doesn't have tuple, frozenset, or Counter comprehensions, but you can invent your own by passing a generator expression to any iterable-accepting callable.
Table of contents
Generator expressions pair nicely with iterable-accepting callables
Generator expressions work really nicely with Python's any and all functions:
>>> numbers = [2, 1, 3, 4, 7, 11, 18]
>>> any(n > 1 for n in numbers)
True
>>> all(n > 1 for n in numbers)
False
In fact, I rarely see any and all used without a generator expression passed to them.
Note that generator expressions are made with parentheses:
>>> (n**2 for n in numbers)
<generator object <genexpr> at 0x74c535589b60>
But when a generator expression is the sole argument passed into a function:
>>> all((n > 1 for n in numbers))
False
The double set of parentheses (one to form the generator expression and one for the function call) can be turned into just a single set of parentheses:
>>> all(n > 1 for n in numbers)
False
This special allowance was added to Python's syntax because it's very common to see generator expressions passed in as the sole argument to specific functions.
Note that passing generator expressions into iterable-accepting functions and classes makes something that looks a bit like a custom comprehension. Every iterable-accepting function/class is a comprehension-like tool waiting to happen.
Tuple comprehensions
Python does not have tuple âŠ
Read the full article: https://www.pythonmorsels.com/custom-comprehensions/
PyCharm
Cursor Joined the ACP Registry and Is Now Live in Your JetBrains IDE
Real Python
How to Use the OpenRouter API to Access Multiple AI Models via Python
One of the quickest ways to call multiple AI models from a single Python script is to use OpenRouterâs API, which acts as a unified routing layer between your code and multiple AI providers. By the end of this guide, youâll access models from several providers through one unified API, as shown in the image below:
OpenRouter Unified API Running Multiple AI Models
This convenience matters because the AI ecosystem is highly fragmented: each provider exposes its own API, authentication scheme, rate limits, and model lineup. Working with multiple providers often requires additional setup and integration effort, especially when you want to experiment with different models, compare outputs, or evaluate trade-offs for a specific task.
OpenRouter gives you access to thousands of models from leading providers such as OpenAI, Anthropic, Mistral, Google, and Meta. You switch between them without changing your application code.
Get Your Code: Click here to download the free sample code that shows you how to use the OpenRouter API to access multiple AI models via Python.
Take the Quiz: Test your knowledge with our interactive âHow to Use the OpenRouter API to Access Multiple AI Models via Pythonâ quiz. Youâll receive a score upon completion to help you track your learning progress:
Interactive Quiz
How to Use the OpenRouter API to Access Multiple AI Models via PythonTest your Python skills with OpenRouter: learn unified API access, model switching, provider routing, and fallback strategies.
Prerequisites
Before diving into OpenRouter, you should be comfortable with Python fundamentals like importing modules, working with dictionaries, handling exceptions, and using environment variables. If youâre familiar with these basics, the first step is authenticating with OpenRouterâs API.
Step 1: Connect to OpenRouterâs API
Before using OpenRouter, you need to create an account and generate an API key. Some models require prepaid credits for access, but you can start with free access to test the API and confirm that everything is working.
To generate an API key:
- Create an account at OpenRouter.ai or sign in if you already have an account.
- Select Keys from the dropdown menu and create an API key.
- Fill in the name, something like OpenRouter Testing.
- Leave the remaining defaults and click Create.
Copy the generated key and keep it secure. In a moment, youâll store it as an environment variable rather than embedding it directly in your code.
To call multiple AI models from a single Python script, youâll use OpenRouterâs API. Youâll use the requests library to make HTTP calls, which gives you full control over the API interactions without requiring a specific SDK. This approach works with any HTTP client and keeps your code simple and transparent.
First, create a new directory for your project and set up a virtual environment. This isolates your project dependencies from your system Python installation:
$ mkdir openrouter-project/
$ cd openrouter-project/
$ python -m venv venv/
Now, you can activate the virtual environment:
You should see (venv) in your terminal prompt when itâs active. Now youâre ready to install the requests package for conveniently making HTTP calls:
(venv) $ python -m pip install requests
Read the full article at https://realpython.com/openrouter-api/ »
[ Improve Your Python With đ Python Tricks đ â Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]
Glyph Lefkowitz
What Is Code Review For?
Humans Are Bad At Perceiving
Humans are not particularly good at catching bugs. For one thing, we get tired easily. There is some science on this, indicating that humans canât even maintain enough concentration to review more than about 400 lines of code at a time..
We have existing terms of art, in various fields, for the ways in which the human perceptual system fails to register stimuli. Perception fails when humans are distracted, tired, overloaded, or merely improperly engaged.
Each of these has implications for the fundamental limitations of code review as an engineering practice:
-
Inattentional Blindness: you wonât be able to reliably find bugs that youâre not looking for.
-
Repetition Blindness: you wonât be able to reliably find bugs that you are looking for, if they keep occurring.
-
Vigilance Fatigue: you wonât be able to reliably find either kind of bugs, if you have to keep being alert to the presence of bugs all the time.
-
and, of course, the distinct but related Alert Fatigue: you wonât even be able to reliably evaluate reports of possible bugs, if there are too many false positives.
Never Send A Human To Do A Machineâs Job
When you need to catch a category of error in your code reliably, you will need a deterministic tool to evaluate â and, thanks to our old friend âalert fatigueâ above â ideally, to also remedy that type of error. These tools will relieve the need for a human to make the same repetitive checks over and over. None of them are perfect, but:
- to catch logical errors, use automated tests.
- to catch formatting errors, use autoformatters.
- to catch common mistakes, use linters.
- to catch common security problems, use a security scanner.
Donât blame reviewers for missing these things.
Code review should not be how you catch bugs.
What Is Code Review For, Then?
Code review is for three things.
First, code review is for catching process failures. If a reviewer has noticed a few bugs of the same type in code review, thatâs a sign that that type of bug is probably getting through review more often than itâs getting caught. Which means itâs time to figure out a way to deploy a tool or a test into CI that will reliably prevent that class of error, without requiring reviewers to be vigilant to it any more.
Second â and this is actually its more important purpose â code review is a tool for acculturation. Even if you already have good tools, good processes, and good documentation, new members of the team wonât necessarily know about those things. Code review is an opportunity for older members of the team to introduce newer ones to existing tools, patterns, or areas of responsibility. If youâre building an observer pattern, you might not realize that the codebase youâre working in already has an existing idiom for doing that, so you wouldnât even think to search for it, but someone else who has worked more with the code might know about it and help you avoid repetition.
You will notice that I carefully avoided saying âjuniorâ or âseniorâ in that paragraph. Sometimes the newer team member is actually more senior. But also, the acculturation goes both ways. This is the third thing that code review is for: disrupting your teamâs culture and avoiding stagnation. If you have new talent, a fresh perspective can also be an extremely valuable tool for building a healthy culture. If youâre new to a team and trying to build something with an observer pattern, and this codebase has no tools for that, but your last job did, and it used one from an open source library, that is a good thing to point out in a review as well. Itâs an opportunity to spot areas for improvement to culture, as much as it is to spot areas for improvement to process.
Thus, code review should be as hierarchically flat as possible. If the goal of code review were to spot bugs, it would make sense to reserve the ability to review code to only the most senior, detail-oriented, rigorous engineers in the organization. But most teams already know that thatâs a recipe for brittleness, stagnation and bottlenecks. Thus, even though we know that not everyone on the team will be equally good at spotting bugs, it is very common in most teams to allow anyone past some fairly low minimum seniority bar to do reviews, often as low as âeveryone on the team who has finished onboardingâ.
Oops, Surprise, This Post Is Actually About LLMs Again
Sigh. Iâm as disappointed as you are, but there are no two ways about it: LLM code generators are everywhere now, and we need to talk about how to deal with them. Thus, an important corollary of this understanding that code review is a social activity, is that LLMs are not social actors, thus you cannot rely on code review to inspect their output.
My own personal preference would be to eschew their use entirely, but in the spirit of harm reduction, if youâre going to use LLMs to generate code, you need to remember the ways in which LLMs are not like human beings.
When you relate to a human colleague, you will expect that:
- you can make decisions about what to focus on based on their level of experience and areas of expertise to know what problems to focus on; from a late-career colleague you might be looking for bad habits held over from legacy programming languages; from an earlier-career colleague you might be focused more on logical test-coverage gaps,
- and, they will learn from repeated interactions so that you can gradually focus less on a specific type of problem once you have seen that theyâve learned how to address it,
With an LLM, by contrast, while errors can certainly be biased a bit by the prompt from the engineer and pre-prompts that might exist in the repository, the types of errors that the LLM will make are somewhat more uniformly distributed across the experience range.
You will still find supposedly extremely sophisticated LLMs making extremely common mistakes, specifically because they are common, and thus appear frequently in the training data.
The LLM also canât really learn. An intuitive response to this problem is to simply continue adding more and more instructions to its pre-prompt, treating that text file as its âmemoryâ, but that just doesnât work, and probably never will. The problem â âcontext rotâ is somewhat fundamental to the nature of the technology.
Thus, code-generators must be treated more adversarially than you would a human code review partner. When you notice it making errors, you always have to add tests to a mechanical, deterministic harness that will evaluates the code, because the LLM cannot meaningfully learn from its mistakes outside a very small context window in the way that a human would, so giving it feedback is unhelpful. Asking it to just generate the code again still requires you to review it all again, and as we have previously learned, you, a human, cannot review more than 400 lines at once.
To Sum Up
Code review is a social process, and you should treat it as such. When youâre reviewing code from humans, share knowledge and encouragement as much as you share bugs or unmet technical requirements.
If you must reviewing code from an LLM, strengthen your automated code-quality verification tooling and make sure that its agentic loop will fail on its own when those quality checks fail immediately next time. Do not fall into the trap of appealing to its feelings, knowledge, or experience, because it doesnât have any of those things.
But for both humans and LLMs, do not fall into the trap of thinking that your code review process is catching your bugs. Thatâs not its job.
Acknowledgments
Thank you to my patrons who are supporting my writing on this blog. If you like what youâve read here and youâd like to read more of it, or youâd like to support my various open-source endeavors, you can support my work as a sponsor!
Seth Michael Larson
Relative âDependency Cooldownsâ in pip v26.0 with crontab
WARNING: Most of this blog post is a hack, everyone should probably just wait for relative dependency cooldowns to come to a future version of pip.
pip v26.0 added support for the --uploaded-prior-to option.
This new option enables implementing âdependency cooldownsâ, a technique
described by William Woodruff, that provides simple but effective
protections for the relatively short attack-window time
of malware published to public software repositories. This brings the
reaction time to malware back within the realm of humans, who sometimes need to
execute manual triage processes to take down malware from PyPI.
So if you set --uploaded-prior-to to 7 days before this post
was published, February 25th, you'd do so like this:
python -m pip install \
--uploaded-prior-to=2026-02-25 \
urllib3
But this is only an absolute date, and we have to remember
to set the option on each call to pip install? That seems like
a lot of work!
Dependency cooldowns work best when the policy can be set in a global configuration file to a relative value like â7 daysâ. The âwindowâ of acceptable packages is then automatically updating over time without needing to set a new absolute value constantly. âSet-and-forgetâ-style.
uv allows setting a relative value via --exclude-newer,
but pip doesn't support relative ranges yet. I mostly use pip and still wanted to test this feature today, so I
created a little hack to update my user pip.conf configuration file
on a regular basis instead. Here's what my pip.conf file looks like:
[install]
uploaded-prior-to = 2026-02-25
And below is the entire Python script doing the updating. Quick reminder that I only tested this on my own system, your mileage may vary, do not use in production.
#!/usr/bin/python3
# License: MIT
import datetime
import sys
import os
import re
def main() -> int:
# Parse the command line options.
pip_conf = os.path.abspath(os.path.expanduser(sys.argv[1]))
days = int(sys.argv[2])
# Load the existing pip.conf file.
try:
with open(pip_conf, "r") as f:
pip_conf_data = f.read()
except FileNotFoundError:
print(f"Could not find pip.conf file at: {pip_conf}")
return 1
# Update the existing uploaded-prior-to value.
uploaded_prior_to_re = re.compile(
r"^uploaded-prior-to\s*=\s*2[0-9]{3}-[0-9]{2}-[0-9]{2}$", re.MULTILINE
)
if not uploaded_prior_to_re.search(pip_conf_data):
print("Could not find uploaded-prior-to option in pip.conf under [install]")
return 1
new_uploaded_prior_to = (
datetime.date.today() - datetime.timedelta(days=days)
).strftime("%Y-%m-%d")
pip_conf_data = uploaded_prior_to_re.sub(
f"uploaded-prior-to = {new_uploaded_prior_to}", pip_conf_data
)
# Write the new uploaded-prior-to
# value to pip.conf
with open(pip_conf, "w") as f:
f.write(pip_conf_data)
return 0
if __name__ == "__main__":
sys.exit(main())
The script takes two parameters, your pip.conf file you want
to update (typically ~/.config/pip/pip.conf on Linux) and
an integer number of days. I used 14 in my cron example below.
Simple right? I installed and chmod u+X-ed the script in my /usr/local/bin directory
and then added to my crontab using crontab -u (USERNAME) -e:
0 * * * * (/usr/local/bin/pip-dependency-cooldown /home/sethmlarson/.config/pip/pip.conf 14) 2>&1 | logger -t pip-dependency-cooldown
This pattern will run the script once per hour and update the value
of uploaded-prior-to to the new day. Now I only receive packages that
were published 14 or more days ago by default when running pip install without any other options.
Stay tuned for more about dependency cooldowns for Python installers once pip supports relative values.
Thanks for keeping RSS alive! â„
March 03, 2026
PyCoderâs Weekly
Issue #724: Unit Testing Performance, Ordering, FastAPI, and More (March 3, 2026)
#724 â MARCH 3, 2026
View in Browser »
Unit Testing: Catching Speed Changes
This second post in a series covers how to use unit testing to ensure the performance of your code. This post talks about catching differences in performance after code has changed.
ITAMAR TURNER-TRAURING
Lexicographical Ordering in Python
Python lexicographically orders tuples, strings, and all other sequences, comparing element-by-element. Learn what this means when you compare values or sort.
TREY HUNNER
A Cheaper Heroku? See for Yourself
Is PaaS too expensive for your Django app? We built a comparison calculator that puts the fully-managed hosting options head-to-head â
JUDOSCALE sponsor
Start Building With FastAPI
Learn how to build APIs with FastAPI in Python, including Pydantic models, HTTP methods, CRUD operations, and interactive documentation.
REAL PYTHON course
Python Jobs
Python + AI Content Specialist (Anywhere)
Articles & Tutorials
Serving Private Files With Django and S3
Djangoâs FileField and ImageField are good at storing files, but on their own they donât let you control access. Serving files from S3 just makes this more complicated. Learn how to secure a file behind your login wall.
RICHARD TERRY
FastAPI Error Handling: Types, Methods, and Best Practices
FastAPI provides various error-handling mechanisms to help you build robust applications. With built-in validation models, exceptions, and custom exception handlers, you can build robust and scalable FastAPI applications.
HONEYBADGER.IO âą Shared by Addison Curtis
CLI Subcommands With Lazy Imports
Python 3.15 will support lazy imports, meaning modules don’t get pulled in until they are needed. This can be particularly useful with Command Line Interfaces where a subcommand doesn’t need everything to be useful.
BRETT CANNON
How the Self-Driving Tech Stack Works
A technical guide to how self-driving cars actually work. CAN bus protocols, neural networks, sensor fusion, and control system with open source implementations, most of which can be accessed through Python.
CARDOG
Managing Shared Data Science Code With Git Submodules
Learn how to manage shared code across projects using Git submodules. Prevent version drift, maintain reproducible workflows, and support team collaboration with practical examples.
CODECUT.AI âą Shared by Khuyen Tran
Datastar: Modern Web Dev, Simplified
Talk Python interviews Delaney Gillilan, Ben Croker, and Chris May about the Datastar framework, a library that combines the concepts of HTMX, Alpine, and more.
TALK PYTHON podcast
Introducing the Zen of DevOps
Inspired by the Zen of Python, Tibo has written a Zen of DevOps, applying similar ideas from your favorite language to the world of servers and deployment.
TIBO BEIJEN
Stop Using Pickle Already. Seriously, Stop It!
Python’s Pickle is insecure by design, so using it in public facing code is highly problematic. This article explains why and suggests alternatives.
MICHAL NAZAREWICZ
Raw+DC: The ORM Pattern of 2026?
After 25+ years championing ORMs, Michael has switched to raw database queries paired with Python dataclasses. This post explains why.
MICHAEL KENNEDY
Projects & Code
Events
Weekly Real Python Office Hours Q&A (Virtual)
March 4, 2026
REALPYTHON.COM
Python Unplugged on PyTV
March 4 to March 5, 2026
JETBRAINS.COM
Canberra Python Meetup
March 5, 2026
MEETUP.COM
Sydney Python User Group (SyPy)
March 5, 2026
SYPY.ORG
PyDelhi User Group Meetup
March 7, 2026
MEETUP.COM
PyConf Hyderabad 2026
March 14 to March 16, 2026
PYCONFHYD.ORG
Happy Pythoning!
This was PyCoder’s Weekly Issue #724.
View in Browser »
[ Subscribe to đ PyCoder’s Weekly đ â Get the best Python news, articles, and tutorials delivered to your inbox once a week >> Click here to learn more ]
Rodrigo GirĂŁo SerrĂŁo
TIL #140 â Install Jupyter with uv
Today I learned how to install jupyter properly while using uv to manage tools.
Running a Jupyter notebook server or Jupyter lab
To run a Jupyter notebook server with uv, you can run the command
$ uvx jupyter notebook
Similarly, if you want to run Jupyter lab, you can run
$ uvx jupyter lab
Both work, but uv will kindly present a message explaining how it's actually doing you a favour, because it guessed what you wanted.
That's because uvx something usually looks for a package named âsomethingâ with a command called âsomethingâ.
As it turns out, the command jupyter comes from the package jupyter-core, not from the package jupyter.
Installing Jupyter
If you're running Jupyter notebooks often, you can install the notebook server and Jupyter lab with
$ uv tool install --with jupyter jupyter-core
Why uv tool install jupyter fails
Running uv tool install jupyter fails because the package jupyter doesn't provide any commands by itself.
Why uv tool install jupyter-core doesn't work
The command uv tool install jupyter-core looks like it works because it installs the command jupyter correctly.
However, if you use --help you can see that you don't have access to the subcommands you need:
$ uv tool install jupyter-core
...
Installed 3 executables: jupyter, jupyter-migrate, jupyter-troubleshoot
$ jupyter --help
...
Available subcommands: book migrate troubleshoot
That's because the subcommands notebook and lab are from the package jupyter.
The solution?
Install jupyter-core with the additional dependency jupyter, which is what the command uv tool install --with jupyter jupyter-core does.
Other usages of Jupyter
The uv documentation has a page dedicated exclusively to the usage of uv with Jupyter, so check it out for other use cases of the uv and Jupyter combo!
Django Weblog
Django security releases issued: 6.0.3, 5.2.12, and 4.2.29
In accordance with our security release policy, the Django team is issuing releases for Django 6.0.3, Django 5.2.12, and Django 4.2.29. These releases address the security issues detailed below. We encourage all users of Django to upgrade as soon as possible.
CVE-2026-25673: Potential denial-of-service vulnerability in URLField via Unicode normalization on Windows
The django.forms.URLField form field's to_python() method used urllib.parse.urlsplit() to determine whether to prepend a URL scheme to the submitted value. On Windows, urlsplit() performs NFKC normalization (unicodedata.normalize), which can be disproportionately slow for large inputs containing certain characters.
URLField.to_python() now uses a simplified scheme detection, avoiding Unicode normalization entirely and deferring URL validation to the appropriate layers. As a result, while leading and trailing whitespace is still stripped by default, characters such as newlines, tabs, and other control characters within the value are no longer handled by URLField.to_python(). When using the default URLValidator, these values will continue to raise ValidationError during validation, but if you rely on custom validators, ensure they do not depend on the previous behavior of URLField.to_python().
This issue has severity "moderate" according to the Django Security Policy.
Thanks to Seokchan Yoon for the report.
CVE-2026-25674: Potential incorrect permissions on newly created file system objects
Django's file-system storage and file-based cache backends used the process umask to control permissions when creating directories. In multi-threaded environments, one thread's temporary umask change can affect other threads' file and directory creation, resulting in file system objects being created with unintended permissions.
Django now applies the requested permissions via os.chmod() after os.mkdir(), removing the dependency on the process-wide umask.
This issue has severity "low" according to the Django Security Policy.
Thanks to Tarek Nakkouch for the report.
Affected supported versions
- Django main
- Django 6.0
- Django 5.2
- Django 4.2
Resolution
Patches to resolve the issue have been applied to Django's main, 6.0, 5.2, and 4.2 branches. The patches may be obtained from the following changesets.
CVE-2026-25673: Potential denial-of-service vulnerability in URLField via Unicode normalization on Windows
- On the main branch
- On the 6.0 branch
- On the 5.2 branch
- On the 4.2 branch
CVE-2026-25674: Potential incorrect permissions on newly created file system objects
- On the main branch
- On the 6.0 branch
- On the 5.2 branch
- On the 4.2 branch
The following releases have been issued
- Django 6.0.3 (download Django 6.0.3 | 6.0.3 checksums)
- Django 5.2.12 (download Django 5.2.12 | 5.2.12 checksums)
- Django 4.2.29 (download Django 4.2.29 | 4.2.29 checksums)
The PGP key ID used for this release is Natalia Bidart: 2EE82A8D9470983E
General notes regarding security reporting
As always, we ask that potential security issues be reported via private email to security@djangoproject.com, and not via Django's Trac instance, nor via the Django Forum. Please see our security policies for further information.
Real Python
What Does Python's __init__.py Do?
Python’s special __init__.py file marks a directory as a regular Python package and allows you to import its modules. This file runs automatically the first time you import its containing package. You can use it to initialize package-level variables, define functions or classes, and structure the package’s namespace clearly for users.
By the end of this video course, you’ll understand that:
- A directory without an
__init__.pyfile becomes a namespace package, which behaves differently from a regular package and may cause slower imports. - You can use
__init__.pyto explicitly define a package’s public API by importing specific modules or functions into the package namespace. - The Python convention of using leading underscores helps indicate to users which objects are intended as non-public, although this convention can still be bypassed.
- Code inside
__init__.pyruns only once during the first import, even if you run the import statement multiple times.
Understanding how to effectively use __init__.py helps you structure your Python packages in a clear, maintainable way, improving usability and namespace management.
[ Improve Your Python With đ Python Tricks đ â Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]
Quiz: Duck Typing in Python: Writing Flexible and Decoupled Code
In this quiz, you’ll test your understanding of Duck Typing in Python: Writing Flexible and Decoupled Code.
By working through this quiz, you’ll revisit what duck typing is and its pros and cons, how Python uses behavior-based interfaces, how protocols and special methods support it, and what alternatives you can use in Python.
[ Improve Your Python With đ Python Tricks đ â Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]
PyBites
Why Building a Production RAG Pipeline is Easier Than You Think
Adding AI to legacy code doesn’t have to be a challenge.
Many devs are hearing this right now: âWe need to add AI to the app.â
And for many of them, panic ensues.
The assumption is that you have to rip your existing architecture down to its foundation. You start having nightmares about standing up complex microservices, massive AWS bills, and spending six months learning the intricate math behind vector embeddings.
It feels like a monumental risk to your stable, production-ready codebase, right?
Here’s the current reality though: adding AI to an existing application doesn’t actually require a massive rewrite.
If you have solid software design fundamentals, integrating a Retrieval-Augmented Generation (RAG) pipeline is entirely within your reach.
Here’s how you do it without breaking everything you’ve already built.
Get the Python Stack to do the Heavy Lifting
You don’t need to build your AI pipeline from scratch. The Python ecosystem has matured to the point where the hardest parts of a RAG pipeline are already solved for you.
- Need to parse massive PDFs? Libraries like
doclinghandle it practically out of the box. - Need to convert text into embeddings and store them? You let the LLM provider handle the embedding math, and you drop the results into a vector database.
It’s not a huge technical challenge in Python. It is just an orchestration of existing, powerful tools.
Stop Coding and Start Orchestrating
When a developer builds a RAG pipeline and the AI starts hallucinating, their first instinct is to dive into the code. They try to fix the API calls or mess with the vector search logic.
Take a step back. The code usually isn’t the problem.
The system prompt is the conductor of your entire RAG pipeline. It dictates how the LLM interacts with your vector database. If you’re getting bad results, you don’t need to rewrite your Python logic – you need to refine your prompt through trial and error to strict, data-grounded constraints.
Beat Infrastructure Constraints by Offloading
What if your app is hosted on something lightweight, like Heroku, with strict size and memory limits? You might think you need to containerise everything and migrate to a heavier cloud setup.
Nope! You just need to separate your concerns.
Indexing documents and generating embeddings is heavy work. Querying is light. Offload the heavy lifting (like storing and searching vectors) entirely to your vector database service (like Weaviate). This keeps your core app lightweight, so it only acts as the middleman routing the query.
We Broke Down Exactly How This Works With Tim Gallati
We explored the reality of this architecture with Tim Gallati and Pybites AI coach Juanjo on the podcast. Tim had an existing app, Quiet Links, running on Heroku.
In just six weeks with us, he integrated a massive, production-ready RAG pipeline into it, without breaking his existing user experience.
If you want to hear the full breakdown of how they architected this integration, listen to the episode using the player above, or at the following links:
March 02, 2026
Python Morsels
When are classes used in Python?
While you don't often need to make your own classes in Python, they can sometimes make your code reusable and easier to read.
Using classes because a framework requires it
Unlike many programming languages, you can accomplish quite a bit in Python without ever making a class.
So when are classes typically used?
The most common time to use a class in Python is when using a library or framework that requires writing a class.
For example, users of the Django web framework need to make a class to allow Django to manage their database tables:
from django.db import models
class Product(models.Model):
name = models.CharField(max_length=300)
price = models.DecimalField(max_digits=10, decimal_places=2)
description = models.TextField()
def __str__(self):
return self.name
With Django, each Model class is used for managing one table in the database.
Each instance of every model represents one row in each database table:
>>> from products.models import Product
>>> Product.objects.get(id=1)
<Product: rubber duck>
Many Python programmers create their first class because a library requires them to make one.
Passing the same data into multiple functions
But sometimes you may choose âŠ
Read the full article: https://www.pythonmorsels.com/when-are-classes-used/
Brett Cannon
State of WASI support for CPython: March 2026
It&aposs been a while since I posted about WASI support in CPython! 😅 Up until now, most of the work I have been doing around WASI has been making its maintenance easier for me and other core developers. For instance, the cpython-devcontainer repo now provides a WASI dev container so people don&apost have to install the WASI SDK to be productive (e.g. there&aposs a WASI codespace now so you can work on WASI entirely from your browser without installing anything). All this work around making development easier not only led to having WASI instructions in the devguide and the creation of a CLI app, but also a large expansion of how to use containers to do CPython development.
But the main reason I&aposm blogging today is that PEP 816 was accepted! That PEP defines how WASI compatibility will be handled starting with Python 3.15. The key point is that once the first beta for a Python version is reached, the WASI and WASI SDK versions that will be supported for the life of that Python version will be locked down. That gives the community a target to build packages for since things built with the WASI SDK are not forwards- or backwards-compatible for linking purposes due to wasi-libc not having any compatibility guarantees.
With that PEP out of the way, the next big items on my WASI todo list are (in rough order):
- Implement a subcommand to bundle up a WASI build for distribution
- Write a PEP to define a platform tag for wheels
- Implement a subcommand to build dependencies for CPython (e.g. zlib)
- Turn on socket support (which requires WASI 0.3 and threading support to be released as I&aposm skipping over WASI 0.2)
Rodrigo GirĂŁo SerrĂŁo
TIL #139 â Multiline input in the REPL
Today I learned how to do multiline input in the REPL using an uncommon combination of arguments for the built-in open.
A while ago I learned I could use open(0) to open standard input.
This unlocks a neat trick that allows you to do multiline input in the REPL:
>>> msg = open(0).read()
Hello,
world!
^D
>>> msg
'Hello,\nworld!\n'
The cryptic ^D is Ctrl+D, which means EOF on Unix systems.
If you're on Windows, use Ctrl+Z.
The problem is that if you try to use open(0).read() again to read more multiline input, you get an exception:
OSError: [Errno 9] Bad file descriptor
That's because, when you finished reading the first time around, Python closed the file descriptor 0, so you can no longer use it.
The fix is to set closefd=False when you use the built-in open.
With the parameter closefd set to False, the underlying file descriptor isn't closed and you can reuse it:
>>> msg1 = open(0, closefd=False).read()
Hello,
world!
^D
>>> msg1
'Hello,\nworld!\n'
>>> msg2 = open(0, closefd=False).read()
Goodbye,
world!
^D
>>> msg2
'Goodbye,\nworld!\n'
By using open(0, closefd=False), you can read multiline input in the REPL repeatedly.
Real Python
Automate Python Data Analysis With YData Profiling
The YData Profiling package generates an exploratory data analysis (EDA) report with a few lines of code. The report provides dataset and column-level analysis, including plots and summary statistics to help you quickly understand your dataset. These reports can be exported to HTML or JSON so you can share them with other stakeholders.
By the end of this tutorial, youâll understand that:
- YData Profiling generates interactive reports containing EDA results, including summary statistics, visualizations, correlation matrices, and data quality warnings from DataFrames.
ProfileReportcreates a profile you can save with.to_file()for HTML or JSON export, or display inline with.to_notebook_iframe().- Setting
tsmode=Trueand specifying a date column withsortbyenables time series analysis, including stationarity tests and seasonality detection. - The
.compare()method generates side-by-side reports highlighting distribution shifts and statistical differences between datasets.
To get the most out of this tutorial, youâll benefit from having knowledge of pandas.
Note: The examples in this tutorial were tested using Python 3.13. Additionally, you may need to install setuptools<81 for backward compatibility.
You can install this package using pip:
$ python -m pip install ydata-profiling
Once installed, youâre ready to transform any pandas DataFrame into an interactive report. To follow along, download the example dataset youâll work with by clicking the link below:
Get Your Code: Click here to download the free sample code and start automating Python data analysis with YData Profiling.
The following example generates a profiling report from the 2024 flight delay dataset and saves it to disk:
flight_report.py
import pandas as pd
from ydata_profiling import ProfileReport
df = pd.read_csv("flight_data_2024_sample.csv")
profile = ProfileReport(df)
profile.to_file("flight_report.html")
This code generates an HTML file containing interactive visualizations, statistical summaries, and data quality warnings:
You can open the file in any browser to explore your dataâs characteristics without writing additional analysis code.
There are a number of tools available for high-level dataset exploration, but not all are built for the same purpose. The following table highlights a few common options and when each one is a good fit:
| Use case | Pick | Best for |
|---|---|---|
| You want to quickly generate an exploratory report | ydata-profiling |
Generating exploratory data analysis reports with visualizations |
| You want an overview of a large dataset | skimpy or df.describe() |
Providing fast, lightweight summaries in the console |
| You want to enforce data quality | pandera |
Validating schemas and catching errors in data pipelines |
Overall, YData Profiling is best used as an exploratory report creation tool. If youâre looking to generate an overview for a large dataset, using SkimPy or a built-in DataFrame library method may be more efficient. Other tools, like Pandera, are more appropriate for data validation.
If YData Profiling looks like the right choice for your use case, then keep reading to learn about its most important features.
Take the Quiz: Test your knowledge with our interactive âAutomate Python Data Analysis With YData Profilingâ quiz. Youâll receive a score upon completion to help you track your learning progress:
Interactive Quiz
Automate Python Data Analysis With YData ProfilingTest your knowledge of YData Profiling, including report creation, customization, performance optimization, time series analysis, and comparisons.
Building a Report With YData Profiling
Read the full article at https://realpython.com/ydata-profiling-eda/ »
[ Improve Your Python With đ Python Tricks đ â Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]
Rodrigo GirĂŁo SerrĂŁo
Remove extra spaces
Learn how to remove extra spaces from a string using regex, string splitting, a fixed point, and itertools.groupby.
In this article you'll learn about three different ways in which you can remove extra spaces from the middle of a string. That is, you'll learn how to go from a string like
string = "This is a perfectly normal sentence."
to a string like
string = "This is a perfectly normal sentence."
The best solution to remove extra spaces from a string
The best solution for this task, which is both readable and performant, uses the regex module re:
import re
def remove_extra_spaces(string):
return re.sub(" {2,}", " ", string)
The function sub can be used to substitute a pattern for a replacement you specify.
The pattern " {2,}" finds runs of 2 or more consecutive spaces and replaces them with a single space.
String splitting
Using the string method split can also be a good approach:
def remove_extra_spaces(string):
return " ".join(string.split(" "))
If you're using string splitting, you'll want to provide the space " " as an argument.
If you call split with no arguments, you'll be splitting on all whitespace, which is not what you want if you have newlines and other whitespace characters you should preserve.
This solution is great, except it doesn't work:
print(remove_extra_spaces(string))
# 'This is a perfectly normal sentence.'
The problem is that splitting on the space will produce a list with empty strings:
print(string.split(" "))
# ['This', 'is', '', 'a', '', '', 'perfectly', '', '', '', 'normal', '', '', '', '', 'sentence.']
These empty strings will be joined back together and you'll end up with the same string you started with. For this to work, you'll have to filter the empty strings first:
def remove_extra_spaces(string):
return " ".join(filter(None, string.split(" ")))
Using filter(None, ...) filters out the Falsy strings, so that the final joining operation only joins the strings that matter.
This solution has a problem, though, in that it will completely remove any leading or trailing whitespace, which may or may not be a problem.
The two solutions presented so far — using regular expressions and string splitting — are pretty reasonable. But they're also boring. You'll now learn about two other solutions.
Replacing spaces until you hit a fixed point
You can think about the task of removing extra spaces as the task of replacing extra spaces by the empty string.
And if you think about doing string replacements, you should think about the string method replace.
You can't do something like string.replace(" ", ""), otherwise you'd remove all spaces, so you have to be a bit more careful:
def remove_extra_spaces(string):
while True:
new_string = string.replace(" ", " ")
if new_string == string:
break
string = new_string
return string
You can replace two consecutive spaces by a single space, and you repeat this operation until nothing changes in your string.
The idea of running a function until its output doesn't change is common enough in maths that they call...
Real Python
Quiz: Automate Python Data Analysis With YData Profiling
In this quiz, you’ll test your understanding of Automate Python Data Analysis With YData Profiling.
By working through this quiz, you’ll revisit how to generate and display profile reports in a notebook, export reports to files, add column descriptions, and speed up profiling.
This quiz focuses on practical YData Profiling tasks such as rendering reports, comparing datasets, and preparing time series data. If you want a deeper walkthrough, review the tutorial linked above.
[ Improve Your Python With đ Python Tricks đ â Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]
Quiz: The pandas DataFrame: Make Working With Data Delightful
In this quiz, you’ll test your understanding of the pandas DataFrame.
By working through this quiz, you’ll review how to create pandas DataFrames, access and modify columns, insert and sort data, extract values as NumPy arrays, and how pandas handles missing data.
[ Improve Your Python With đ Python Tricks đ â Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]
Python Bytes
#471 The ORM pattern of 2026?
<strong>Topics covered in this episode:</strong><br> <ul> <li><strong><a href="https://mkennedy.codes/posts/raw-dc-the-orm-pattern-of-2026/?featured_on=pythonbytes">Raw+DC: The ORM pattern of 2026</a>?</strong></li> <li><strong><a href="https://github.com/okken/pytest-check/releases?featured_on=pythonbytes">pytest-check releases</a></strong></li> <li><strong><a href="https://dcw.ritviknag.com/en/latest/#">Dataclass Wizard</a></strong></li> <li><strong><a href="https://github.com/adamghill/sqliteo?featured_on=pythonbytes">SQLiteo</a> - ânative macOS SQLite browser built for normal peopleâ</strong></li> <li><strong>Extras</strong></li> <li><strong>Joke</strong></li> </ul><a href='https://www.youtube.com/watch?v=tZyf7KtTQVU' style='font-weight: bold;'data-umami-event="Livestream-Past" data-umami-event-episode="471">Watch on YouTube</a><br> <p><strong>About the show</strong></p> <p>Sponsored by us! Support our work through:</p> <ul> <li>Our <a href="https://training.talkpython.fm/?featured_on=pythonbytes"><strong>courses at Talk Python Training</strong></a></li> <li><a href="https://courses.pythontest.com/p/the-complete-pytest-course?featured_on=pythonbytes"><strong>The Complete pytest Course</strong></a></li> <li><a href="https://www.patreon.com/pythonbytes"><strong>Patreon Supporters</strong></a> <strong>Connect with the hosts</strong></li> <li>Michael: <a href="https://fosstodon.org/@mkennedy">@mkennedy@fosstodon.org</a> / <a href="https://bsky.app/profile/mkennedy.codes?featured_on=pythonbytes">@mkennedy.codes</a> (bsky)</li> <li>Brian: <a href="https://fosstodon.org/@brianokken">@brianokken@fosstodon.org</a> / <a href="https://bsky.app/profile/brianokken.bsky.social?featured_on=pythonbytes">@brianokken.bsky.social</a></li> <li>Show: <a href="https://fosstodon.org/@pythonbytes">@pythonbytes@fosstodon.org</a> / <a href="https://bsky.app/profile/pythonbytes.fm">@pythonbytes.fm</a> (bsky) Join us on YouTube at <a href="https://pythonbytes.fm/stream/live"><strong>pythonbytes.fm/live</strong></a> to be part of the audience. Usually <strong>Monday</strong> at 11am PT. Older video versions available there too. Finally, if you want an artisanal, hand-crafted digest of every week of the show notes in email form? Add your name and email to <a href="https://pythonbytes.fm/friends-of-the-show">our friends of the show list</a>, we'll never share it.</li> </ul> <p><strong>Michael #1: <a href="https://mkennedy.codes/posts/raw-dc-the-orm-pattern-of-2026/?featured_on=pythonbytes">Raw+DC: The ORM pattern of 2026</a>?</strong></p> <ul> <li>ORMs/ODMs provide great support and abstractions for developers</li> <li>They are not the <em>native</em> language of agentic AI</li> <li>Raw queries are trained 100x+ more than standard ORMs</li> <li>Using raw queries at the data access optimizes for AI coding</li> <li>Returning some sort of object mapped to the data optimizes for type safety and devs</li> </ul> <p><strong>Brian #2: <a href="https://github.com/okken/pytest-check/releases?featured_on=pythonbytes">pytest-check releases</a></strong></p> <ul> <li>3 merged pull requests</li> <li>8 closed issues</li> <li>at one point got to 0 PRâs and 1 enhancement request</li> <li>Now back to 2 issues and 1 PR, but activity means itâs still alive and being used. so cool</li> <li>Check out <a href="https://github.com/okken/pytest-check/blob/main/changelog.md?featured_on=pythonbytes">changelog</a> for all mods</li> <li>A lot of changes around supporting mypy <ul> <li>Iâve decided to NOT have the examples be fully <code>--strict</code> as I find it reduces readability <ul> <li>See <code>tox.ini</code> for explanation</li> </ul></li> <li>But src is <code>--strict</code> clean now, so user tests can be <code>--strict</code> clean.</li> </ul></li> </ul> <p><strong>Michael #3: <a href="https://dcw.ritviknag.com/en/latest/#">Dataclass Wizard</a></strong></p> <ul> <li><strong>Simple, elegant wizarding tools for Pythonâs</strong> <code>dataclasses</code>.</li> <li>Features <ul> <li>đ Fast â code-generated loaders and dumpers</li> <li>đȘ¶ Lightweight â pure Python, minimal dependencies</li> <li>đ§ Typed â powered by Python type hints</li> <li>đ§ Flexible â JSON, YAML, TOML, and environment variables</li> <li>đ§Ș Reliable â battle-tested with extensive test coverage</li> </ul></li> <li><a href="https://dcw.ritviknag.com/en/latest/#no-inheritance-needed">No Inheritance Needed</a></li> </ul> <p><strong>Brian #4: <a href="https://github.com/adamghill/sqliteo?featured_on=pythonbytes">SQLiteo</a> - ânative macOS SQLite browser built for normal peopleâ</strong></p> <ul> <li>Adam Hill</li> <li>This is a fun tool, built by someone I trust.</li> <li>That trust part is something Iâm thinking about a lot in these days of dev+agent built tools</li> <li>Some notes on my thoughts when evaluating <ul> <li>I know mac rules around installing .dmg files not from the apple store are picky. <ul> <li>And I like that</li> </ul></li> <li>But Iâm ok with the override when something comes from a dev I trust</li> <li>The contributors are all Adam <ul> <li>Iâm still not sure how I feel about letting agents do commits in repos</li> </ul></li> <li>Thereâs âAGENTSâ folder and markdown files in the project for agents, so Ad</li> </ul></li> </ul> <p><strong>Extras</strong></p> <p>Michael:</p> <ul> <li><a href="https://lp.jetbrains.com/python-unplugged/?featured_on=pythonbytes">PyTV Python Unplugged This Week</a></li> <li><a href="https://www.techbuzz.ai/articles/ibm-crashes-11-as-anthropic-threatens-cobol-empire?featured_on=pythonbytes">IBM Crashes 11% in 4 Hours</a> - $24 Billion Wiped Out After Anthropic's Claude Code Threatens the Entire COBOL Consulting Industry</li> <li>Loving my <a href="https://www.amazon.com/dp/B0FJYNVR3R?ref_=ppx_hzsearch_conn_dt_b_fed_asin_title_1&featured_on=pythonbytes">40â ultrawide monitor</a> more every day</li> <li><a href="https://updatest.app?featured_on=pythonbytes">Updatest</a> for updating all the mac things</li> <li><a href="https://www.reddit.com/r/macapps/comments/1qwkq38/os_thaw_a_fork_of_ice_menu_bar_manager_for_macos/?featured_on=pythonbytes">Ice has Thawed out</a> (mac menubar app)</li> </ul> <p><strong>Joke: <a href="https://x.com/pr0grammerhum0r/status/2018852032304566331?s=12&featured_on=pythonbytes">House is read-only</a>!</strong></p>
March 01, 2026
Tryton News
Tryton News March 2026
In the last month we focused on fixing bugs, improving the behaviour of things, speeding-up performance issues - building on the changes from our last release. We also added some new features which we would like to introduce to you in this newsletter.
For an in depth overview of the Tryton issues please take a look at our issue tracker or see the issues and merge requests filtered by label.
Changes for the User
Sales, Purchases and Projects
Now we add the web shop URL to sales.
We now add a menu entry for party identifier.
Accounting, Invoicing and Payments
Now we can search for shipments on the invoice line.
In UBL we now set BillingReference to Invoice and Credit Note.
We now improve the layout of the invoice credit form.
Now we enforce the Peppol rule â[BR-27]-The Item net price (BT-146) shall NOT be negativeâ. So we make sure that the unit price on an invoice line is not negative.
We now add an Update Status button to the Peppol document.
Stock, Production and Shipments
Now we can charge duties for UPS for buyer or seller on import and export defined in the incoterms.
User Interface
Now we allow to re-order tabs in Sao.
In the favourites menu we now display a message how to use it, instead of showing an empty menu.
We also improve the blank state of notifications by showing a message.
Now we add a button in Sao for closing the search filter.
System Data and Configuration
Now we update the required version of python-stdnum to version 2.22 and introduced new party identifier.
New Releases
We released bug fixes for the currently maintained long term support series
7.0 and 6.0, and for the penultimate series 7.8 and 7.6.
Security
Please update your systems to take care of a security related bug we found last month.Changes for the System Administrator
Now we also display the create date and create time in the error list.
We now add basic authentication for user applications. Because in some cases the consumer of the user application may not be able to use the bearer authentication.
Writing compatible HTML for email can be very difficult. MJML provides a syntax to ease the creation of such emails. So now we support the MJML email format in Tryton.
Changes for Implementers and Developers
We now add a timestamp field on ModelStorage for last modified.
Now we introduce a new type of field for SQL expressions: Field.sql_column(tables, Model)
We now allow UserError and UserWarning exceptions to be raised on evaluating button inputs.
Now we replace the extension separator by an underscore in report names used as temporary files.
We now do no longer check for missing parent depends when the One2Many is readonly.
Now we preserve the line numbers when converting doctest files to python files.
1 post - 1 participant
February 28, 2026
Talk Python to Me
#538: Python in Digital Humanities
Digital humanities sounds niche, until you realize it can mean a searchable archive of U.S. amendment proposals, Irish folklore, or pigment science in ancient art. Today Iâm talking with David Flood from Harvardâs DARTH team about an unglamorous problem: What happens when the grant ends but the website canât. His answer, static sites, client-side search, and sneaky Python. Letâs dive in.<br/> <br/> <strong>Episode sponsors</strong><br/> <br/> <a href='https://talkpython.fm/sentry'>Sentry Error Monitoring, Code talkpython26</a><br> <a href='https://talkpython.fm/commandbookapp'>Command Book</a><br> <a href='https://talkpython.fm/training'>Talk Python Courses</a><br/> <br/> <h2 class="links-heading mb-4">Links from the show</h2> <div><strong>Guest</strong><br/> <strong>David Flood</strong>: <a href="https://www.davidaflood.com?featured_on=talkpython" target="_blank" >davidaflood.com</a><br/> <br/> <strong>DARTH</strong>: <a href="https://digitalhumanities.fas.harvard.edu?featured_on=talkpython" target="_blank" >digitalhumanities.fas.harvard.edu</a><br/> <strong>Amendments Project</strong>: <a href="https://digitalhumanities.fas.harvard.edu/projects/amend/?featured_on=talkpython" target="_blank" >digitalhumanities.fas.harvard.edu</a><br/> <strong>Fionn Folklore Database</strong>: <a href="https://fionnfolklore.org/en?featured_on=talkpython" target="_blank" >fionnfolklore.org</a><br/> <strong>Mapping Color in History</strong>: <a href="https://iiif.harvard.edu/projects/mapping-color-in-history/?featured_on=talkpython" target="_blank" >iiif.harvard.edu</a><br/> <strong>Apatosaurus</strong>: <a href="https://apatosaurus.io/?featured_on=talkpython" target="_blank" >apatosaurus.io</a><br/> <strong>Criticus</strong>: <a href="https://github.com/d-flood/criticus?featured_on=talkpython" target="_blank" >github.com</a><br/> <strong>github.com/palewire/django-bakery</strong>: <a href="https://github.com/palewire/django-bakery?featured_on=talkpython" target="_blank" >github.com</a><br/> <strong>sigsim.acm.org/conf/pads/2026/blog/artifact-evaluation</strong>: <a href="https://sigsim.acm.org/conf/pads/2026/blog/artifact-evaluation/?featured_on=talkpython" target="_blank" >sigsim.acm.org</a><br/> <strong>Hugo</strong>: <a href="https://gohugo.io?featured_on=talkpython" target="_blank" >gohugo.io</a><br/> <strong>Water Stories</strong>: <a href="https://waterstories.fas.harvard.edu/?featured_on=talkpython" target="_blank" >waterstories.fas.harvard.edu</a><br/> <strong>Tsumeb Mine Notebook</strong>: <a href="https://tmn.fas.harvard.edu/?featured_on=talkpython" target="_blank" >tmn.fas.harvard.edu</a><br/> <strong>Dharma and Punya</strong>: <a href="https://dharmapunya2019.org/?featured_on=talkpython" target="_blank" >dharmapunya2019.org</a><br/> <strong>Pagefind library</strong>: <a href="https://pagefind.app?featured_on=talkpython" target="_blank" >pagefind.app</a><br/> <strong>django_webassembly</strong>: <a href="https://github.com/m-butterfield/django_webassembly?featured_on=talkpython" target="_blank" >github.com</a><br/> <strong>Astro Static Site Generator</strong>: <a href="https://astro.build?featured_on=talkpython" target="_blank" >astro.build</a><br/> <strong>PageFind Python Lib</strong>: <a href="https://pypi.org/project/pagefind/?featured_on=talkpython" target="_blank" >pypi.org</a><br/> <strong>Frozen-Flask</strong>: <a href="https://frozen-flask.readthedocs.io/en/latest/?featured_on=talkpython" target="_blank" >frozen-flask.readthedocs.io</a><br/> <br/> <strong>Watch this episode on YouTube</strong>: <a href="https://www.youtube.com/watch?v=ZaI2AxRq_OA" target="_blank" >youtube.com</a><br/> <strong>Episode #538 deep-dive</strong>: <a href="https://talkpython.fm/episodes/show/538/python-in-digital-humanities#takeaways-anchor" target="_blank" >talkpython.fm/538</a><br/> <strong>Episode transcripts</strong>: <a href="https://talkpython.fm/episodes/transcript/538/python-in-digital-humanities" target="_blank" >talkpython.fm</a><br/> <br/> <strong>Theme Song: Developer Rap</strong><br/> <strong>đ„ Served in a Flask đž</strong>: <a href="https://talkpython.fm/flasksong" target="_blank" >talkpython.fm/flasksong</a><br/> <br/> <strong>---== Don't be a stranger ==---</strong><br/> <strong>YouTube</strong>: <a href="https://talkpython.fm/youtube" target="_blank" ><i class="fa-brands fa-youtube"></i> youtube.com/@talkpython</a><br/> <br/> <strong>Bluesky</strong>: <a href="https://bsky.app/profile/talkpython.fm" target="_blank" >@talkpython.fm</a><br/> <strong>Mastodon</strong>: <a href="https://fosstodon.org/web/@talkpython" target="_blank" ><i class="fa-brands fa-mastodon"></i> @talkpython@fosstodon.org</a><br/> <strong>X.com</strong>: <a href="https://x.com/talkpython" target="_blank" ><i class="fa-brands fa-twitter"></i> @talkpython</a><br/> <br/> <strong>Michael on Bluesky</strong>: <a href="https://bsky.app/profile/mkennedy.codes?featured_on=talkpython" target="_blank" >@mkennedy.codes</a><br/> <strong>Michael on Mastodon</strong>: <a href="https://fosstodon.org/web/@mkennedy" target="_blank" ><i class="fa-brands fa-mastodon"></i> @mkennedy@fosstodon.org</a><br/> <strong>Michael on X.com</strong>: <a href="https://x.com/mkennedy?featured_on=talkpython" target="_blank" ><i class="fa-brands fa-twitter"></i> @mkennedy</a><br/></div>

Author:
Lucy Liu

