
Planet Python

Last update: August 20, 2025 10:42 AM UTC

August 19, 2025


PyCoder’s Weekly

Issue #695: Subinterpreters, Asyncio, Pytest, and More (Aug. 19, 2025)

#695 – AUGUST 19, 2025
View in Browser »



Subinterpreters and Asyncio

Subinterpreters are new and not yet well understood by the community. A library to abstract away some of their complexities is needed, and asyncio is one way to provide that.
JAMIE CHANG • Shared by Jamie Chang

Python’s asyncio: A Hands-on Walkthrough

Explore how Python asyncio works and when to use it. Follow hands-on examples to build efficient programs with coroutines and awaitable tasks.
REAL PYTHON

Starting With Pytest’s Parametrize

Pytest’s parametrize feature is powerful, but it can look scary. This step-by-step explanation can help you learn how to use it.
NED BATCHELDER

PEP 802: Display Syntax for the Empty Set (Added)

PYTHON.ORG

Python 3.13.7 Released

PYTHON.ORG

Python 3.14.0rc2 Released

CPYTHON DEV BLOG

Articles & Tutorials

PyPI Now Serves Project Status Markers in API Responses

Project status markers were defined in PEP 792. They are metadata about the state of a project in the repository. For example, if a project becomes archived or quarantined, this can now be expressed in the package data. PyPI has added this info to the API calls as well.
WILLIAM WOODRUFF

Transferring “UTF8.XYZ”

UTF8.XYZ is a simple web service to help you easily find, and copy/paste emoji and other Unicode characters. The service was created by Seth Larson, and it is now being maintained by Trey Hunner, both big names in the Python community.
SETH LARSON

Test & Code Final Episode

After 10 years and 237 episodes, Brian Okken has decided to stop recording Test & Code. He’ll still be contributing to Python Bytes. Here’s to all his work on a great podcast over the last decade.
BRIAN OKKEN podcast

Inside HRT’s Python Fork for Faster Imports

Hudson River Trading has forked Python for internal use to implement the rejected PEP 690, which proposed lazy imports. Lazy imports cause complications, but in their case they meant a large speedup.
NOAH KIM

Announcing the PSF Board Candidates for 2025

The Python Software Foundation elections are upon us, and this post announces this year’s candidates. Voting is September 2nd through the 16th. To vote, you must be registered by August 26th.
PYTHON SOFTWARE FOUNDATION

User Timezones in Django

If your Django site has users from across different timezones, you may need to give them the ability to choose times locally. This post steps you through how to do that in Django.
DJANGOTRICKS

How to Install Python CLI Tools Globally

Ubuntu 24.04 enforces PEP 668, blocking global pip installs. Learn how to use pipx, uv, and other modern methods to install Python CLI tools globally while staying compliant.
SAADMAN RAFAT • Shared by Saadman Rafat

How JIT Builds of CPython Actually Work

You don’t have to be a compiler engineer to understand how your code runs in a JIT build of CPython. This article runs you through just what happens under the covers.
SAVANNAH BAILEY

Working With Python’s .__dict__ Attribute

Explore Python’s .__dict__ attribute to manage class and instance attributes directly for more flexible, low-level control of your objects.
REAL PYTHON course

pyx: A Python-Native Package Registry, Now in Beta

The folks at Astral, who brought you uv and more, have created a new commercial Python-native package registry called “pyx”.
ASTRAL

Enjoy Better UX With Wagtail 7.1

From more collapsible blocks to better image sorting, version 7.1 has a lot of things users will love.
MEAGEN VOSS

The Unburdened Village

Navigating Python’s import system and namespace packages
VIVIS DEV • Shared by Vivis Dev

Projects & Code

trio: Library for Async Concurrency and I/O

GITHUB.COM/PYTHON-TRIO

flower: Real-Time Monitor and Web Admin for Celery

GITHUB.COM/MHER

localstack: Fully Functional Local AWS Cloud Stack

GITHUB.COM/LOCALSTACK

faststream: Event API for Kafka, RabbitMQ, NATS & Redis

GITHUB.COM/AG2AI

notata: Store Simulation Results in a Standardized Format

GITHUB.COM/ALONFNT

Events

AI Agent Showdown on Aug. 19 at 1pm ET

August 19, 2025 at 1PM ET
8THLIGHT.COM • Shared by Laura Stephens

Weekly Real Python Office Hours Q&A (Virtual)

August 20, 2025
REALPYTHON.COM

PyData Bristol Meetup

August 21, 2025
MEETUP.COM

PyLadies Dublin

August 21, 2025
PYLADIES.COM

PyCon Togo 2025

August 23 to August 24, 2025
PYTOGO.ORG

PyCon Poland 2025

August 28 to September 1, 2025
PYCON.ORG

PyCon Kenya 2025

August 28 to August 31, 2025
PYCON.KE

PyCon Greece 2025

August 29 to August 31, 2025
PYCON.GR


Happy Pythoning!
This was PyCoder’s Weekly Issue #695.
View in Browser »


[ Subscribe to 🐍 PyCoder’s Weekly 💌 – Get the best Python news, articles, and tutorials delivered to your inbox once a week >> Click here to learn more ]

August 19, 2025 07:30 PM UTC


Python Engineering at Microsoft

Announcing the Data Wrangler powered Notebook Results Table

If you have ever found yourself rewriting the last line of a notebook cell repeatedly just to get an overview of your data, you’re not alone. In VS Code, the default output for Pandas DataFrames is a static, truncated HTML table, and it often fails to answer essential questions about your data.

Check out how Data Wrangler integrates seamlessly with notebooks in VS Code to enable you to answer these questions quickly and easily, with just a few clicks.

Seamless integration with notebooks

The new experience seamlessly replaces the static HTML output for Pandas DataFrames, only where applicable, and without any additional actions. Just make sure the Data Wrangler extension is installed 😊

GIF demonstrating viewing the contents of a Pandas DataFrame directly from within the Notebooks output cell using the embedded Data Wrangler UI.

Column sorting and filtering

There is no need to write code for sorting and filtering. You can just click around the interactive UI as you explore the data.

GIF demonstrating sorting the host_name column and filtering the host_about column.

Missing (blank) and distinct values are auto detected

You can instantly know if a column contains missing (blank) values or repeating values you did not expect just by glancing at the column header.

Static image demonstrating column statistics about the columns' values.

Deep data insights at your fingertips

Access summaries, statistics, histograms, frequency, and more, all instantly and without leaving the context of your notebook cell.

Static image demonstrating the Data Summary panel which contains statistics about the selected column 'reviews_per_month'.

Full view mode + Copilot

With just one click you can jump into the full Data Wrangler experience for even more data cleaning operations and Copilot powered data cleaning. Going back to the notebook view is just one click away.

GIF demonstrating using Copilot within Data Wrangler to create a new column called 'price_clean' by removing currency signs and punctuation from the existing column 'price'.

Export as CSV or Parquet

Export your data as CSV or Parquet files for further analysis or to feed it into a pipeline.

GIF demonstrating aggregating the data by price and number of bedrooms and then exporting as csv.

Getting started today

To try out this experience today, make sure you have the free Data Wrangler extension for VS Code installed. Then, run any Pandas DataFrame in your Jupyter notebook inside VS Code, and watch as Data Wrangler immediately enhances the output with powerful, interactive features (running a cell with just your DataFrame df is enough to get started).
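For example, a starter cell might look like the one below (the file name and data are just placeholders; any Pandas DataFrame works):

import pandas as pd

# Load any dataset; the CSV name here is only an example
df = pd.read_csv("listings.csv")

# Ending the cell with the bare DataFrame is what produces the interactive output
df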

As we iterate to make Data Wrangler the best data exploration and preparation tool, we want to hear from you! If you have any feedback about this experience, please let us know in our GitHub repository.

Elevate your data science workflow and enjoy a more intuitive way to work with your data today!

The post Announcing the Data Wrangler powered Notebook Results Table appeared first on Microsoft for Python Developers Blog.

August 19, 2025 07:08 PM UTC


Ruslan Spivak

5 to 18: Why Your Count Might Be Off by One

Quick question:

How many numbers are there from 5 to 18, including both ends?

Your first instinct might be to subtract:

18 - 5 = 13

Feels right.

But it’s wrong.

It’s a small thing, and kind of basic, but this mistake got me more times than I’d like to admit. Eventually I learned how to count ranges properly. :)

Basic counting

Let’s start with something simpler.

How many numbers are in this list?

1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13

You’d probably say 13 without counting. And you’d be absolutely right.

That kind of range is easy. Our brain sees the pattern and knows what’s going on. We’ve been counting this way since we were little kids. But that instinct quietly fails in cases like 5 to 18.

So, how many numbers are there between 5 and 18 inclusive?

Let’s count manually

Here’s the full list from 5 to 18:

5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18

That’s 14 numbers, not 13. So what went wrong with our subtraction?

Try a quick trick

Let’s take the above list from 5 to 18 and turn it into a list that we know how to count by subtracting 4 from every number (I first saw this approach in David Patrick’s book Introduction to Counting & Probability, a great resource if you enjoy these kinds of problems):

1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14

Much easier to count! It has 14 items. Since we only shifted the numbers to start at 1 (without changing the count), the original list has 14 numbers too. Nice.

The formula

If you want to count how many numbers are in a list from a to b, inclusive, here’s the rule (given that both a and b are positive and b >= a):

count = b - a + 1

That +1 is the key.

So for our original example:

18 - 5 + 1 = 14

For the curious: formula derivation

Using the same trick, we subtract a - 1 from each number in the range from a to b. This transforms it into a list we can count easily, starting from 1:

1, 2, 3, ..., b - a + 1

Our new list has b - a + 1 numbers, so the original list has the same count.

The +1 rule is simple, but surprisingly easy to overlook. Here’s where it often sneaks in.

Where this trips people up

This tiny +1 mistake shows up everywhere: in code, in calendars, in everyday counting.

Miss it, and you might end up one short of (or one past) the count you actually wanted.

It’s such a common mistake, it even has a name: the off-by-one error. Software engineers run into it constantly, but it affects everyone.

Once you understand it, you’ll start spotting it everywhere. In code, in calendars, in life.
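Here is a tiny Python sketch of the same idea, just to make the +1 concrete. Note that range() excludes its upper bound, which is its own classic source of off-by-one surprises:

a, b = 5, 18

print(b - a)                       # 13 -- subtraction alone undercounts by one
print(b - a + 1)                   # 14 -- the inclusive-count formula
print(len(list(range(a, b + 1))))  # 14 -- range() stops before its end, so we pass b + 1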

P.S. If you feel extra adventurous, try the following exercises:

  1. How many numbers are in the range from 42 to 58, inclusive?
  2. Derive the formula b - a + 1 from scratch, no peeking
  3. How many numbers are in the list 6, 8, 10, 12, …, 128, 130? (Hint: What do you need to do before applying the formula b - a + 1?)


Stay tuned for more. And count carefully.

Originally published in my newsletter Beyond Basics. If you’d like to get future posts like this by email, you can subscribe here.

August 19, 2025 03:37 PM UTC


Real Python

Deep vs Shallow Copies in Python

When working with Python objects, you’ll often need to make copies rather than modify the originals. In this video course, you’ll explore various ways to copy objects in Python, including using the built-in copy module. You’ll also learn the key differences between shallow and deep copies, with practical examples so you can safely duplicate objects in your own code.
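As a quick preview of the difference, here is a minimal sketch using the copy module (the nested list is just an illustration):

import copy

original = [[1, 2], [3, 4]]

shallow = copy.copy(original)    # copies the outer list only; inner lists are shared
deep = copy.deepcopy(original)   # recursively copies the inner lists too

original[0].append(99)

print(shallow[0])  # [1, 2, 99] -- still shares the inner list with the original
print(deep[0])     # [1, 2]     -- unaffected by the change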

By the end of this video course, you’ll understand the key differences between shallow and deep copies and know when to reach for each one.


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

August 19, 2025 02:00 PM UTC


Talk Python to Me

#516: Accelerating Python Data Science at NVIDIA

Python’s data stack is getting a serious GPU turbo boost. In this episode, Ben Zaitlen from NVIDIA joins us to unpack RAPIDS, the open source toolkit that lets pandas, scikit-learn, Spark, Polars, and even NetworkX execute on GPUs. We trace the project’s origin and why NVIDIA built it in the open, then dig into the pieces that matter in practice: cuDF for DataFrames, cuML for ML, cuGraph for graphs, cuXfilter for dashboards, and friends like cuSpatial and cuSignal. We talk real speedups, how the pandas accelerator works without a rewrite, and what becomes possible when jobs that used to take hours finish in minutes. You’ll hear strategies for datasets bigger than GPU memory, scaling out with Dask or Ray, Spark acceleration, and the growing role of vector search with cuVS for AI workloads. If you know the CPU tools, this is your on-ramp to the same APIs at GPU speed.

Episode sponsors: Posit (talkpython.fm/workbench) and Talk Python Courses (talkpython.fm/training)

Links from the show:

RAPIDS: github.com/rapidsai
Example notebooks showing drop-in accelerators: github.com/rapidsai-community/showcase
Benjamin Zaitlen on LinkedIn: linkedin.com/in/benjamin-zaitlen-62ab7b4
RAPIDS Deployment Guide (Stable): docs.rapids.ai/deployment/stable
RAPIDS cuDF API Docs (Stable): docs.rapids.ai/api/cudf/stable
Asianometry YouTube video: youtube.com/watch?v=SOQ6F7HMfSc
cuDF pandas Accelerator (Stable): docs.rapids.ai/api/cudf/stable/cudf_pandas
Watch this episode on YouTube: youtube.com/watch?v=pl87nTWxqs8
Episode #516 deep-dive: talkpython.fm/episodes/show/516/accelerating-python-data-science-at-nvidia
Episode transcripts: talkpython.fm/episodes/transcript/516/accelerating-python-data-science-at-nvidia
Developer Rap Theme Song, Served in a Flask: talkpython.fm/flasksong

Stay in touch with us: subscribe to Talk Python on YouTube (talkpython.fm/youtube), follow Talk Python on Bluesky (@talkpython.fm) and Mastodon (@talkpython), and find Michael on Bluesky (@mkennedy.codes) and Mastodon (@mkennedy).

August 19, 2025 08:00 AM UTC


Seth Michael Larson

Extracting Genesis & Game Gear ROMs from SEGA GameCube collections

The GameCube library had multiple SEGA game collections: Sonic Mega Collection, Sonic Adventure DX, and Sonic Gems Collection which all contain ROM files for the Genesis and Game Gear (among others).

We'll extract the ROMs from each collection and then compare these GameCube collections to modern SEGA game collections like Sonic Origins Plus and the Genesis collection for Nintendo Switch Online.

Game Price Platforms ROM Prices
Sonic Mega Collection $10 Genesis $225
Sonic Adventure DX $34 Game Gear $334
Sonic Gems Collection $36 Genesis, Game Gear, Sega CD $368
Sonic Origins Plus $40 Genesis, Game Gear, Sega CD $431
Nintendo Switch Online+ $50/year Genesis (and others) $186

TLDR: If you're picking one of the above collections to buy for ROMs, the Sonic Mega Collection is my recommendation. Ten Genesis games with ROMs for $10 is an incredible deal, and it includes three of the best games in the Genesis library: Sonic the Hedgehog 2, Sonic 3 & Knuckles, and Ristar.

Sonic Origins Plus is a pretty good deal for the games in the collection. If you're deciding between Origins and Sonic Adventure DX, you're basically picking between Sonic 1, 2, 3, and CD versus the main story of Sonic Adventure DX. I'd probably pick Sonic Origins Plus over Sonic Adventure DX. If you do buy Sonic Origins Plus for ROMs, make sure you buy it from Steam.

Sonic Gems Collection is a strange one. Two-thirds of the value comes from two games, Sonic R and Tails' Skypatrol, which might be up your alley, but if not, then this collection isn't for you. The Sonic CD, Sonic R, and Sonic the Fighters files included aren't actual ROMs; instead, they are DOL files, so they need to be played on a GameCube emulator.

Genesis for Nintendo Switch Online isn't worth it right now, especially as a yearly subscription. The collection is sorely missing Sonic the Hedgehog 3 and Sonic & Knuckles. Over time this collection might actually be worth the money, but in all honesty, it's tough to beat the value of the Sonic Mega Collection.

Buying any of these collections is an affordable and quality starting point for a ROM collection for an emulator. Here's a breakdown of which games are in which collection and their prices:

Game Platform Price Mega SADX Gems Origins NSO+
Dr. Robotnik's Mean Bean Machine Genesis $18
Flicky Genesis $25
Ristar Genesis $75
Sonic & Knuckles Genesis $29 ✅ (1)
Sonic 3D Blast Genesis $12
Sonic Spinball Genesis $8
Sonic the Hedgehog Genesis $13
Sonic the Hedgehog 2 Genesis $10
Sonic the Hedgehog 3 Genesis $25 ✅ (1)
Vectorman Genesis $10 (2)
Vectorman 2 Genesis $12 (2)
Bonanza Bros (JP, PAL) Genesis $13 (3)
Streets of Rage Genesis $24 (3)
Streets of Rage 2 Genesis $24 (3)
Streets of Rage 3 Genesis $52 (3)
Sonic the Fighters Arcade ???
Sonic R Saturn $100
Sonic CD Sega CD $40
Dr. Robotnik's Mean Bean Machine Game Gear $16
Sonic Blast Game Gear $29
Sonic Chaos Game Gear $10
Sonic Drift (JP) Game Gear $11
Sonic Drift 2 Game Gear $21
Sonic Labyrinth Game Gear $28
Sonic Spinball Game Gear $11
Sonic Triple Trouble Game Gear $12
Sonic the Hedgehog Game Gear $14
Sonic the Hedgehog 2 Game Gear $6
Tails Adventure Game Gear $60
Tails' Skypatrol (JP) Game Gear $96
Totals $691 $225 $352 $314 $431 $186

NOTE (1): The soundtrack for Sonic the Hedgehog 3 (and thus, Sonic 3 & Knuckles) has been changed since the original release. I prefer the original soundtrack, so if you do too then know that it's not included with "Sonic Origins Plus".

NOTE (2): Japanese editions of Sonic Mega Collection also contain both Vectorman and Vectorman 2.

NOTE (3): Japanese editions of Sonic Gems Collection also contain Bonanza Bros and Streets of Rage 1, 2, and 3.

Sonic Mega Collection

Sonic Mega Collection contains the ROMs for 10 Genesis games that are all compressed using SEGA "PRS" compression. Unfortunately, there's no magic string for PRS compression; you just have to know that's what is being used.

Not only does Sonic Mega Collection contain ROMs from actual cartridges, the collection also includes the "ROMs" knu_p1u.md, knu_p2u.md, and knu_p3u.md, which are the result of combining Sonic & Knuckles with Sonic the Hedgehog 1, 2, and 3 using "LOCK-ON Technology".

After decompressing the ROMs you should end up with files with these checksums:

Filename Game Length MD5
drobotnk.md Dr. Robotnik's Mean Bean Machine 1048576 4D6BDAC51D2F5969A91496142EA53232
flicky.md Flicky 131072 805CC0B3724F041126A57A4D956FD251
knu_p1u.md Sonic & Knuckles + Sonic The Hedgehog 2621440 17FFE04BB891253E7AC3FB0952BB2EDB
knu_p2u.md Sonic & Knuckles + Sonic The Hedgehog 2 3407872 3E5E4B18D035775B916A06F2B3DC5031
knu_p3u.md Sonic & Knuckles + Sonic The Hedgehog 3 4194304 C5B1C655C19F462ADE0AC4E17A844D10
ristaru.md Ristar 2097152 62E40B8C8012D02DF4FAC1C68F10EB16
s_spnu.md Sonic Spinball 1048576 841E347B30A6E298EE2B0C722F19FE74
sonic1j.md Sonic The Hedgehog (Japan, Europe, Korea) 524288 09DADB5071EB35050067A32462E39C5F
sonic1jb.md Sonic The Hedgehog (USA, Europe) 524288 C6C15AEA60BDA10AE11C6BC375296153
sonic1u.md Sonic The Hedgehog (World) 524288 1BC674BE034E43C96B86487AC69D9293
sonic2jb.md Sonic The Hedgehog 2 1048576 2A4CC74873D3117629E454666E39E654
sonic3d.md Sonic 3D Blast 4194304 50ACBEA2461D179B2BF11460A1CC6409
sonic3u.md Sonic The Hedgehog 3 2097152 D724EA4DD417FE330C9DCFD955C596B2
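If you want to sanity-check your own extracted files against the table above, a short Python script along these lines works. The dictionary only lists two entries as an example (fill in the rest from the table), and the "roms" directory name is an assumption; point it wherever your decompressed files actually live:

import hashlib
from pathlib import Path

# Expected MD5 checksums from the table above (partial, as an example)
EXPECTED = {
    "flicky.md": "805CC0B3724F041126A57A4D956FD251",
    "ristaru.md": "62E40B8C8012D02DF4FAC1C68F10EB16",
}

rom_dir = Path("roms")  # directory containing the decompressed ROMs

for name, expected_md5 in EXPECTED.items():
    data = (rom_dir / name).read_bytes()
    actual_md5 = hashlib.md5(data).hexdigest().upper()
    status = "OK" if actual_md5 == expected_md5 else "MISMATCH"
    print(f"{name}: {status} ({len(data)} bytes)")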

Sonic Adventure DX

Sonic Adventure DX contains the ROMs for 12 Game Gear games. Extracting them is similar to Sonic Mega Collection, but the PRS compression has changed slightly so you need a different decompression script.

Filename Game Length MD5
g-sonic.gg Sonic Blast 1048576 56E1561D981A9A7C240CDD8D5580CCF0
labylin.gg Sonic Labyrinth 524288 AA6598585B2CDB92A14476BEEB34991B
mbmachin.gg Dr. Robotnik's Mean Bean Machine 262144 6565DDCB2E41BF9CE5771A62D30AA700
s-drift2.gg Sonic Drift 2 524288 B39D1E9A40DFD3508EE4003CD28DA452
s-tail2.gg Sonic The Hedgehog - Triple Trouble 524288 F0F7E4DFE2908F0030E64FD9ED843422
skypat.gg Tails' Skypatrol (Japan) 262144 FFB364BBAF72881CF7571E7EC388490D
sonic-ch.gg Sonic Chaos 524288 895CA34F733C445341E5CA1642527690
sonic.gg Sonic The Hedgehog 262144 B1DE7027824C434CE8DE59782705F5C9
sonic2.gg Sonic The Hedgehog 2 524288 9C64846563D8B9A24400471322E53FB5
sonic_tt.gg Sonic & Tails 2 (Japan) 524288 05F5B6201CCECE80FACAF99FA7CF5A6E
sonicdri.gg Sonic Drift (Japan) 524288 8093EF0EEB147F2A938FAE3E5A26D8B3
sonictai.gg Sonic & Tails (Japan) 524288 AAC8371D2179295159C73AE2CB92892D
spinball.gg Sonic Spinball 524288 8E3B44FDA375AD9748B678E9C6B45502
tailsadv.gg Tails Adventure 524288 A8BDB1BEED088FF83C725C5AF6B85E1F

Sonic Gems Collection

Unfortunately, the ROMs in the Sonic Gems Collection aren't simply compressed; they are obfuscated using a custom routine. The most interesting titles that distinguish Sonic Gems Collection from the others (Sonic CD and Sonic R) aren't stored as ROMs for their respective systems. Instead, these games appear to be compiled as DOL files, meaning they can't be played using a typical Sega CD or Sega Saturn emulator.

There appear to be many more ROMs than are playable within the game itself, perhaps hinting that this collection is worth more than it seems at first glance?



Thanks for keeping RSS alive! ♥

August 19, 2025 12:00 AM UTC

August 18, 2025


Python Software Foundation

The 2024 Python Developer Survey Results are here!

We are excited to share the results of the eighth official annual Python Developers Survey. This survey is done yearly as a collaborative effort between the Python Software Foundation and JetBrains. Check out Michael Kennedy's The State of Python 2025 blog post, on the JetBrains blog, which reflects his personal analysis and opinions on the data. Responses were collected in October and November of 2024. More than 30,000 Python developers and enthusiasts from almost 200 countries and regions participated in the survey to reveal the current state of the language and the ecosystem around it.

Check out the survey results!

The survey aims to map the Python landscape and covers a wide range of topics.

We encourage you to check out the methodology and the raw data for this year's Python Developers Survey, as well as those from past years (2023, 2022, 2021, 2020, 2019, 2018, and 2017). We would love to hear about what you learn by digging into the numbers! Share your results and comments with us on social media by mentioning JetBrains (LinkedIn, X) and the PSF (Mastodon, LinkedIn, X) using the #pythondevsurvey hashtag. Based on the feedback we received last year, we made adjustments to the 2024 survey, so we welcome suggestions and feedback that could help us improve again for next year!

August 18, 2025 04:51 PM UTC


PyCharm

This is a guest post from Michael Kennedy, the founder of Talk Python and a PSF Fellow.

State of Python 2025

Welcome to the highlights, trends, and key actions from the eighth annual Python Developers Survey. This survey is conducted as a collaborative effort between the Python Software Foundation and JetBrains’ PyCharm team.

My name is Michael Kennedy, and I’ve analyzed the more than 30,000 responses to the survey, pulled out the most significant trends and predictions, and identified various actions that you can take to improve your Python career.

I am in a unique position as the host of the Talk Python to Me podcast. Every week for the past 10 years, I’ve interviewed the people behind some of the most important libraries and language trends in the Python ecosystem. In this article, my goal is to use that larger community experience to understand the results of this important yearly survey.

If your job or products and services depend on Python, or developers more broadly, you’ll want to read this article. It provides a lot of insight that is difficult to gain from other sources.

Key Python trends in 2025

Let’s dive into the most important trends based on the Python survey results. 


As you explore these insights, having the right tools for your projects can make all the difference. Try PyCharm for free and stay equipped with everything you need for data science, ML/AI workflows, and web development in one powerful Python IDE.

Python people use Python

Let’s begin by talking about how central Python is for people who use it. Python people use Python primarily. That might sound like an obvious tautology. However, developers use many languages that are not their primary language. For example, web developers might use Python, C#, or Java primarily, but they also use CSS, HTML, and even JavaScript.

On the other hand, developers who work primarily with Node.js or Deno may also use Python, but not as their primary language.

The survey shows that 86% of respondents use Python as their main language for writing computer programs, building applications, creating APIs, and more.

Python usage stats

We are mostly brand-new programmers

For those of us who have been programming for a long time – I include myself in this category, having written code for almost 30 years now – it’s easy to imagine that most people in the industry have a decent amount of experience. It’s a perfectly reasonable assumption. You go to conferences and talk with folks who have been doing programming for 10 or 20 years. You look at your colleagues, and many of them have been using Python and programming for a long time.

But that is not how the broader Python ecosystem looks.

Exactly 50% of respondents have less than two years of professional coding experience! And 39% have less than two years of experience with Python (even in hobbyist or educational settings).

Python demographic stats

This result reaffirms that Python is a great language for those early in their career. The simple (but not simplistic) syntax and approachability really speak to newer programmers as well as seasoned ones. Many of us love programming and Python and are happy to share it with our newer community members.

However, it suggests that we consider these demographics when we create content for the community. If you create a tutorial or video demonstration, don’t skimp on the steps to help people get started. For example, don’t just tell them to install the package. Tell them that they need to create a virtual environment, and show them how to do so and how to activate it. Guide them on installing the package into that virtual environment.

If you’re a tool vendor such as JetBrains, you’ll certainly want to keep in mind that many of your users will be quite new to programming and to Python itself. That doesn’t mean you should ignore advanced features or dumb down your products, but don’t make it hard for beginners to adopt them either.

Data science is now over half of all Python

This year, 51% of all surveyed Python developers are involved in data exploration and processing, with pandas and NumPy being the tools most commonly used for this.

Many of us in the Python pundit space have talked about Python as being divided into thirds: One-third web development, one-third data science and pure science, and one-third as a catch-all bin.

We need to rethink that positioning now that one of those thirds is overwhelmingly the most significant portion of Python.

This is also in the context of not only a massive boom in interest in data and AI right now, but also a corresponding explosion in the development of tools in this space. There are data processing tools like Polars, new ways of working with notebooks like Marimo, and a huge number of user-friendly packages for working with LLMs, vision models, and agents (e.g. Transformers, Diffusers, smolagents, LangChain/LangGraph, LlamaIndex).

Python’s center of gravity has indeed tilted further toward data/AI.

Most still use older Python versions despite benefits of newer releases 

The survey shows a distribution across the latest and older versions of the Python runtime. Many of us (15%) are running on the very latest released version of Python, but more likely than not, we’re using a version a year old or older (83%).

Python version usage stats

The survey also indicates that many of us are using Docker and containers to execute our code, which makes this 83% or higher number even more surprising. With containers, just pick the latest version of Python in the container. Since everything is isolated, you don’t need to worry about its interactions with the rest of the system, for example, Linux’s system Python. We should expect containerization to provide more flexibility and ease our transition towards the latest version of Python.

So why haven’t people updated to the latest version of Python? The survey results give two primary reasons.

  1. The version I’m using meets all my needs (53%)
  2. I haven’t had the time to update (25%)

The 83% of developers running on older versions of Python may be missing out on much more than they realize. It’s not just that they are missing some language features, such as the except* syntax for exception groups, or a minor improvement to the standard library, such as tomllib. Python 3.11, 3.12, and 3.13 all include major performance benefits, and the upcoming 3.14 will include even more.

What’s amazing is you get these benefits without changing your code. You simply choose a newer runtime, and your code runs faster. CPython has been extremely good at backward compatibility. There’s rarely significant effort involved in upgrading. Let’s look at some numbers.

48% of people are currently using Python 3.11. Upgrading to 3.13 will make their code run ~11% faster end to end while using ~10-15% less memory.

If they are one of the 27% still on 3.10 or older, their code gets a whopping ~42% speed increase (with no code changes), and memory use can drop by ~20-30%!

So maybe they’ll still come back to “Well, it’s fast enough for us. We don’t have that much traffic, etc.” But if they are like most medium to large businesses, this is an incredible waste of cloud compute expense (which also maps to environmental harm via spent energy).

Research gives some rough estimates for annual cloud compute spend (specifically compute-based) at medium-sized and very large businesses. If we assume they’re running Python 3.10, that’s potentially $420,000 and $5.6M in savings, respectively (computed as 30% of the EC2 cost).

If your company realizes you are burning an extra $0.4M-$5M a year because you haven’t gotten around to spending the day it takes to upgrade, that’ll be a tough conversation.

Finances and environment aside, it’s really great to be able to embrace the latest language features and be in lock-step with the core devs’ significant work. Make upgrading a priority, folks.

Python web devs resurgence

For the past few years, we’ve heard that the significance of web development within the Python space is decreasing. Two powerful forces could be at play here: 1) As more data science and AI-focused people come to Python, the relatively static number of web devs represents a lower percentage, and 2) The web continues to be frontend-focused, and until Python in the browser becomes a working reality, web developers are likely to prefer JavaScript.

Looking at the numbers from 2021–2023, the trend is clearly downward: 45% → 43% → 42%. But this year, the web is back! Respondents reported that 46% of them are using Python for web development in 2024. To bolster this hypothesis further, we saw web “secondary” languages jump correspondingly, with HTML/CSS usage up 15%, JavaScript usage up 14%, and SQL usage up 16%.

Python use cases stats

The biggest winner among the Python web frameworks was FastAPI, which jumped from 29% to 38% (a 30% relative increase). While all of the major frameworks grew year over year, FastAPI’s nearly 30% jump is impressive. I can only speculate why this is, but I think this jump in Python for the web is likely partially due to the large number of newcomers to the Python space. Many of these are on the ML/AI/data science side of things, and those folks often don’t have years of baked-in experience and history with Flask or Django. They are likely choosing the hottest of the Python web frameworks, which today looks like it’s FastAPI. There are many examples of people hosting their ML models behind FastAPI APIs.

Python web framework usage

The trend towards async-friendly Python web frameworks has been continuing as well. Over at Talk Python, I rewrote our Python web app in async Flask (roughly 10,000 lines of Python). Django has been steadily adding async features and is nearly complete. Though today, at version 5.2, its DB layer needs a bit more work, as the team says: “We’re still working on async support for the ORM and other parts of Django.”

Python web servers shift toward async and Rust-based tools

It’s worth a brief mention that the production app servers hosting Python web apps and APIs are changing too. Anecdotally, I see two forces at play here: 1) The move to async frameworks necessitates app servers that support ASGI, not just WSGI and 2) Rust is becoming more and more central to the fast execution of Python code (we’ll dive into that shortly).

The biggest loss in this space last year was the complete demise of uWSGI. We even did a Python Bytes podcast entitled We Must Replace uWSGI With Something Else examining this situation in detail. 

We also saw Gunicorn handling less of the async workload, as async-native servers such as uvicorn and Hypercorn are able to operate independently. Newcomer servers based on Rust, such as Granian, have gained a solid following as well.

Rust is how we speed up Python now

Over the past couple of years, Rust has become Python’s performance co-pilot. The Python Language Summit of 2025 revealed that “Somewhere between one-quarter and one-third of all native code being uploaded to PyPI for new projects uses Rust”, indicating that “people are choosing to start new projects using Rust”.

Looking into the survey results, we see that Rust usage grew from 27% to 33% for binary extensions to Python packages.  

Languages for building binary modules for Python stats

We see this in the ecosystem with the success of Polars for data science and Pydantic for pretty much all disciplines. We are even seeing it with Python app servers such as the newer Granian.

Typed Python is getting better tooling

Another key trend this year is typed Python. You’ve probably seen Python type information in function definitions such as: 

def add(x: int, y: int) -> int: ... 

These have been in Python for a while now. Yet there is a renewed effort to make typed Python more common and more forgiving. We’ve had tools such as mypy since typing’s early days, but the goal there was more along the lines of whole-program consistency. In just the past few months, we have seen two new high-performance typing tools released: ty from Astral and Pyrefly from Meta.

They are both vying to be the next-generation tooling for type checking. Moreover, both of these tools provide extremely fast language server protocols (LSPs).

Notice anything similar? They are both written in Rust, backing up the previous claim that “Rust has become Python’s performance co-pilot”.

By the way, I interviewed the team behind ty when it was announced a few weeks ago if you want to dive deeper into that project.
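To see the kind of thing these checkers catch, here is a small, hypothetical example; a type checker (mypy, ty, or similar) flags the second call even though it would technically run:

def add(x: int, y: int) -> int:
    return x + y

add(2, 3)      # fine
add("2", "3")  # reported as an error by the type checker:
               # the arguments are str, but the annotations say int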

Code and docs make up most open-source contributions

There are many different and unique ways to contribute to open source. Probably the first thing that comes to most people’s minds when they think of a contributor is someone who writes code and adds a new feature to that project. However, there are less visible but still important ways to make a contribution, such as triaging issues and reviewing pull requests.

So, what portion of the community has contributed to open source, and in which ways have they done so?

The survey tells us that one-third of devs contributed to open source. This manifests primarily as code and documentation/tutorial additions.

Python open source contribution stats

Docs are #1

Where do you typically learn as a developer or data scientist? Respondents said that docs are #1. There are many ways to learn languages and libraries, but people like docs best. This is good news for open-source maintainers. This means that the effort put into documentation (and embedded tutorials) is well spent. It’s a clear and straightforward way to improve users’ experience with your project.

Moreover, this lines up with Developer Trends in 2025, a podcast panel episode I did with experienced Python developers, including JetBrains’ own Paul Everitt. The panelists all agree that docs are #1, though the survey ranked YouTube much higher than the panelists, at 51%. Remember, our community has an average of 1–2 years of experience, and 45% of them are younger than 30 years old.

A powerful newcomer is the suite of AI tools we all have been hearing about (and many of us using). AI tools as a learning source increased from 19% to 27% (up 42% year over year)!

Postgres reigns as the database king for Pythonistas

When asked which database (if any) they use, respondents overwhelmingly said PostgreSQL. PostgreSQL is the king of Python databases, and it’s only growing, going from 43% to 49%. That’s +14% year over year, which is remarkable for a 28-year-old open-source project.

Databases used by Python developers stats

One interesting detail here, beyond Postgres being used a lot, is that every single database in the top six grew in usage year over year. This is likely another indicator that web development itself is growing again, as discussed above.

Forward-looking trends

Agentic AI will be wild

My first forward-looking trend is that agentic AI will be a game-changer for coding. Agentic AI is often cited as the tool behind the much-maligned and much-loved practice of vibe coding. However, vibe coding obscures the fact that agentic AI tools are remarkably productive when used alongside a talented engineer or data scientist.

Surveys outside the PSF survey indicate that about 70% of developers were using or planning to use AI coding tools in 2023, and by 2024, around 44% of professional developers use them daily.

JetBrains’ State of Developer Ecosystem 2023 report noted that within a couple of years, “AI-based code generation tools went from interesting research to an important part of many developers’ toolboxes”. Jump ahead to 2025, according to the State of Developer Ecosystem 2025 survey, nearly half of the respondents (49%) plan to try AI coding agents in the coming year.

AI coding agents usage stats

Program managers at major tech companies have stated that they almost cannot hire developers who don’t embrace agentic AI. The productive delta between those using it and those who avoid it is simply too great (estimated at about 30% greater productivity with AI).

Async, await, and threading are becoming core to Python

The future will be abuzz with concurrency and Python. We’ve already discussed how the Python web frameworks and app servers are all moving towards asynchronous execution, but this only represents one part of a powerful trend.

Python 3.14 will be the first version of Python to completely support free-threaded Python. Free-threaded Python, which is a version of the Python runtime that does not use the GIL, the global interpreter lock, was first added as an experiment to CPython 3.13.

Just last week, the steering council and core developers officially accepted this as a permanent part of the language and runtime. This will have far-reaching effects. Developers and data scientists will have to think more carefully about threaded code with locks, race conditions, and the performance benefits that come with it. Package maintainers, especially those with native code extensions, may have to rewrite some of their code to support free-threaded Python so they themselves do not enter race conditions and deadlocks.

There is a massive upside to this as well. I’m currently writing this on the cheapest Apple Mac Mini M4. This computer comes with 10 CPU cores. That means until this change manifests in Python, the maximum performance I can get out of a single Python process is 10% of what my machine is actually capable of. Once free-threaded Python is fully part of the ecosystem, I should get much closer to maximum capacity with a standard Python program using threading and the async and await keywords.
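As a rough sketch of what that unlocks, here is CPU-bound work fanned out across a thread pool. On a GIL build the threads mostly take turns; on a free-threaded build they can genuinely run on separate cores. The worker function and numbers are purely illustrative:

from concurrent.futures import ThreadPoolExecutor

def busy_work(n: int) -> int:
    # Pure-Python, CPU-bound loop: exactly the kind of work the GIL serializes today
    total = 0
    for i in range(n):
        total += i * i
    return total

with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(busy_work, [5_000_000] * 10))

print(sum(results))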

Async and await keywords are not just tools for web developers who want to write more concurrent code.  It’s appearing in more and more locations. One such tool that I recently came across is Temporal. This program leverages the asyncio event loop but replaces the standard clever threading tricks with durable machine-spanning execution. You might simply await some action, and behind the scenes, you get durable execution that survives machine restarts. So understanding async and await is going to be increasingly important as more tools make interesting use of it, as Temporal did.

I see parallels here of how Pydantic made a lot of people more interested in Python typing than they otherwise would have been.

Python GUIs and mobile are rising

My last forward-looking trend is that Python GUIs and Python on mobile are rising. When we think of native apps on iOS and Android, we can only dream of using Python to build them someday soon.

At the 2025 Python Language Summit, Russell Keith-Magee presented his work on making iOS and Android Tier 3-supported platforms for CPython. This has been laid out in PEP 730 and PEP 738. This is a necessary but not sufficient condition for allowing us to write true native apps that ship to the app stores using Python.

More generally, there have been some interesting ideas and new takes on UIs for Python. We had Jeremy Howard from fast.ai introduce FastHTML, which allows us to write modern web applications in pure Python. NiceGUI has been coming on strong as an excellent way to write web apps and PWAs in pure Python.

I expect these changes, especially the mobile ones, to unlock powerful use cases that we’ll be talking about for years to come.

Actionable ideas

You’ve seen the results, my interpretations, and predictions. So what should you do about them? Of course, nothing is required of you, but I am closing out this article with some actionable ideas to help you take advantage of these technological and open-source waves.

Here are six actionable ideas you can put into practice after reading this article. Pick your favorite one that you’re not yet leveraging and see if it can help you thrive further in the Python space.

Action 1: Learn uv

uv, the package and Python management tool, jumped from 0% to 11% in the year it was introduced (and that growth has demonstrably continued to surge in 2025). This Rust-based tool unifies capabilities from many of the most important tools you may have previously heard of, and does so with incredible performance and features.

Do you need Python on the machine? Simply run uv venv venv, and you have both installed the latest stable release and created a virtual environment. That’s just the beginning. If you want the full story, I did an interview with Charlie Marsh about the second generation of uv over on Talk Python.

If you decide to install uv, be sure to use their standalone installers. It allows uv to manage itself and get better over time.

Action 2: Use the latest Python

We saw that 83% of respondents are not using the latest version of Python. Don’t be one of them. Use a virtual environment or use a container and install the latest version of Python. The quickest and easiest way these days is to use uv, as it won’t affect system Python and other configurations (see action 1!).

If you deploy or develop in Docker containers, all you need to do is set up the latest version of Python 3.13 and run these two lines:

RUN curl -LsSf https://astral.sh/uv/install.sh | sh
RUN uv venv --python 3.13 /venv

If you develop locally in virtual environments (as I do), just remove the RUN keyword and use uv to create that environment. Of course, update the version number as new major versions of Python are released.

By taking this action, you will be able to take advantage of the full potential of modern Python, from the performance benefits to the language features.

Action 3: Learn agentic AI

If you’re one of the people who have not yet tried agentic AI, you owe it to yourself to give it a look. I understand why people avoid using AI and LLMs. For one thing, there’s dubious legality around copyrights. The environmental harms can be real, and the threat to developers’ jobs and autonomy is not to be overlooked. But using top-tier models for agentic AI, not just chatbots, allows you to be tremendously productive.

I’m not recommending vibe coding. But have you ever wished for a library or package to exist, or maybe a CLI tool to automate some simple part of your job? Give that task to an agentic AI, and you can have it built without taking on technical debt in your main application or giving up a big part of your day. Your productivity just got way better.

The other mistake people make here is to give it a try using the cheapest free models. When they don’t work that great, people hold that up as evidence and say, “See, it’s not that helpful. It just makes up stuff and gets things wrong.” Make sure you choose the best possible model that you can, and if you want to give it a genuine look, spend $10 or $20 for a month to see what’s actually possible.

JetBrains recently released Junie, an agentic coding assistant for their IDEs. If you’re using one of them, definitely give it a look.

Action 4: Learn to read basic Rust

Python developers should consider learning the basics of Rust, not to replace Python, but to complement it. As I discussed in our analysis, Rust is becoming increasingly important in the most significant portions of the Python ecosystem. I definitely don’t recommend that you become a Rust developer instead of a Pythonista, but being able to read basic Rust so that you understand what the libraries you’re consuming are doing will be a good skill to have.

Action 5: Invest in understanding threading

Python developers have worked mainly outside the realm of threading and parallel programming. In Python 3.5, the amazing async and await keywords were added to the language. However, they only applied to I/O-bound concurrency. For example, if I’m calling a web service, I might use the HTTPX library and await that call. This type of concurrency mostly avoids race conditions and that sort of thing.
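A minimal version of that I/O-bound pattern looks something like this (the URL is a placeholder):

import asyncio
import httpx

async def fetch(url: str) -> int:
    async with httpx.AsyncClient() as client:
        response = await client.get(url)  # the await yields control while waiting on the network
        return response.status_code

async def main() -> None:
    # Awaiting several requests concurrently is where asyncio shines for I/O
    codes = await asyncio.gather(*(fetch("https://example.com") for _ in range(3)))
    print(codes)

asyncio.run(main())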

Now, true parallel threading is coming for Python. With PEP 703 officially and fully accepted as part of Python in 3.14, we’ll need to understand how true threading works. This will involve understanding locks, semaphores, and mutexes.

It’s going to be a challenge, but it is also a great opportunity to dramatically increase Python’s performance.

At the 2025 Python Language Summit, almost one-third of the talks dealt with concurrency and threading in one form or another. This is certainly a forward-looking indicator of what’s to come.

Not every program you write will involve concurrency or threading, but they will be omnipresent enough that having a working understanding will be important. I have a course I wrote about async in Python if you’re interested in learning more about that topic. Plus, JetBrains’ own Cheuk Ting Ho wrote an excellent article entitled Faster Python: Concurrency in async/await and threading, which is worth a read.

Action 6: Remember the newbies

My final action to you is to keep things accessible for beginners – every time you build or share. Half of the Python developer base has been using Python for less than two years, and most of them have been programming in any form for less than two years. That is still remarkable to me.

So, as you go out into the world to speak, write, or create packages, libraries, and tools, remember that you should not assume years of communal knowledge about working with multiple Python files, virtual environments, pinning dependencies, and much more.

Interested in learning more? Check out the full Python Developers Survey Results here.

Start developing with PyCharm

PyCharm provides everything you need for data science, ML/AI workflows, and web development right out of the box – all in one powerful IDE.

About the author

Michael Kennedy

Michael is the founder of Talk Python and a PSF Fellow. Talk Python is a podcast and course platform that has been exploring the Python ecosystem for over 10 years. At his core, Michael is a web and API developer.

August 18, 2025 04:41 PM UTC


Real Python

Single and Double Underscores in Python Names

Python has a few naming conventions that are based on using either a single or double underscore character (_). These conventions allow you to differentiate between public and non-public names in APIs, write subclasses safely, prevent name collisions, and more.

Following these conventions makes your code look more Pythonic and consistent to other developers. This skill is especially helpful when you’re working on collaborative projects.

By the end of this tutorial, you’ll understand that:

  • Underscores in Python names indicate intent: a single leading underscore signals a non-public name, a single trailing underscore helps avoid naming conflicts, and a double leading underscore triggers name mangling for class attributes and methods.
  • Python doesn’t enforce public or private names with access restrictions. It relies on naming conventions, where public names have no underscores and non-public names start with a single underscore.
  • Python’s name mangling automatically renames attributes or methods with double leading underscores by prefixing them with the class name, helping you avoid accidental overrides in subclasses.
  • Double leading and trailing underscores—known as dunders—denote special methods or attributes, such as .__init__(), .__len__(), and __name__, which Python uses to support internal behaviors.

You’ll explore practical examples of these naming conventions, learn when and why to use each one, and understand their effects on code readability, API design, and inheritance.

Get Your Code: Click here to download the free sample code that shows you how to use single and double underscores in Python names.

Take the Quiz: Test your knowledge with our interactive “Single and Double Underscores in Python Names” quiz. You’ll receive a score upon completion to help you track your learning progress:


Interactive Quiz

Single and Double Underscores in Python Names

In this quiz, you'll test your understanding of the use of single and double underscores in Python names. This knowledge will help you differentiate between public and non-public names, avoid name clashes, and write code that looks Pythonic and consistent.

Public Interfaces and Naming Conventions in Python

As a Python programmer, you’ll frequently work with public interfaces, known as application programming interfaces (APIs). An API is a type of programming interface that offers a service to other parts of a program or other programs.

For example, the Python standard library has many modules and packages that provide certain services. To use these modules and packages, you need to access their public components, such as classes, functions, variables, constants, and modules. All these objects are part of the module or package’s public interface. They’re available for you to use directly in your code.

However, many of these packages and modules define objects that aren’t intended for direct access. These objects are meant for internal use within the specific package or module and aren’t part of its public interface.

In the context of object-oriented programming, languages like C++ and Java have the notion of public and private methods and attributes—jointly called members. In these languages, you can use these types of class members as follows:

  • Public: You can use them in your own code or client code.
  • Private: You can use them only from inside the defining class and its subclasses.

These languages have specific keywords and syntax to define public and private members in their classes. Once you declare a member as private, you can’t use it outside the class because the language restricts access. So, private members aren’t part of the class’s public interface, and there’s no way to access them.

In contrast, Python doesn’t have the notion of public and private members. It has neither dedicated keywords nor syntax for defining them. Therefore, you can always access the members of a Python class.

If Python doesn’t have a specific syntax to define when an object is part of a public interface, then how do you tell your users that they can or can’t use a given class, method, function, variable, constant, or even module in their code?

To approach this question, the Python community has a well-established naming convention:

If a name starts with a letter in uppercase or lowercase, then you should consider that name public and, therefore, part of the code’s API. In contrast, if a name starts with an underscore character (_), then you should consider that name non-public, meaning it’s not a part of the public API.

You should observe these naming conventions to explicitly indicate whether other developers should directly use your variables, constants, functions, methods, and modules in external code.

Note: This naming convention doesn’t restrict access to objects. It only signals to other developers how the code is intended to be used. Because of this, Python programmers avoid the terms public and private. Instead, they distinguish between public and non-public (internal) names.

The Python community uses the underscore character (_) as part of other naming conventions. Here’s a summary of what PEP 8 says about using this character in names:

Convention | Example | Meaning
Single leading underscore | _variable | Indicates that the name is meant for internal use only
Single trailing underscore | class_ | Avoids naming conflicts with Python keywords and built-in names
Double leading underscore | __attribute | Triggers name mangling in the context of Python classes
Double leading and trailing underscores | __name__ | Indicates special attributes and methods that Python provides
Single underscore | _ | Indicates a temporary or throwaway variable

Note that only two of these naming conventions enforce specific Python behaviors. Using double leading underscores triggers name mangling in Python classes. You’ll learn more about this behavior in the section on name mangling.
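Here’s a short example of that behavior:

class Account:
    def __init__(self):
        self.__balance = 0  # the double leading underscore triggers name mangling

account = Account()
print(account._Account__balance)  # 0 -- the mangled name Python actually stores
# print(account.__balance)        # would raise AttributeError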

Additionally, those names with double leading and trailing underscores that are listed in the Python data model trigger internal behaviors in specific contexts. You’ll also learn more about this topic in the section on dunder names in Python.
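For example, implementing the special method .__len__() is what lets the built-in len() function work with instances of your class:

class Playlist:
    def __init__(self, songs):
        self.songs = list(songs)

    def __len__(self):
        return len(self.songs)

print(len(Playlist(["a", "b", "c"])))  # 3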

Note: Python also treats a single underscore (_) as a soft keyword within match ... case statements. You’ll learn more about this later.

Read the full article at https://realpython.com/python-double-underscore/ »


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

August 18, 2025 02:00 PM UTC


PyCharm

The State of Python 2025

August 18, 2025 11:42 AM UTC


Python Bytes

#445 Auto-activate Python virtual environments for any project

Topics covered in this episode:

  • pyx - optimized backend for uv
  • Litestar is worth a look
  • Django remake migrations
  • django-chronos
  • Extras
  • Joke

Watch on YouTube: https://www.youtube.com/watch?v=hOv2AA-dcs4

About the show

Python Bytes 445

Sponsored by Sentry: pythonbytes.fm/sentry - Python Error and Performance Monitoring

Connect with the hosts

  • Michael: @mkennedy@fosstodon.org / @mkennedy.codes (bsky)
  • Brian: @brianokken@fosstodon.org / @brianokken.bsky.social
  • Show: @pythonbytes@fosstodon.org / @pythonbytes.fm (bsky)

Join us on YouTube at pythonbytes.fm/live to be part of the audience. Usually Monday at 10am PT. Older video versions available there too.

Finally, if you want an artisanal, hand-crafted digest of every week of the show notes in email form, add your name and email to our friends of the show list; we'll never share it.

Michael #1: pyx - optimized backend for uv

  • via John Hagen (thanks again)
  • I’ll be interviewing Charlie in 9 days on Talk Python → sign up to get notified of the livestream: https://www.youtube.com/watch?v=YKBcgBgK7gc
  • Not a PyPI replacement, more of a middleware layer to make it better, faster, stronger.
  • pyx is a paid service, with maybe a free option eventually.

Brian #2: Litestar is worth a look

  • James Bennett
  • Michael brought up Litestar in episode 444 when talking about rewriting Talk Python in Quart
  • James brings up:
    - scaling: Litestar is easy to split an app into multiple files
    - not using pydantic: you can use pydantic with Litestar, but you don’t have to; maybe attrs is right for you instead
  • Michael brought up:
    - Litestar seems like a “more batteries included” option
    - somewhere between FastAPI and Django

Brian #3: Django remake migrations

  • Suggested by Bruno Alla on Bluesky
  • In response to a migrations topic last week
  • django-remake-migrations is a tool to help you with migrations, and the docs do a great job of describing the problem way better than I did last week
  • “The built-in squashmigrations command is great, but it only work on a single app at a time, which means that you need to run it for each app in your project. On a project with enough cross-apps dependencies, it can be tricky to run.”
  • “This command aims at solving this problem, by recreating all the migration files in the whole project, from scratch, and mark them as applied by using the replaces attribute.”
  • Also of note:
    - The package was created with Copier
    - Michael brought up Copier in 2021 in episode 219
    - It has a nice comparison table with Cookiecutter and Yeoman
    - One difference from Cookiecutter is YAML vs JSON; I’m not a huge fan of handwriting either, but I’d rather hand-write YAML
    - So I’m thinking of trying Copier for my future project template needs

Michael #4: django-chronos

  • Django middleware that shows you how fast your pages load, right in your browser.
  • Displays request timing and query counts for your views and middleware.
  • Times middleware, view, and total per request (CPU and DB).

Extras

Brian:

  • Test & Code 238: So Long, and Thanks for All the Fish — after 10 years, this is the goodbye episode

Michael:

  • Auto-activate Python virtual environment for any project with a venv directory in your shell (macOS/Linux): https://gist.github.com/mikeckennedy/6c6fd9191879fba77a334f13aa4ad3d3
  • Python 3.13.6 is out
  • Open weight OpenAI models
  • Just Enough Python for Data Scientists course
  • The State of Python 2025 article by Michael

Joke: python is better than java

August 18, 2025 08:00 AM UTC


Armin Ronacher

Your MCP Doesn’t Need 30 Tools: It Needs Code

I wrote a while back about why code performs better than MCP (Model Context Protocol) for some tasks. In particular, I pointed out that if you have command line tools available, agentic coding tools seem very happy to use those. In the meantime, I learned a few more things that put some nuance to this. There are a handful of challenges with CLI-based tools that are rather hard to resolve and require further examination.

In this blog post, I want to present the (not so novel) idea that an interesting approach is using MCP servers exposing a single tool, that accepts programming code as tool inputs.

CLI Challenges

The first and most obvious challenge with CLI tools is that they are sometimes platform-dependent, version-dependent, and at times undocumented. This has meant that I routinely encounter failures on first use of a tool.

A good example of this is when the tool usage requires non-ASCII string inputs. For instance, Sonnet and Opus are both sometimes unsure how to feed newlines or control characters via shell arguments. This is unfortunate but ironically not entirely unique to shell tools either. For instance, when you program with C and compile it, trailing newlines are needed. At times, agentic coding tools really struggle with appending an empty line to the end of a file, and you can find some quite impressive tool loops to work around this issue.

This becomes particularly frustrating when your tool is absolutely not in the training set and uses unknown syntax. In that case, getting agents to use it can become quite a frustrating experience.

Another issue is that in some agents (Claude Code in particular), there is an extra pass taking place for shell invocations: the security preflight. Before executing a tool, Claude also runs it through the fast Haiku model to determine if the tool will do something dangerous and avoid the invocation. This further slows down tool use when multiple turns are needed.

In general, doing multiple turns is very hard with CLI tools because you need to teach the agent how to manage sessions. A good example of this is when you ask it to use tmux for remote-controlling an LLDB session. It’s absolutely capable of doing it, but it can lose track of the state of its tmux session. During some tests, I ended up with it renaming the session halfway through, forgetting that it had a session (and thus not killing it).

This is particularly frustrating because the failure case can be that it starts from scratch or moves on to other tools just because it got a small detail wrong.

Composability

Unfortunately, when moving to MCP, you immediately lose the ability to compose without inference (at least today). One of the reasons lldb can be remote-controlled with tmux at all is that the agent manages to compose quite well. How does it do that? It uses basic tmux commands such as tmux send-keys to send inputs or tmux capture-pane to get the output, which don’t require a lot of extra tooling. It then chains commands like sleep and tmux capture-pane to ensure it doesn’t read output too early. Likewise, when it starts to fail with encoding more complex characters, it sometimes changes its approach and might even use base64 -d.

The command line really isn’t just one tool — it’s a series of tools that can be composed through a programming language: bash. The most interesting uses are when you ask it to write tools that it can reuse later. It will start composing large scripts out of these one-liners. All of that is hard with MCP today.

Better Approach To MCP?

It’s very clear that there are limits to what these shell tools can do. At some point, you start to fight those tools. They are in many ways only as good as their user interface, and some of these user interfaces are just inherently tricky. For instance, when evaluated, tmux performs better than GNU screen, largely because the command-line interface of tmux is better and less error-prone. But either way, it requires the agent to maintain a stateful session, and it’s not particularly good at this today.

What is stateful out of the box, however, is MCP. One surprisingly useful approach is an MCP server that exposes a single tool (the ubertool), which is just a Python interpreter that runs eval() with retained state. It maintains state in the background and exposes an interface that the agent already knows how to use.
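
To make that idea concrete, here is a minimal, hypothetical sketch of such a stateful ubertool core (this is not the pexpect-mcp source, just an illustration): a single namespace dictionary survives across calls, so each snippet the agent sends sees whatever earlier snippets defined. Wiring a class like this into an actual MCP server, with pexpect installed in its virtualenv, is assumed.

import io
import traceback
from contextlib import redirect_stdout

class PythonSession:
    """A tiny exec loop whose state persists between tool calls."""

    def __init__(self):
        self.namespace = {}  # variables defined in one call remain visible in the next

    def run(self, code: str) -> str:
        buffer = io.StringIO()
        try:
            with redirect_stdout(buffer):
                exec(code, self.namespace)
        except Exception:
            return buffer.getvalue() + traceback.format_exc()
        return buffer.getvalue() or "Code executed successfully"

# What two consecutive tool calls might look like:
session = PythonSession()
session.run("import pexpect; child = pexpect.spawn('lldb ./demo-buggy')")
print(session.run("child.expect('(lldb)'); print('LLDB started')"))  # 'child' persists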

I did this experiment in a few ways now, the one that is public is pexpect-mcp. It’s an MCP that exposes a single tool called pexpect_tool. It is, however, in many ways a misnomer. It’s not really a pexpect tool — it’s a Python interpreter running out of a virtualenv that has pexpect installed.

What is pexpect? It is the Python port of the ancient expect command-line tool which allows one to interact with command-line programs through scripts. The documentation describes expect as a “program that ‘talks’ to other interactive programs according to a script.”

What is special about pexpect is that it’s old, has a stable API, and has been used all over the place. You could wrap expect or pexpect with lots of different MCP tools like pexpect_expect, pexpect_sendline, pexpect_spawn, and more. That’s because the pexpect.spawn class exposes 36 different API functions! That’s a lot. But many of these cannot be used in isolation well anyway. Take this motivating example from the docs:

import pexpect

# From the pexpect docs; assumes mypassword is defined elsewhere.
child = pexpect.spawn('scp foo user@example.com:.')
child.expect('Password:')
child.sendline(mypassword)

Even the most basic use here involves three chained tool calls. And that doesn’t include error handling, which one might also want to encode.

So instead, a much more interesting way to have this entire thing run is to just have the command language to the MCP be Python. The MCP server turns into a stateful Python interpreter, and the tool just lets it send Python code that is evaluated with the same state as before. There is some extra support in the MCP server to make the experience more reliable (like timeout support), but for the most part, the interface is to just send Python code. In fact, the exact script from above is what an MCP client is expected to send.

The tool description just says this:

Execute Python code in a pexpect session. Can spawn processes and interact with
them.

Args:
  `code`: Python code to execute. Use 'child' variable to interact with the
  spawned process. The pexpect library is already imported. Use
  `pexpect.spawn(...)` to spawn something.
  `timeout`: Optional timeout in seconds. If not provided, uses global `TIMEOUT`
  (default 30s).

Example:
  child = pexpect.spawn('lldb ./mytool')
  child.expect("(lldb)")

Returns:
  The result of the code execution or an error message.

This works because the interface to the MCP is now not just individual tools it has never seen — it’s a programming language that it understands very well, with additional access to an SDK (pexpect) that it has also seen and learned all the patterns from. We’re relegating the MCP to do the thing that it does really well: session management and guiding the tool through a built-in prompt.

More importantly, the code that it writes is very similar to what it might put into a reusable script. There is so little plumbing in the actual MCP that you can tell the agent after the session to write a reusable pexpect script from what it learned in the session. That works because all the commands it ran are just Python — they’re still in the context, and the lift from that to a reusable Python script is low.

Do It, Then Do It Again

Now I don’t want to bore you too much with lots of Claude output, but I took a crashing demo app that Mario wrote and asked it to debug with LLDB through pexpect_tool. Here is what that looked like:

Expand to see entire session
> Please debug ./demo-buggy with lldb

⏺ I'll help you debug the ./demo-buggy executable with lldb. Let me start by examining what we have and then launch lldb to debug it.

⏺ List(~/Development/pexpect-mcp)
  ⎿  Listed 28 paths (ctrl+r to expand)

⏺ Now let me use the pexpect tool to start an lldb debugging session with the demo-buggy executable:

⏺ pexpect - pexpect_tool (MCP)(code: "import pexpect\n\n# Spawn lldb with the demo-buggy executable\nchild = pexpect.spawn('lldb
                              ./demo-buggy')\nchild.expect(\"(lldb)\")\nprint(\"LLDB started successfully\")\nprint(child.before.decode() + child.after.decode())")
  ⎿  Code executed successfully

⏺ pexpect - pexpect_tool (MCP)(code: "# Let's run the program to see what happens\nchild.sendline('run')\nchild.expect_list([pexpect.TIMEOUT, \"(lldb)\",
                              \"Process.*exited\"], timeout=5)\nprint(\"Output after 'run' command:\")\nprint(child.before.decode() + child.after.decode())")
  ⎿  Error: 'str' object has no attribute 'search'

⏺ Let me fix the pexpect usage:

⏺ pexpect - pexpect_tool (MCP)(code: "# Let's run the program to see what happens\nchild.sendline('run')\ntry:\n    index = child.expect([pexpect.TIMEOUT,
                              pexpect.EOF, \"(lldb)\", \"Process.*exited\"], timeout=5)\n    print(f\"Matched pattern index: {index}\")\n    print(\"Output after
                              'run' command:\")\n    print(child.before.decode() + child.after.decode())\nexcept Exception as e:\n    print(f\"Exception: {e}\")\n
                                 print(\"Current buffer:\")\n    print(child.before.decode() if child.before else \"No before content\")")
  ⎿  Code executed successfully

⏺ pexpect - pexpect_tool (MCP)(code: "# Let's check what's in the buffer and set a breakpoint first\nchild.sendline('breakpoint set --name
                              main')\nchild.expect(\"(lldb)\")\nprint(\"Breakpoint set output:\")\nprint(child.before.decode() + child.after.decode())")
  ⎿  Code executed successfully

⏺ pexpect - pexpect_tool (MCP)(code: "# Now run the program\nchild.sendline('run')\nchild.expect(\"(lldb)\")\nprint(\"Run output:\")\nprint(child.before.decode() +
                               child.after.decode())")
  ⎿  Code executed successfully

⏺ pexpect - pexpect_tool (MCP)(code: "# Let's examine the source code first to understand what we're debugging\nchild.sendline('source
                              list')\nchild.expect(\"(lldb)\")\nprint(\"Source listing:\")\nprint(child.before.decode() + child.after.decode())")
  ⎿  Code executed successfully

⏺ Let me also look at the source code directly to understand what we're debugging:

⏺ Read(demo-buggy.c)
  ⎿  Read 77 lines (ctrl+r to expand)

⏺ Perfect! Now I can see the bugs in the code. Let me continue with the lldb debugging session to demonstrate finding these issues:

⏺ pexpect - pexpect_tool (MCP)(code: "# Continue execution to hit the crash\nchild.sendline('continue')\nchild.expect(\"(lldb)\")\nprint(\"Continue
                              output:\")\nprint(child.before.decode() + child.after.decode())")
  ⎿  Code executed successfully

⏺ pexpect - pexpect_tool (MCP)(code: "# Let's examine the backtrace to see where the crash
                              occurred\nchild.sendline('bt')\nchild.expect(\"(lldb)\")\nprint(\"Backtrace:\")\nprint(child.before.decode() +
                              child.after.decode())")
  ⎿  Code executed successfully

⏺ pexpect - pexpect_tool (MCP)(code: "# Let's examine the current frame and variables\nchild.sendline('frame variable')\nchild.expect(\"(lldb)\")\nprint(\"Frame
                              variables:\")\nprint(child.before.decode() + child.after.decode())")
  ⎿  Code executed successfully

⏺ pexpect - pexpect_tool (MCP)(code: "# Let's look at the student structure to see the NULL pointer\nchild.sendline('p
                              *s')\nchild.expect(\"(lldb)\")\nprint(\"Student structure:\")\nprint(child.before.decode() + child.after.decode())")
  ⎿  Code executed successfully

⏺ pexpect - pexpect_tool (MCP)(code: "# Let's specifically look at the scores pointer\nchild.sendline('p s->scores')\nchild.expect(\"(lldb)\")\nprint(\"Scores
                              pointer:\")\nprint(child.before.decode() + child.after.decode())")
  ⎿  Code executed successfully

⏺ pexpect - pexpect_tool (MCP)(code: "# Let's also examine the disassembly around the crash\nchild.sendline('disassemble
                              --frame')\nchild.expect(\"(lldb)\")\nprint(\"Disassembly:\")\nprint(child.before.decode() + child.after.decode())")
  ⎿  Code executed successfully

⏺ pexpect - pexpect_tool (MCP)(code: "# Exit lldb\nchild.sendline('quit')\nchild.expect(pexpect.EOF)\nprint(\"LLDB session ended\")")
  ⎿  Error: Operation timed out after 15 seconds

⏺ pexpect - pexpect_tool (MCP)(code: "# Force quit\nchild.close()\nprint(\"LLDB session closed\")")
  ⎿  Code executed successfully

⏺ Based on the debugging session with lldb, I've identified the main issues in ./demo-buggy:

  Key Findings:

  1. Segmentation Fault at line 26 (demo-buggy.c:26): The program crashes when trying to write to s->scores[s->num_scores] because s->scores is NULL.
  2. Root Cause: In create_student() function at line 19, the scores array allocation is commented out and set to NULL instead:
  s->scores = NULL;  // This will cause a segfault when we try to write to it
  3. Additional Bug: In calculate_average() at line 34, there's an off-by-one error:
  for (int i = 0; i <= s->num_scores; i++) {  // BUG: <= should be <

  Debug Session Summary:

  - The program crashes immediately when add_score(alice, 95.5) is called
  - The backtrace shows the crash occurs in the add_score function
  - The s->scores pointer is NULL, causing the segfault when dereferencing
  - The crash happens before reaching the second bug in calculate_average()

  The lldb session successfully identified the exact location and cause of the crash, demonstrating how to use the debugger to trace segmentation faults back to
  their source.

Afterwards I asked it to dump it into a reusable Python script to be run later:

The prompt that wrote the script
> now that you wrote this pexpect script, could you dump out a python script
  file with those pscript commands so we can use it again later?

⏺ Write(debug_demo.py)
  ⎿  Wrote 95 lines to debug_demo.py
     #!/usr/bin/env python3
     """
     Debug script for demo-buggy using pexpect and lldb.
     This script automates the debugging session to identify the segfault.
     """
     import pexpect
     import sys
     def debug_demo_buggy():
     … +85 lines (ctrl+r to expand)

⏺ Created debug_demo.py - you can now run python debug_demo.py to
   replay the entire debugging session automatically.

And from a fresh session we can ask it to execute it once more:

> please run debug_demo.py with uv run to figure out what's going on

⏺ Bash(uv run debug_demo.py)
  ⎿  Starting lldb debugging session for ./demo-buggy
     ✓ LLDB started successfully
     … +155 lines (ctrl+r to expand)

⏺ The debug script successfully ran and identified the issue! Here's what it found:

  Main Problem: The demo-buggy executable crashes with a segfault in the add_score()
  function because s->scores is NULL (not allocated memory).

  Root Cause: Line 18 in create_student() is commented out, which means the scores
  array is never allocated.

  Fix: Uncomment line 18 in the create_student() function to allocate memory for the
  scores array.

  Additional Issue: There's also an off-by-one error in the calculate_average()
  function.

  The debugging session shows that:
  1. LLDB successfully launched and set a breakpoint at main
  2. The program stopped at the breakpoint
  3. It identified that alice and bob Student pointers were created
  4. The crash occurs when trying to access unallocated memory for scores

  This demonstrates that the pexpect MCP tool is working correctly for debugging C
  programs with LLDB!

That again works because the code it writes into the MCP is very close to the code that it would write into a Python script. And the difference is meaningful. The initial debug takes about 45 seconds on my machine and uses about 7 tool calls. The re-run with the dumped playbook takes one tool call and finishes in less than 5 seconds. Most importantly: that script is standalone. I can run it as a human, even without the MCP!

Novel Things

Now the above example works beautifully because these models just know so much about pexpect. That’s hardly surprising in a way. So how well does this work when the code that it should write is entirely unknown to it? Well, not quite as well. However, and this is the key part, because the meta input language is Python, it means that the total surface area that can be exposed from an ubertool is pretty impressive.

A general challenge with MCP today is that the more tools you have, the more you’re contributing to context rot. You’re also limited to rather low amounts of input. On the other hand, if you have an MCP that exposes a programming language, it also indirectly exposes a lot of functionality that it knows from its training.

For instance, one of the really neat parts about this is that it knows dir(), globals(), repr(), and other stuff. Heck, it even knows about sys._getframe(). This means that you can give it very rudimentary instructions about how its sandbox operates and what it might want to do to learn more about what is available to it as needed. You can also tell it in the prompt that there is a function it can run to learn more about what’s available when it needs help!
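
To illustrate, the kind of orientation code an agent might run first inside such a sandbox is nothing exotic; it is ordinary Python introspection, and nothing here is specific to any particular ubertool:

print([name for name in globals() if not name.startswith("_")])  # what has been defined so far
print(dir())   # names bound in the current scope
help(dir)      # built-in documentation is available too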

So when you build something that is completely novel, at least the programming language is known. You can, for instance, write a tiny MCP that dumps out the internal state of your application, provides basic query helpers for your database that support your sharding setup, or provides data reading APIs. It will discover all of this anyway from reading the code, but now it can also use a stateful Python or JavaScript session to run these tools and explore more.

This is also a fun feature when you want to ask the agent to debug the MCP itself. Because Python and JavaScript are so powerful, you can, for instance, also ask it to debug the MCP’s state itself when something went wrong.

Security Is A Sham

The elephant in the room for all things agentic coding is security. Claude mostly doesn’t delete your machine and maybe part of that is the Haiku preflight security check. But isn’t all of this a sham anyway? I generally love to watch how Claude and other agents maneuver their way around protections in pretty creative ways. Clearly it’s potent and prompt-injectable. By building an MCP that just runs eval(), we might be getting rid of some of the remaining safety here.

But does it matter? We are seemingly okay with it writing code and running tests, which is the same kind of bad as running eval(). I’m sure the day of reckoning will come for all of us, but right now we’re living in this world where protections don’t matter and we can explore what these things can do.

I’m honestly not sure how to best protect these things. They are pretty special in that they are just inherently unsafe and impossible to secure. Maybe the way to really protect them would be to intercept every system call and have some sort of policy framework/sandbox around the whole thing. But even in that case, what prevents an ever more clever LLM from circumventing all these things? It has internet access, it can be prompt-injected, and all interfaces we have for them are just too low-level to support protection well.

So to some degree, I think the tail risks of code execution are here to stay. But I would argue that they are not dramatically worse when the MCP executes Python code. In this particular case, consider that pexpect itself runs programs. There is little point in securing the MCP if what the MCP can run is any bash command.

Taking It Further

As interesting as the pexpect case is, that was not my original motivation. What I started to look into is replacing Playwright’s MCP with an MCP that just exposes the Playwright API via JavaScript. This is an experiment I have been running for a while, and the results are somewhat promising but also not promising enough yet. If you want to play with it, the MCP is called “playwrightess” and is pretty simple. It just lets it execute JavaScript code against a sync playwright client. Same idea. Here, the tool usage is particularly nice because it gets down from ~30 tool definitions to 1:

Playwrightess Session Output
> navigate to localhost:5001 and summarize all the article links you see

⏺ I'll navigate to localhost:5001 and summarize the article links I find there.

  playwrightess-mcp - playwright_eval (MCP)(code: "await page.goto('http://localhost:5001')")
  ⎿  {
       "success": true,
       "result": "undefined",
     … +5 lines (ctrl+r to expand)

  playwrightess-mcp - playwright_eval (MCP)(code: "const articleLinks = await page.$$eval('a[href*=\"article\"],
                                           a[href*=\"/post\"], a[href*=\"/blog\"], article a, .article a, .post a, .blog a',
                                            links => links.map(link => ({ text: link.textContent?.trim(), href: link.href
                                           }))).catch(() => []); if (articleLinks.length === 0) { const allLinks = await
                                           page.$$eval('a', links => links.map(link => ({ text: link.textContent?.trim(),
                                           href: link.href })).filter(link => link.text && link.text.length > 5));
                                           console.log('All links found:', allLinks); return allLinks; } return
                                           articleLinks;")
  ⎿  {
       "success": true,
       "result": [
     … +101 lines (ctrl+r to expand)

⏺ Here are the article links from localhost:5001:

  Recent Blog Posts:
  - Your MCP Doesn't Need 30 Tools: It Needs Code (2025/8/18) - /2025/8/18/code-mcps/
  - In Support Of Shitty Types (2025/8/4) - /2025/8/4/shitty-types/
  - Agentic Coding Things That Didn't Work (2025/7/30) - /2025/7/30/things-that-didnt-work/
  - From Async/Await to Virtual Threads (2025/7/26) - /2025/7/26/virtual-threads/
  - Welcoming The Next Generation of Programmers (2025/7/20) - /2025/7/20/the-next-generation/
  - Tools: Code Is All You Need (2025/7/3) - /2025/7/3/tools/
  - My First Open Source AI Generated Library (2025/6/21) - /2025/6/21/my-first-ai-library/
  - We Can Just Measure Things (2025/6/17) - /2025/6/17/measuring/
  - Agentic Coding Recommendations (2025/6/12) - /2025/6/12/agentic-coding/
  - GenAI Criticism and Moral Quandaries (2025/6/10) - /2025/6/10/genai-criticism/

  The site appears to be Armin Ronacher's blog focused on programming topics, particularly around AI/ML tools, coding
  practices, and software development. There's also a "Next »" link indicating more articles on page 2.

The other thing that is just much nicer about this approach is how many more ways it has to funnel data out. For instance, console.log output from both the browser and the Playwright script is forwarded back to the agent automatically; there is no need for the agent to ask for that information. It also has a state variable that it can use to accumulate extra information between calls, which it uses liberally if you ask it, for instance, to collect data from multiple pages of a paginated listing. It can do that without any further inference, because the loop happens within JavaScript.

Same with pexpect — you can easily get it to dump out a script for later that circumvents a lot of MCP calls with something it already saw. Particularly when you are debugging a gnarly issue and you need to restart the debugging more than once, that shows some promise. Does it perform better than Playwright MCP? Not in the current form, but I want to see if this idea can be taken further. It is quite verbose in the scripts that it writes, and it is not really well tuned between screenshots and text extraction.

August 18, 2025 12:00 AM UTC

August 17, 2025


Seth Michael Larson

How many RSS subscribers do I have?

RSS is a super rad way to consume internet content (“like a newspaper”). This blog gets syndicated via RSS and an email newsletter. Unlike with my newsletter, it's not clear how many people are reading my blog using RSS. That's a good thing: privacy is important, and I don't need to know who you are to enjoy my blog :)

But what if I was interested in a rough number of subscribers to the RSS feed? Turns out RSS feed scrapers sometimes include the number of subscribers in their User-Agent HTTP header. Like this:

User-Agent: Feedly/1.0 (poller; 131 subscribers;)

Multiple RSS reader scrapers do this, including Inoreader, Feedly, Feedbin, Newsblur, Old Reader, and a few more. So if I download the access logs for my RSS feed URLs I can approximate the number of readers using this Python script:

import re

# Assuming you can parse a list of 'User-Agents'
# from your logs: put them here.
user_agents = [...]

subs_re = re.compile(r"([0-9]+)\s+subscribers?")
subs_per_feed = {}
for user_agent in user_agents:
    # Skip user agents that don't report a subscriber count.
    match = subs_re.search(user_agent)
    if match is None:
        continue
    # Count subscribers and deduplicate feeds
    # by the remaining text in 'User-Agent':
    subs = int(match.group(1))
    feed_id = re.sub(subs_re, "", user_agent)
    subs_per_feed[feed_id] = max(subs_per_feed.get(feed_id, 0), subs)

print(subs_per_feed)
print(sum(subs_per_feed.values()))

For my own RSS feed this gives a value of 257 known RSS subscribers.
Thanks for reading via RSS! 👋



Thanks for keeping RSS alive! ♥

August 17, 2025 12:00 AM UTC

August 16, 2025


Made With Mu

RIP Mu

Late last year we announced we’d retire Mu. The core maintainers have all moved on to other things, our lives have changed, and the time we have available to maintain Mu has significantly decreased. Perhaps most of all, the world has moved on: when Mu started we were unique in the coding ecosystem. Now there are plenty of development environments focused on beginners.

We also promised we’d try to cut a final release.

Sadly, we’ve collectively decided we will not be able to do this.

Why?

Well, the cost of hosting websites (mostly the domain registration fees), the price of digital certificates for signing the installers, the annual fee to register for the privilege of participating on a platform (we’re looking at you Apple) and the time needed to investigate, refine and update code to work with the latest versions of other projects in the Python ecosystem are all inordinately expensive in terms of time and money. Were I (Nicholas) to pay all the financial burdens mentioned above, I estimate I’d have to pay around £1000. The cost in personal free time (that none of us have) for the development work is significant since this is deeply technical stuff we shoulder so you, the end user, don’t have to.

Yes, Mu is free software. No, Mu is not free software.

Let’s just say it’s complicated, shall we..? ;-)

Therefore the core maintainers have come to the decision to gently step away from Mu with immediate effect.

What happens next?

That’s it!

Wishing you all feelings of fulfilment as you flourish through your journey in code. We, the Mu core developers, sincerely hope you use your technical skills for fun things that enlarge our world in a humane, compassionate and thoughtful way.

Peace,

Carlos, Tiago, Tim, Vasco and Nicholas.

(The Mu core developers.)

August 16, 2025 05:00 PM UTC

August 15, 2025


Test and Code

238: So Long, and Thanks for All the Fish

A farewell to a fun 10 years.

Also, I should have tested it better. :)
In the audio I got the numbers wrong.  Doh!
This is episode 238, not 237. Oh well.

I'll still be around, of course, at:

  • pythontest.com - where I write about developing software with tests
  • pythonbytes.fm - Python news and headlines, delivered directly to your earbuds

Thanks for all the fun over the last 10 years.
I wish you the best.


Keep in touch

  • Sign up for the PythonTest Newsletter
  • Join the PythonTest Community
  • Or learn pytest with a pytest course or the purple rocket pytest book

★ Support this podcast on Patreon ★

August 15, 2025 08:20 PM UTC


John Cook

Converting very long strings to integers in Python

In the process of writing the previous post, I wanted to confirm that the number in the post really is prime. This was useful in debugging my manual conversion of the image to text: errors did not result in a prime number. For example, I didn’t see the 9’s in the image at first, and I didn’t get a prime number until I changed four of the 8’s to 9’s.

But there was a problem along the way. Simply converting the string to an integer didn’t work. It produced the following error:

SyntaxError: Exceeds the limit (4300) for integer string conversion: value has 5382 digits; use sys.set_int_max_str_digits() to increase the limit – Consider hexadecimal for huge integer literals to avoid decimal conversion limits.

Note that the limitation is not on the size of a Python integer. The only limitation on the size of an integer is available memory. But there is a limitation on the size of string that can be converted to an integer.
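
As an aside, the limit applies when a string is converted to an integer at runtime, and it can be raised with sys.set_int_max_str_digits(); a huge decimal literal in the source file, by contrast, is checked when the file is compiled, before any such call runs, which may be why the suggested fix falls short. A minimal sketch of the runtime case, assuming Python 3.11 or newer:

import sys

digits = "7" * 5000  # a 5000-digit decimal string, well over the default 4300-digit limit

try:
    int(digits)
except ValueError as exc:
    print(exc)  # Exceeds the limit (4300 digits) for integer string conversion ...

sys.set_int_max_str_digits(6000)  # raise the limit; the same conversion now succeeds
n = int(digits)
print(n % 10)  # 7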

The fix suggested in the error message didn’t work. But storing the number as several strings, i.e. each row of the image, and doing my own radix conversion did work.

from sympy import isprime

flaglines = [
    "888888888888888888888888888888888888888888888888888883333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "888881118888811188888111888881118888811188888111888883333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "888881118888811188888111888881118888811188888111888883333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "888888888111888881118888811188888111888881118888888881111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "888888888111888881118888811188888111888881118888888881111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "888881118888811188888111888881118888811188888111888881111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "888881118888811188888111888881118888811188888111888883333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "888888888111888881118888811188988111888881118888888883333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "888888888111888881118898811188888111888881118888888883333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "888881118888811188888111888881118888811188888111888881111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "888881118888811188888111888881118888811188888111888881111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "888888888111889881118888811188888111888881118888888881111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "888888888111888881118898811188888111888881118888888883333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "888881118888811188888111888881118888811188888111888883333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "888881118888811188888111888881118888811188888111888883333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "888888888111888881118888811188888111888881118888888881111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "888888888111888881118888811188888111888881118888888881111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "888881118888811188888111888881118888811188888111888881111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "888881118888811188888111888881118888811188888111888883333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "888888888888888888888888888888888888888888888888888883333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "888888888888888888888888888888888888888888888888888883333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
]

m = int(flaglines[0])
for i in range(1, len(flaglines)):
    line = flaglines[i]
    x = int(line)
    # Shift the accumulated value left by this row's digit count, then append the row.
    m = m * 10**len(line) + x
print(isprime(m))

The post Converting very long strings to integers in Python first appeared on John D. Cook.

August 15, 2025 06:45 PM UTC


Real Python

The Real Python Podcast – Episode #261: Selecting Inheritance or Composition in Python

When considering an object-oriented programming problem, should you prefer inheritance or composition? Why wouldn't it just be simpler to use functions? Christopher Trudeau is back on the show this week, bringing another batch of PyCoder's Weekly articles and projects.



August 15, 2025 12:00 PM UTC


Glyph Lefkowitz

The Futzing Fraction

The most optimistic vision of generative AI1 is that it will relieve us of the tedious, repetitive elements of knowledge work so that we can get to work on the really interesting problems that such tedium stands in the way of. Even if you fully believe in this vision, it’s hard to deny that today, some tedium is associated with the process of using generative AI itself.

Generative AI also isn’t free, and so, as responsible consumers, we need to ask: is it worth it? What’s the ROI of genAI, and how can we tell? In this post, I’d like to explore a logical framework for evaluating genAI expenditures, to determine if your organization is getting its money’s worth.

Perpetually Proffering Permuted Prompts

I think most LLM users would agree with me that a typical workflow with an LLM rarely involves prompting it only one time and getting a perfectly useful answer that solves the whole problem.

Generative AI best practices, even from the most optimistic vendors, all suggest that you should continuously evaluate everything. ChatGPT, which is really the only genAI product with significantly scaled adoption, still says at the bottom of every interaction:

ChatGPT can make mistakes. Check important info.

If we have to “check important info” on every interaction, it stands to reason that even if we think it’s useful, some of those checks will find an error. Again, if we think it’s useful, presumably the next thing to do is to perturb our prompt somehow, and issue it again, in the hopes that the next invocation will, by dint of either:

  1. better luck this time with the stochastic aspect of the inference process,
  2. enhanced application of our skill to engineer a better prompt based on the deficiencies of the current inference, or
  3. better performance of the model by populating additional context in subsequent chained prompts.

Unfortunately, given the relative lack of reliable methods to re-generate the prompt and receive a better answer2, checking the output and re-prompting the model can feel like just kinda futzing around with it. You try, you get a wrong answer, you try a few more times, eventually you get the right answer that you wanted in the first place. It’s a somewhat unsatisfying process, but if you get the right answer eventually, it does feel like progress, and you didn’t need to use up another human’s time.

In fact, the hottest buzzword of the last hype cycle is “agentic”. While I have my own feelings about this particular word3, its current practical definition is “a generative AI system which automates the process of re-prompting itself, by having a deterministic program evaluate its outputs for correctness”.

A better term for an “agentic” system would be a “self-futzing system”.

However, the ability to automate some level of checking and re-prompting does not mean that you can fully delegate tasks to an agentic tool, either. It is, plainly put, not safe. If you leave the AI on its own, you will get terrible results that will at best make for a funny story45 and at worst might end up causing serious damage67.

Taken together, this all means that for any consequential task that you want to accomplish with genAI, you need an expert human in the loop. The human must be capable of independently doing the job that the genAI system is being asked to accomplish.

When the genAI guesses correctly and produces usable output, some of the human’s time will be saved. When the genAI guesses wrong and produces hallucinatory gibberish or even “correct” output that nevertheless fails to account for some unstated but necessary property such as security or scale, some of the human’s time will be wasted evaluating it and re-trying it.

Income from Investment in Inference

Let’s evaluate an abstract, hypothetical genAI system that can automate some work for our organization. To avoid implicating any specific vendor, let’s call the system “Mallory”.

Is Mallory worth the money? How can we know?

Logically, there are only two outcomes that might result from using Mallory to do our work.

  1. We prompt Mallory to do some work; we check its work, it is correct, and some time is saved.
  2. We prompt Mallory to do some work; we check its work, it fails, and we futz around with the result; this time is wasted.

As a logical framework, this makes sense, but ROI is an arithmetical concept, not a logical one. So let’s translate this into some terms.

In order to evaluate Mallory, let’s define the Futzing Fraction, “FF”, in terms of the following variables:

H

the average amount of time a Human worker would take to do a task, unaided by Mallory

I

the amount of time that Mallory takes to run one Inference8

C

the amount of time that a human has to spend Checking Mallory’s output for each inference

P

the Probability that Mallory will produce a correct inference for each prompt

W

the average amount of time that it takes for a human to Write one prompt for Mallory

E

since we are normalizing everything to time, rather than money, we do also have to account for the dollar cost of Mallory as a product, so we will include the Equivalent amount of human time we could purchase for the marginal cost of one9 inference.

As in last week’s example of simple ROI arithmetic, we will put our costs in the numerator, and our benefits in the denominator.

FF = (W + I + C + E) / (P × H)

The idea here is that for each prompt, the minimum amount of time-equivalent cost possible is W+I+C+E. The user must, at least once, write a prompt, wait for inference to run, then check the output; and, of course, pay any costs to Mallory’s vendor.

If the probability of a correct answer is P = 1/3, then they will do this entire process 3 times10, so we put P in the denominator. Finally, we divide everything by H, because we are trying to determine if we are actually saving any time or money, versus just letting our existing human, who has to be driving this process anyway, do the whole thing.

If the Futzing Fraction evaluates to a number greater than 1, as previously discussed, you are a bozo; you’re spending more time futzing with Mallory than getting value out of it.
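
As a purely illustrative back-of-the-envelope computation (every number below is invented; substitute your own measurements):

# Illustrative Futzing Fraction; all values are made-up minutes and probabilities.
H = 240    # minutes an unaided human would need for the task
W = 10     # minutes to write one prompt
I = 2      # minutes of inference time per attempt
C = 20     # minutes spent checking each attempt's output
E = 1      # human-time equivalent of the marginal cost of one inference
P = 1 / 3  # probability that a single attempt is correct

FF = (W + I + C + E) / (P * H)
print(f"FF = {FF:.2f}")  # 0.41 here: below 1, so Mallory is paying for itself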

Figuring out the Fraction is Frustrating

In order to even evaluate the value of the Futzing Fraction though, you have to have a sound method to even get a vague sense of all the terms.

If you are a business leader, a lot of this is relatively easy to measure. You vaguely know what H is, because you know what your payroll costs, and similarly, you can figure out E with some pretty trivial arithmetic based on Mallory’s pricing table. There are endless YouTube channels, spec sheets and benchmarks to give you I. W is probably going to be so small compared to H that it hardly merits consideration11.

But, are you measuring C? If your employees are not checking the outputs of the AI, you’re on a path to catastrophe that no ROI calculation can capture, so it had better be greater than zero.

Are you measuring P? How often does the AI get it right on the first try?

Challenges to Computing Checking Costs

In the fraction defined above, the term C is going to be large. Larger than you think.

Measuring P and C with a high degree of precision is probably going to be very hard; possibly unreasonably so, or too expensive12 to bother with in practice. So you will undoubtedly need to work with estimates and proxy metrics. But you have to be aware that this is a problem domain where your normal method of estimating is going to be extremely vulnerable to inherent cognitive bias, and find ways to measure.

Margins, Money, and Metacognition

First let’s discuss cognitive and metacognitive bias.

My favorite cognitive bias is the availability heuristic and a close second is its cousin salience bias. Humans are empirically predisposed towards noticing and remembering things that are more striking, and to overestimate their frequency.

If you are estimating the variables above based on the vibe that you’re getting from the experience of using an LLM, you may be overestimating its utility.

Consider a slot machine.

If you put a dollar into a slot machine, and you lose that dollar, this is an unremarkable event. Expected, even. It doesn’t seem interesting. You can repeat this over and over again, a thousand times, and each time it will seem equally unremarkable. If you do it a thousand times, you will probably get gradually more anxious as your sense of your dwindling bank account becomes slowly more salient, but losing one more dollar still seems unremarkable.

If you put a dollar in a slot machine and it gives you a thousand dollars, that will probably seem pretty cool. Interesting. Memorable. You might tell a story about this happening, but you definitely wouldn’t really remember any particular time you lost one dollar.

Luckily, when you arrive at a casino with slot machines, you probably know well enough to set a hard budget in the form of some amount of physical currency you will have available to you. The odds are against you, you’ll probably lose it all, but any responsible gambler will have an immediate, physical representation of their balance in front of them, so when they have lost it all, they can see that their hands are empty, and can try to resist the “just one more pull” temptation, after hitting that limit.

Now, consider Mallory.

If you put ten minutes into writing a prompt, and Mallory gives a completely off-the-rails, useless answer, and you lose ten minutes, well, that’s just what using a computer is like sometimes. Mallory malfunctioned, or hallucinated, but it does that sometimes, everybody knows that. You only wasted ten minutes. It’s fine. Not a big deal. Let’s try it a few more times. Just ten more minutes. It’ll probably work this time.

If you put ten minutes into writing a prompt, and it completes a task that would have otherwise taken you 4 hours, that feels amazing. Like the computer is magic! An absolute endorphin rush.

Very memorable. When it happens, it feels like P=1.

But... did you have a time budget before you started? Did you have a specified N such that “I will give up on Mallory as soon as I have spent N minutes attempting to solve this problem with it”? When the jackpot finally pays out that 4 hours, did you notice that you had put 6 hours’ worth of 10-minute prompt coins into it?

If you are attempting to use the same sort of heuristic intuition that probably works pretty well for other business leadership decisions, Mallory’s slot-machine chat-prompt user interface is practically designed to subvert those sensibilities. Most business activities do not have nearly such an emotionally variable, intermittent reward schedule. They’re not going to trick you with this sort of cognitive illusion.

Thus far we have been talking about cognitive bias, but there is a metacognitive bias at play too: while Dunning-Kruger, everybody’s favorite metacognitive bias, does have some problems with it, the main underlying metacognitive bias is that we tend to believe our own thoughts and perceptions, and it requires active effort to distance ourselves from them, even if we know they might be wrong.

This means you must assume any intuitive estimate of C is going to be biased low; similarly P is going to be biased high. You will forget the time you spent checking, and you will underestimate the number of times you had to re-check.

To avoid this, you will need to decide on a Ulysses pact to provide some inputs to a calculation for these factors that you will not be able to fudge if they seem wrong to you.

Problematically Plausible Presentation

Another nasty little cognitive-bias landmine for you to watch out for is the authority bias, for two reasons:

  1. People will tend to see Mallory as an unbiased, external authority, and thereby see it as more of an authority than a similarly-situated human13.
  2. Being an LLM, Mallory will be overconfident in its answers14.

The nature of LLM training is also such that commonly co-occurring tokens in the training corpus produce higher likelihood of co-occurring in the output; they’re just going to be closer together in the vector-space of the weights; that’s, like, what training a model is, establishing those relationships.

If you’ve ever used a heuristic to informally evaluate someone’s credibility by listening for industry-specific shibboleths or ways of describing a particular issue, that skill is now useless. Having ingested every industry’s expert literature, commonly-occurring phrases will always be present in Mallory’s output. Mallory will usually sound like an expert, but then make mistakes at random15.

While you might intuitively estimate C by thinking “well, if I asked a person, how could I check that they were correct, and how long would that take?” that estimate will be extremely optimistic, because the heuristic techniques you would use to quickly evaluate incorrect information from other humans will fail with Mallory. You need to go all the way back to primary sources and actually fully verify the output every time, or you will likely fall into one of these traps.

Mallory Mangling Mentorship

So far, I’ve been describing the effect Mallory will have in the context of an individual attempting to get some work done. If we are considering organization-wide adoption of Mallory, however, we must also consider the impact on team dynamics. There are a number of potential side effects one might consider, but here I will focus on just one that I have observed.

I have a cohort of friends in the software industry, most of whom are individual contributors. I’m a programmer who likes programming, so are most of my friends, and we are also (sigh), charitably, pretty solidly middle-aged at this point, so we tend to have a lot of experience.

As such, we are often the folks that the team — or, in my case, the community — goes to when less-experienced folks need answers.

On its own, this is actually pretty great. Answering questions from more junior folks is one of the best parts of a software development job. It’s an opportunity to be helpful, mostly just by knowing a thing we already knew. And it’s an opportunity to help someone else improve their own agency by giving them knowledge that they can use in the future.

However, generative AI throws a bit of a wrench into the mix.

Let’s imagine a scenario where we have 2 developers: Alice, a staff engineer who has a good understanding of the system being built, and Bob, a relatively junior engineer who is still onboarding.

The traditional interaction between Alice and Bob, when Bob has a question, goes like this:

  1. Bob gets confused about something in the system being developed, because Bob’s understanding of the system is incorrect.
  2. Bob formulates a question based on this confusion.
  3. Bob asks Alice that question.
  4. Alice knows the system, so she gives an answer which accurately reflects the state of the system to Bob.
  5. Bob’s understanding of the system improves, and thus he will have fewer and better-informed questions going forward.

You can imagine how repeating this simple 5-step process will eventually transform Bob into a senior developer, and then he can start answering questions on his own. Making sufficient time for regularly iterating this loop is the heart of any good mentorship process.

Now, though, with Mallory in the mix, the process has a new decision point, changing it from a linear sequence to a flow chart.

We begin the same way, with steps 1 and 2. Bob’s confused, Bob formulates a question, but then:

  1. Bob asks Mallory that question.

Here, our path then diverges into a “happy” path, a “meh” path, and a “sad” path.

The “happy” path proceeds like so:

  1. Mallory happens to formulate a correct answer.
  2. Bob’s understanding of the system improves, and thus he will have fewer and better-informed questions going forward.

Great. Problem solved. We just saved some of Alice’s time. But as we learned earlier, Mallory can make mistakes. When that happens, we will need to check important info. So let’s get checking:

  1. Mallory happens to formulate an incorrect answer.
  2. Bob investigates this answer.
  3. Bob realizes that this answer is incorrect because it is inconsistent with some of his prior, correct knowledge of the system, or his investigation.
  4. Bob asks Alice the same question; GOTO traditional interaction step 4.

On this path, Bob spent a while futzing around with Mallory, to no particular benefit. This wastes some of Bob’s time, but then again, Bob could have ended up on the happy path, so perhaps it was worth the risk; at least Bob wasn’t wasting any of Alice’s much more valuable time in the process.16

Notice that beginning at step 4, we must allocate all of Bob’s time to C, so C already starts getting a bit bigger than if it were just Bob checking Mallory’s output specifically on tasks that Bob is doing.

That brings us to the “sad” path.

  1. Mallory happens to formulate an incorrect answer.
  2. Bob investigates this answer.
  3. Bob does not realize that this answer is incorrect because he is unable to recognize any inconsistencies with his existing, incomplete knowledge of the system.
  4. Bob integrates Mallory’s incorrect information of the system into his mental model.
  5. Bob proceeds to make a larger and larger mess of his work, based on an incorrect mental model.
  6. Eventually, Bob asks Alice a new, worse question, based on this incorrect understanding.
  7. Sadly we cannot return to the happy path at this point, because now Alice must unravel the complex series of confusing misunderstandings that Mallory has unfortunately conveyed to Bob. In the really sad case, Bob doesn’t even believe Alice for a while, because Mallory seems unbiased17, and Alice has to waste even more time convincing Bob before she can even begin to explain the correct answer to him.

Now, we have wasted some of Bob’s time, and some of Alice’s time. Everything from step 5-10 is C, and as soon as Alice gets involved, we are now adding to C at double real-time. If more team members are pulled in to the investigation, you are now multiplying C by the number of investigators, potentially running at triple or quadruple real time.

But That’s Not All

Here I’ve presented a brief selection of reasons why C will be both large, and larger than you expect. To review:

  1. Gambling-style mechanics of the user interface will interfere with your own self-monitoring and developing a good estimate.
  2. You can’t use human heuristics for quickly spotting bad answers.
  3. Wrong answers given to junior people who can’t evaluate them will waste more time from your more senior employees.

But this is a small selection of ways that Mallory’s output can cost you money and time. It’s harder to simplistically model second-order effects like this, but there’s also a broad range of possibilities for ways that, rather than simply checking and catching errors, an error slips through and starts doing damage. Or ways in which the output isn’t exactly wrong, but still sub-optimal in ways which can be difficult to notice in the short term.

For example, you might successfully vibe-code your way to launching a series of applications, successfully “checking” the output along the way, but then discover that the resulting code is unmaintainable garbage that prevents future feature delivery, and needs to be re-written18. And this kind of intellectual debt isn’t specific to the technical debt you incur while coding; it can affect even such apparently genAI-amenable fields as LinkedIn content marketing19.

Problems with the Prediction of P

C isn’t the only challenging term, though. P is just as important, if not more so, and just as hard to measure.

LLM marketing materials love to phrase their accuracy in terms of a percentage. Accuracy claims for LLMs in general tend to hover around 70%20. But these scores vary per field, and when you aggregate them across multiple topic areas, they start to trend down. This is exactly why “agentic” approaches for more immediately-verifiable LLM outputs (with checks like “did the code work”) got popular in the first place: you need to try more than once.
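A minimal sketch of that retry loop, the thing agentic tools wrap up for you, might look like this; generate() and passes_check() are hypothetical stand-ins for a model call and an automated verification step, not real APIs:

# Hypothetical sketch of "try more than once, with an automatic check".
def solve_with_retries(prompt, generate, passes_check, max_attempts=5):
    for attempt in range(1, max_attempts + 1):
        candidate = generate(prompt)
        if passes_check(candidate):    # e.g. "did the code run?"
            return candidate, attempt  # attempt feeds directly into P = 1/attempts
    return None, max_attempts          # gave up; a human takes over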

Independently measured claims about accuracy tend to be quite a bit lower21. The field of AI benchmarks is exploding, but it probably goes without saying that LLM vendors game those benchmarks22, because of course every incentive would encourage them to do that. Regardless of what their arbitrary scoring on some benchmark might say, all that matters to your business is whether it is accurate for the problems you are solving, for the way that you use it. Which is not necessarily going to correspond to any benchmark. You will need to measure it for yourself.

With that goal in mind, our formulation of P must be a somewhat harsher standard than “accuracy”. It’s not merely “was the factual information contained in any generated output accurate”, but, “is the output good enough that some given real knowledge-work task is done and the human does not need to issue another prompt”?

Surprisingly Small Space for Slip-Ups

The problem with reporting these things as percentages at all, however, is that our actual definition for P is 1/attempts, where attempts, for any given task at least, must be an integer greater than or equal to 1.

Taken in aggregate, if we succeed on the first prompt more often than not, we could end up with a P > 1/2, but combined with the previous observation that you almost always have to prompt it more than once, the practical reality is that P will start at 50% and go down from there.

If we plug in some numbers, trying to be as extremely optimistic as we can, and say that we have a uniform stream of tasks, every one of which can be addressed by Mallory, every one of which:

  • would take a human 45 minutes to do unaided (H=45),
  • takes 1 minute of prompting (I=1),
  • takes 5 minutes of waiting on Mallory’s output (W=5),
  • takes only 5 minutes to check (C=5),
  • is done by the second prompt (P=1/2),
  • and incurs a negligible expense per task (E=0.01).

Thought experiments are a dicey basis for reasoning in the face of disagreements, so I have tried to formulate something here that is absolutely, comically, over-the-top stacked in favor of the AI optimist.

Would that be profitable? It sure seems like it, given that we are trading off 45 minutes of human time for 1 minute of Mallory-time and 10 minutes of human time. If we ask Python:

>>> def FF(H, I, C, P, W, E):
...     return (W + I + C + E) / (P * H)
...
>>> FF(H=45.0, I=1.0, C=5.0, P=1/2, W=5.0, E=0.01)
0.48933333333333334

We get a futzing fraction of about 0.4893. Not bad! Sounds like, at least under these conditions, it would indeed be cost-effective to deploy Mallory. But… realistically, do you reliably get useful, done-with-the-task quality output on the second prompt? Let’s bump up the denominator on P just a little bit there, and see how we fare:

>>> FF(H=45.0, I=1.0, C=5.0, P=1/3, W=5.0, E=0.01)
0.734

Oof. Still cost-effective at 0.734, but not quite as good. Where do we cap out, exactly?

>>> from itertools import count
>>> for A in count(start=4):
...     print(A, result := FF(H=45.0, I=1.0, C=5.0, P=1 / A, W=5.0, E=1/60.))
...     if result > 1:
...         break
...
4 0.9792592592592594
5 1.224074074074074

With this little test, we can see that at our next iteration we are already at 0.9792, and by 5 prompts per task, even in this absolute fever-dream of an over-optimistic scenario, with a futzing fraction of 1.2240, Mallory is now a net detriment to our bottom line.
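We can also locate the break-even point directly rather than by iterating: FF hits 1 exactly when P × H equals W + I + C + E, that is, when the number of attempts reaches H / (W + I + C + E). A quick check with the same optimistic numbers confirms what the loop just found:

>>> round(45.0 / (5.0 + 1.0 + 5.0 + 1/60.), 2)
4.08

Anything past roughly four attempts per task tips FF over 1, which is why 4 still (barely) squeaks under and 5 does not.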

Harm to the Humans

We are treating H as functionally constant so far, an average around some hypothetical Gaussian distribution, but the distribution itself can also change over time.

Formally speaking, an increase to H would be good for our fraction. Maybe it would even be a good thing; it could mean we’re taking on harder and harder tasks due to the superpowers that Mallory has given us.

But an observed increase to H would probably not be good. An increase could also mean your humans are getting worse at solving problems, because using Mallory has atrophied their skills23 and sabotaged learning opportunities2425. It could also go up because your senior, experienced people now hate their jobs26.

For some more vulnerable folks, Mallory might just take a shortcut to all these complex interactions and drive them completely insane27 directly. Employees experiencing an intense psychotic episode are famously less productive than those who are not.

This could all be very bad if our futzing fraction eventually does head north of 1 and you need to reintroduce human-only workflows, without Mallory.

Abridging the Artificial Arithmetic (Alliteratively)

To reiterate, I have proposed this fraction:

FF = (W + I + C + E) / (P × H)

which shows us positive ROI when FF is less than 1, and negative ROI when it is more than 1.

This model is heavily simplified. A comprehensive measurement program that tests the efficacy of any technology, let alone one as complex and rapidly changing as LLMs, is more complex than could be captured in a single blog post.

Real-world work might be insufficiently uniform to fit into a closed-form solution like this. Perhaps an iterated simulation with variables based on the range of values seen from your team’s metrics would give better results.
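For example, a minimal Monte Carlo sketch might look like the following. Every distribution and constant here is invented purely for illustration; the point is only the shape of the calculation, and you would substitute ranges measured from your own team’s metrics:

# Minimal Monte Carlo sketch; all distributions below are illustrative only.
import random

def simulate_task(rng: random.Random) -> float:
    H = rng.uniform(30, 120)        # minutes the task would take a human alone
    I = rng.uniform(1, 10)          # minutes spent writing prompts
    W = rng.uniform(1, 10)          # minutes spent waiting on Mallory
    C = rng.uniform(5, 60)          # minutes spent checking the output
    E = 0.02                        # per-task expense, in minute-equivalents
    attempts = rng.randint(1, 8)    # prompts needed before the task is done
    P = 1 / attempts
    return (W + I + C + E) / (P * H)   # this task's futzing fraction

rng = random.Random(0)
fractions = [simulate_task(rng) for _ in range(10_000)]
print(sum(f < 1 for f in fractions) / len(fractions))  # share of tasks where Mallory paid off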

However, in this post, I want to illustrate that if you are going to try to evaluate an LLM-based tool, you need to at least include some representation of each of these terms somewhere. They are all fundamental to the way the technology works, and if you’re not measuring them somehow, then you are flying blind into the genAI storm.

I also hope to show that a lot of existing assumptions about how benefits might be demonstrated, for example with user surveys about general impressions, or by evaluating artificial benchmark scores, are deeply flawed.

Even making what I consider to be wildly, unrealistically optimistic assumptions about these measurements, I hope I’ve shown:

  1. in the numerator, C might be a lot higher than you expect,
  2. in the denominator, P might be a lot lower than you expect,
  3. repeated use of an LLM might make H go up, but despite the fact that it's in the denominator, that will ultimately be quite bad for your business.

Personally, I don’t have all that many concerns about E and I. E is still seeing significant loss-leader pricing, and I might not be coming down as fast as vendors would like us to believe, but if the other numbers work out, I don’t think they make a huge difference. However, there might still be surprises lurking in there, and if you want to rationally evaluate the effectiveness of a model, you need to be able to measure them and incorporate them as well.

In particular, I really want to stress the importance of the influence of LLMs on your team dynamic, as that can cause massive, hidden increases to C. LLMs present opportunities for junior employees to generate an endless stream of chaff that will simultaneously:

If you’ve already deployed LLM tooling without measuring these things and without updating your performance management processes to account for the strange distortions that these tools make possible, your Futzing Fraction may be much, much greater than 1, creating hidden costs and technical debt that your organization will not notice until a lot of damage has already been done.

If you got all the way here, particularly if you’re someone who is enthusiastic about these technologies, thank you for reading. I appreciate your attention and I am hopeful that if we can start paying attention to these details, perhaps we can all stop futzing around so much with this stuff and get back to doing real work.

Acknowledgments

Thank you to my patrons who are supporting my writing on this blog. If you like what you’ve read here and you’d like to read more of it, or you’d like to support my various open-source endeavors, you can support my work as a sponsor!


  1. I do not share this optimism, but I want to try very hard in this particular piece to take it as a given that genAI is in fact helpful. 

  2. If we could have a better prompt on demand via some repeatable and automatable process, surely we would have used a prompt that got the answer we wanted in the first place. 

  3. The software idea of a “user agent” straightforwardly comes from the legal principle of an agent, which has deep roots in common law, jurisprudence, philosophy, and math. When we think of an agent (some software) acting on behalf of a principal (a human user), this historical baggage imputes some important ethical obligations to the developer of the agent software. genAI vendors have been as eager as any software vendor to dodge responsibility for faithfully representing the user’s interests even as there are some indications that at least some courts are not persuaded by this dodge, at least by the consumers of genAI attempting to pass on the responsibility all the way to end users. Perhaps it goes without saying, but I’ll say it anyway: I don’t like this newer interpretation of “agent”. 

  4. “Vending-Bench: A Benchmark for Long-Term Coherence of Autonomous Agents”, Axel Backlund, Lukas Petersson, Feb 20, 2025 

  5. “random thing are happening, maxed out usage on api keys”, @leojr94 on Twitter, Mar 17, 2025 

  6. “New study sheds light on ChatGPT’s alarming interactions with teens” 

  7. “Lawyers submitted bogus case law created by ChatGPT. A judge fined them $5,000”, by Larry Neumeister for the Associated Press, June 22, 2023 

  8. During which a human will be busy-waiting on an answer. 

  9. Given the fluctuating pricing of these products, and fixed subscription overhead, this will obviously need to be amortized; including all the additional terms to actually convert this from your inputs is left as an exercise for the reader. 

  10. I feel like I should emphasize explicitly here that everything is an average over repeated interactions. For example, you might observe that a particular LLM has a low probability of outputting acceptable work on the first prompt, but higher probability on subsequent prompts in the same context, such that it usually takes 4 prompts. For the purposes of this extremely simple closed-form model, we’d still consider that a P of 25%, even though a more sophisticated model, or a monte carlo simulation that sets progressive bounds on the probability, might produce more accurate values. 

  11. No it isn’t, actually, but for the sake of argument let’s grant that it is. 

  12. It’s worth noting that all this expensive measuring itself must be included in C until you have a solid grounding for all your metrics, but let’s optimistically leave all of that out for the sake of simplicity. 

  13. “AI Company Poll Finds 45% of Workers Trust the Tech More Than Their Peers”, by Suzanne Blake for Newsweek, Aug 13, 2025 

  14. AI Chatbots Remain Overconfident — Even When They’re Wrong by Jason Bittel for the Dietrich College of Humanities and Social Sciences at Carnegie Mellon University, July 22, 2025 

  15. AI Mistakes Are Very Different From Human Mistakes by Bruce Schneier and Nathan E. Sanders for IEEE Spectrum, Jan 13, 2025 

  16. Foreshadowing is a narrative device in which a storyteller gives an advance hint of an upcoming event later in the story. 

  17. “People are worried about the misuse of AI, but they trust it more than humans” 

  18. “Why I stopped using AI (as a Senior Software Engineer)”, theSeniorDev YouTube channel, Jun 17, 2025 

  19. “I was an AI evangelist. Now I’m an AI vegan. Here’s why.”, Joe McKay for the greatchatlinkedin YouTube channel, Aug 8, 2025 

  20. “What LLM is The Most Accurate?” 

  21. “Study Finds That 52 Percent Of ChatGPT Answers to Programming Questions are Wrong”, by Sharon Adarlo for Futurism, May 23, 2024 

  22. “Off the Mark: The Pitfalls of Metrics Gaming in AI Progress Races”, by Tabrez Syed on BoxCars AI, Dec 14, 2023 

  23. “I tried coding with AI, I became lazy and stupid”, by Thomasorus, Aug 8, 2025 

  24. “How AI Changes Student Thinking: The Hidden Cognitive Risks” by Timothy Cook for Psychology Today, May 10, 2025 

  25. “Increased AI use linked to eroding critical thinking skills” by Justin Jackson for Phys.org, Jan 13, 2025 

  26. “AI could end my job — Just not the way I expected” by Manuel Artero Anguita on dev.to, Jan 27, 2025 

  27. “The Emerging Problem of “AI Psychosis”” by Gary Drevitch for Psychology Today, July 21, 2025. 

August 15, 2025 07:51 AM UTC


Seth Michael Larson

Nintendo Switch Online “GameCube Classics” is great for digital and physical GameCube players

Nintendo just announced that the GameCube game “Chibi Robo!” would be coming in exactly one week to Nintendo Switch Online + Expansion Pack (NSO+). Chibi Robo is known for being a hidden gem in the GameCube library, and thus physical copies are quite expensive to obtain. This made me think about what other games Nintendo had announced for NSO+ “GameCube Classics”, and I noticed an interesting trend...

5 of the 7 most expensive first-party GameCube titles are either planned or already available on NSO+. The titles that haven't been mentioned yet are “Legend of Zelda: Twilight Princess” and “Cubivore”. Take a look at the table of all 53 first-party GameCube titles below, green (✅) means the game is already available on NSO+ and blue (🔄) means the game is announced but not yet available:

NSO+ | Game | Price
     | Cubivore | $315
🔄 | Pokémon XD: Gale of Darkness | $140
🔄 | Fire Emblem: Path of Radiance | $130
🔄 | Chibi-Robo! | $125
     | The Legend of Zelda: Twilight Princess | $120
🔄 | Pokémon Colosseum | $105
✅ | F-Zero GX | $65
     | Eternal Darkness | $65
     | The Legend of Zelda: Four Swords Adventures | $60
     | Mario Party 6 | $60
     | Kirby Air Ride | $60
     | Wario World | $55
     | Mario Superstar Baseball | $55
     | Mario Kart: Double Dash!! | $55
✅ | The Legend of Zelda: The Wind Waker | $50
✅ | Super Mario Strikers | $50
     | Super Smash Bros. Melee | $45
     | Mario Party 5 | $45
     | Mario Party 4 | $45
     | Custom Robo | $45
     | Animal Crossing | $45
     | WarioWare, Inc.: Mega Party Game$ | $40
🔄 | Luigi's Mansion | $40
     | Baten Kaitos Origins | $40
     | Star Fox Assault | $35
     | Pikmin 2 | $35
     | Paper Mario: The Thousand-Year Door | $35
     | Metroid Prime 2: Echoes | $35
     | Mario Party 7 | $35
     | Kururin Squash (JP) | $35
🔄 | Super Mario Sunshine | $30
     | Pokémon Channel | $30
     | Pikmin | $30
     | Nintendo Puzzle Collection (JP) | $30
     | Giftpia (JP) | $30
     | Star Fox Adventures | $25
     | Mario Power Tennis | $25
     | Geist | $25
     | Doshin the Giant (JP+PAL) | $25
✅ | Soul Caliber II | $20
     | Metroid Prime | $20
     | Donkey Kong: Jungle Beat | $20
     | Odama | $15
     | Mario Golf: Toadstool Tour | $15
     | Magical Mirror Starring Mickey Mouse | $15
     | Final Fantasy: Crystal Chronicles | $15
     | Donkey Konga 2 | $15
     | Donkey Konga | $15
     | Dance Dance Revolution: Mario Mix | $15
     | Battalion Wars | $15
     | 1080° Avalanche | $15
     | Wave Race: Blue Storm | $10
     | NBA Courtside 2002 | $10

Totals:
✅ Total Price of “Available” | $185
✅🔄 Total Price of “Planned & Available” | $755

As much as I am a lover of physical media, the reality is that collecting physical copies of some GameCube titles is prohibitively expensive. I was never going to shell out over $125 to play Chibi Robo. $50 for a year of NSO+ means that I and many others can experience new games in the GameCube library for the first time.

Rare games being made available digitally also tends to reduce the price of physical games. If a game hasn't been released in any format except the original, anyone who wants to play the game legally has to buy a physical copy. If that same game is also available digitally, then the physical game doesn't have nearly as much demand.

We saw this occur just last year with “Paper Mario: The Thousand Year Door” which saw a ~60% price drop after the Switch remake was published, going from $90 in April 2023 to $35 today. This is a boon for collectors and players that are interested in the physical media.

I'll try to keep this table updated as more games are made available and announced. We'll see where the prices of top GameCube titles go from here. If you're interested, here is my script for generating the above table.



Thanks for keeping RSS alive! ♥

August 15, 2025 12:00 AM UTC

August 14, 2025


Python Insider

Python 3.14.0rc2 and 3.13.7 are go!

Not one but two expedited releases! 🎉 🎉

Python 3.14.0rc2

It’s the final 🪄 penultimate 🪄 3.14 release candidate!

https://www.python.org/downloads/release/python-3140rc2/

Note: rc2 was originally planned for 2025-08-26, but we fixed a bug that required bumping the magic number stored in Python bytecode (.pyc) files. This means .pyc files created for rc1 cannot be used for rc2, and they’ll be recompiled.

The ABI isn’t changing. Wheels built for rc1 should be fine for rc2, rc3 and 3.14.x. So this shouldn’t affect too many people but let’s get this out for testing sooner.

Due to this early release, we’ll also add a third release candidate between now and the final 3.14.0 release, with no planned change to the final release date.

This release, 3.14.0rc2, is the penultimate release preview. Entering the release candidate phase, only reviewed code changes which are clear bug fixes are allowed between this release candidate and the final release.

The next pre-release of Python 3.14 will be the final release candidate, 3.14.0rc3, scheduled for 2025-09-16; the official release of 3.14.0 is scheduled for Tuesday, 2025-10-07.

There will be no ABI changes from this point forward in the 3.14 series, and the goal is that there will be as few code changes as possible.

Call to action

We strongly encourage maintainers of third-party Python projects to prepare their projects for 3.14 during this phase, and publish Python 3.14 wheels on PyPI to be ready for the final release of 3.14.0, and to help other projects do their own testing. Any binary wheels built against Python 3.14.0 release candidates will work with future versions of Python 3.14. As always, report any issues to the Python bug tracker.

Please keep in mind that this is a preview release and while it’s as close to the final release as we can get it, its use is not recommended for production environments.

Core developers: time to work on documentation now

Major new features of the 3.14 series, compared to 3.13

Some of the major new features and changes in Python 3.14 are:

New features

(Hey, fellow core developer, if a feature you find important is missing from this list, let Hugo know.)

For more details on the changes to Python 3.14, see What’s new in Python 3.14.

Build changes

Note that Android binaries are new in rc2!

Incompatible changes, removals and new deprecations

Python install manager

The installer we offer for Windows is being replaced by our new install manager, which can be installed from the Windows Store or from its download page. See our documentation for more information. The JSON file available for download below contains the list of all the installable packages available as part of this release, including file URLs and hashes, but is not required to install the latest release. The traditional installer will remain available throughout the 3.14 and 3.15 releases.

Python 3.13.7

This is the seventh maintenance release of Python 3.13.

https://www.python.org/downloads/release/python-3137/

Python 3.13 is the newest major release of the Python programming language, and it contains many new features and optimizations compared to Python 3.12. 3.13.7 is the seventh maintenance release of 3.13.

3.13.7 is an expedited release to fix a significant issue with the 3.13.6 release:

A few other bug fixes (which would otherwise have waited until the next release) are also included.

More resources

And now for something completely different

The magpie, Pica pica in Latin, is a black and white bird in the crow family, known for its chattering call.

The first-known use in English is from a 1589 poem, where magpie is spelled “magpy” and cuckoo is “cookow”:

Th[e]y fly to wood like breeding hauke,
  And leave old neighbours loue,
They pearch themselves in syluane lodge,
  And soare in th' aire aboue.
There : magpy teacheth them to chat,
  And cookow soone doth hit them pat.

The name comes from Mag, short for Margery or Margaret (compare robin redbreast, jenny wren, and its corvid relative jackdaw); and pie, a magpie or other bird with black and white (or pied) plumage. The sea-pie (1552) is the oystercatcher, and the grey pie (1678) and murdering pie (1688) are the great grey shrike. Other birds include the yellow and black pie, red-billed pie, wandering tree-pie, and river pie. The rain-pie, wood-pie and French pie are woodpeckers.

Pie on its own dates to before 1225, and comes from the Latin name for the bird, pica.

Enjoy the new releases

Thanks to all of the many volunteers who help make Python Development and these releases possible! Please consider supporting our efforts by volunteering yourself or through organisation contributions to the Python Software Foundation.

Regards from a busy Helsinki on Night of the Arts,

Your release team,
Hugo van Kemenade
Thomas Wouters
Ned Deily
Steve Dower
Łukasz Langa

August 14, 2025 03:37 PM UTC


Python Software Foundation

Announcing the PSF Board Candidates for 2025!

What an exciting list! Please take a look at who is running for the PSF Board this year on the 2025 Nominees page. This year there are 4 seats open on the PSF Board. You can see who is currently on the board on the PSF Officers & Directors page. (Dawn Wages, Jannis Leidel, Kushal Das, and Simon Willison are at the end of their current terms.) 

Board Election Timeline:

Not sure what UTC is for you locally? Check this time converter

Reminder to affirm your intention to vote!

If you wish to vote in this year’s election, you must affirm your intention to vote no later than Tuesday, August 26th, 2:00 pm UTC, to participate in this year’s election. This year’s Board Election vote begins Tuesday, September 2nd, 2:00 pm UTC, and closes on Tuesday, September 16th, 2:00 pm UTC. 


Every PSF Voting Member (Supporting, Contributing, and Fellow) needs to affirm their membership to vote in this year’s election. You should have received an email from "psf@psfmember.org <Python Software Foundation>" with the subject "[Action Required] Affirm your PSF Membership voting intention for 2025 PSF Board Election" that contains information on how to affirm your voting status. 

Find more information, including step-by-step instructions on voting affirmation, in our “Affirm Your PSF Membership Voting Status” blog post. If you run into any issues, please email psf-elections@pyfound.org.

Voting: what to expect

If you are a voting member of the PSF that affirmed your intention to participate in this year’s election, you will receive an email from “OpaVote Voting Link <noreply@opavote.com>” with your ballot, the subject line will read “Python Software Foundation Board of Directors Election 2025” on September 2nd. If you don’t receive a ballot as expected, please first check your spam folder for a message from “noreply@opavote.com”. If you don’t see anything get in touch by emailing psf-elections@pyfound.org so we can look into your account and make sure we have the most up-to-date email for you.


If you have questions about your membership status or the election, please email psf-elections@pyfound.org. You are welcome to join the discussion about the 2025 PSF Board election on the Python Discuss forum.

August 14, 2025 12:19 PM UTC

August 13, 2025


Ari Lamstein

🚀 Join Me Tonight for a Hands-On Streamlit Workshop!

I’m excited to announce that I’ll be running a live workshop tonight on Streamlit—the Python framework that makes it easy to build interactive web apps for data projects.

This session is designed for those with an interest in data apps and no prior experience with Streamlit. You will walk through the fundamentals with a practical, beginner-friendly tutorial. We’ll cover:

All participants will leave with a finished app deployed to the internet, which you can share with friends.

The workshop is hosted by SF Python and kicks off tonight. You can RSVP and find all the event details here.

To follow along or preview the content, check out the workshop’s GitHub repo. This repo includes the full codebase, setup instructions, and exercises we’ll be working through. Feel free to fork and clone it ahead of time and come with questions!

Looking forward to seeing you there and building something great together.

August 13, 2025 04:37 PM UTC


Django Weblog

Building better APIs: from Django to client libraries with OpenAPI

tl;dr

A summary of resources and learnings related to building REST APIs that I put together over the last couple of years: a complete API development workflow from Django backend to frontend clients using Django REST Framework, drf-spectacular for OpenAPI spec generation, and automated client generation with openapi-generator. Big productivity boost!

There is a lot of discussion about frameworks for building REST APIs, some of them being even able to generate OpenAPI specs directly for you. Django is not quite known for that, but there are ways of doing this by automating most of the process while being very productive and offering your team a clean developer experience.

Overview

The stack I prefer makes use of several additional modules you will require: django-rest-framework and drf-spectacular alongside Django. REST Framework helps you extend your application to expose a REST API, while drf-spectacular gives you the ability to generate the OpenAPI spec (standalone post: Create OpenAPI spec for Django REST Framework APIs).
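If you have not used drf-spectacular before, the wiring is small. Roughly, following the library's standard quickstart (adapt module paths and URL names to your own project):

# settings.py (sketch)
INSTALLED_APPS = [
    # ...
    "rest_framework",
    "drf_spectacular",
]

REST_FRAMEWORK = {
    "DEFAULT_SCHEMA_CLASS": "drf_spectacular.openapi.AutoSchema",
}

# urls.py (sketch)
from django.urls import path
from drf_spectacular.views import SpectacularAPIView, SpectacularSwaggerView

urlpatterns = [
    # raw OpenAPI schema, which openapi-generator will consume later
    path("api/schema/", SpectacularAPIView.as_view(), name="schema"),
    # interactive Swagger UI built from that schema
    path("api/docs/", SpectacularSwaggerView.as_view(url_name="schema"), name="docs"),
]

You can also export the schema to a file with the spectacular management command (python manage.py spectacular --file openapi.yaml), which is the artifact the client-generation step consumes.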

After having the OpenAPI spec, you can generate clients with openapi-generator. Here is an example I mapped out of generating an Angular client:

Diagram of the flow from OpenAPI, to code via openapi-generator, then npm publish, live on npm, then npm install api-client, then Angular

Step-by-step process

There is also a recording from my GLT 2025 talk where I summarize most of these ideas.

Building Better APIs - From Django to Client Libraries with OpenAPI, by Harald Nezbeda

In case you want to follow along, here is a step-by-step guide from the repository I showed during the presentation:

From the last step, you can generate the API clients for the platform you require. You can follow the README and the examples available in my glt25-client repository.

Maintaining compatibility over time

The final tool you can use is openapi-diff, which will help you keep your documentation compatible. This is very important once your REST API is used in production:

Example of a compatible change: glt25-demo v1 to v2

docker run --rm -t openapitools/openapi-diff:latest https://github.com/nezhar/glt25-demo/releases/download/v1/openapi.yaml https://github.com/nezhar/glt25-demo/releases/download/v2/openapi.yaml

Example of a breaking change: glt25-demo v2 to v3

docker run --rm -t openapitools/openapi-diff:latest https://github.com/nezhar/glt25-demo/releases/download/v2/openapi.yaml https://github.com/nezhar/glt25-demo/releases/download/v3/openapi.yaml

Automating the maintenance

The process can be automated even further using GitHub Actions and Dependabot. Here is what the steps look like with this full continuous delivery setup:

GitHub Release Published, then Generate OpenAPI Schema, then Upload Schema to Release, then Build Client, then Publish to npm

Takeaways

Building a complete API development workflow from Django to client libraries using OpenAPI creates a powerful and maintainable development experience. By combining Django REST Framework with drf-spectacular for automatic OpenAPI spec generation and openapi-generator for client creation, you can eliminate manual API documentation and reduce integration errors.

If you want to go even further, you can automate the integration of error codes inside the OpenAPI spec. This way you can better support languages that are even more strict when consuming the REST API!


Thank you to Harald Nezbeda for proposing this guest post on the Django blog!

August 13, 2025 02:03 PM UTC


Real Python

Python's with Statement: Manage External Resources Safely

Python’s with statement allows you to manage external resources safely by using objects that support the context manager protocol. These objects automatically handle the setup and cleanup phases of common operations.

By using the with statement alongside appropriate context managers, you can focus on your core logic while the context managers prevent resource leaks like unclosed files, unreleased memory, or dangling network connections.

By the end of this tutorial, you’ll understand that:

  • Python’s with statement automates the process of setting up and tearing down computational resources using context managers.
  • Using with reduces code complexity and prevents resource leaks by ensuring proper resource release, even if exceptions occur.
  • A context manager in Python is an object that implements .__enter__() and .__exit__() methods to manage resources safely.

Get ready to learn how Python’s with statement and context managers streamline the setup and teardown phases of resource management so you can write safer, more reliable code.

Get Your Code: Click here to download the free sample code that shows you how to use Python’s with statement to manage external resources safely.

Take the Quiz: Test your knowledge with our interactive “Context Managers and Python's with Statement” quiz. You’ll receive a score upon completion to help you track your learning progress:


Interactive Quiz

Context Managers and Python's with Statement

Test your knowledge of Python's with statement and context managers to write cleaner code and manage resources safely and efficiently.

Managing External Resources in Python

Properly managing external resources, such as files, locks, and network connections, is a common requirement in programming. Sometimes, a program uses a given resource and doesn’t release the associated memory when it no longer needs the resource. This kind of issue is called a memory leak because the available memory shrinks every time you create a new instance of a resource without releasing the unneeded ones.

Managing resources properly is often a tricky task. It requires setup and teardown phases. The latter phase requires you to perform cleanup actions, like closing a file, releasing a lock, or closing a network connection. If you forget to perform these cleanup actions, then your application keeps the resource occupied. This behavior might compromise valuable system resources, such as memory and network bandwidth.

For example, say that a program that uses databases keeps creating new connections without releasing the old ones or reusing them. In that case, the database back end can stop accepting new connections. This might require an administrator to log in and manually terminate those stale connections to make the database usable again.

Another common issue occurs when developers work with files. Writing text to files is usually a buffered operation. This means that calling .write() on a file won’t immediately result in writing text to the physical file, but to a temporary buffer. Sometimes, when the buffer isn’t full, developers forget to call .close() and part of the data can be lost.

Another possibility is that your application runs into errors or exceptions that cause the control flow to bypass the code responsible for releasing the resource at hand. Here’s an example where you use the built-in open() function to write some text to a file:

Python
file = open("hello.txt", "w")
file.write("Hello, World!")
file.close()

This code doesn’t guarantee the file will be closed if an exception occurs during the call to .write(). In this situation, the code might never call .close(), and your program will leak a file descriptor. Failing to release a file descriptor on some operating systems can prevent other programs from accessing the underlying file.

Note: To learn more about closing files, check out the Why Is It Important to Close Files in Python? tutorial.

In Python, you can use a couple of general approaches to deal with resource management. You can wrap your code in:

  1. A try … finally construct
  2. A with construct

The first approach is quite generic and allows you to provide setup and teardown code to manage any kind of resource. However, it’s a little bit verbose, and you might forget some cleanup actions if you use this construct in several places.

The second approach provides a straightforward way to provide and reuse setup and teardown code. In this case, you’ll have the limitation that the with statement only works with context managers. In the next two sections, you’ll learn how to use both approaches in your code.

The try … finally Construct

Working with files is probably the most common example of resource management in programming. In Python, you can use a try … finally construct to handle opening and closing files properly:

Python
file = open("hello.txt", "w")

try:
    file.write("Hello, World!")
finally:
    file.close()

In this example, you open the hello.txt file using open(). To write some text into the file, you wrap the call to .write() in a try statement with a finally clause. This clause guarantees that the file is properly closed by calling .close(), even if an exception occurs during the call to .write() in the try clause. Remember that the finally clause always runs.

When managing external resources in Python, you can use the construct in the previous example to handle setup and teardown logic. The setup logic might include opening the file and writing content to it, while the teardown logic might consist of closing the file to release the acquired resources.
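For a preview of where this tutorial is heading, here’s a minimal sketch of the same example rewritten with the with statement:

Python
with open("hello.txt", "w") as file:
    file.write("Hello, World!")

When the with block ends, Python calls the file object’s .__exit__() method, which closes the file for you, even if .write() raises an exception.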

Read the full article at https://realpython.com/python-with-statement/ »


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

August 13, 2025 02:00 PM UTC


PyCharm

Lightning-Fast Python: Mastering the uv Package Manager

Check out our recent livestream from August 7 on uv, a next-generation Python package manager revolutionizing Python development with speeds 10-100x faster than pip. Released just last year and written in Rust, uv addresses long-standing Python dependency management challenges with a single, lightning-fast tool that just works.


Our speakers:

* Michael Kennedy, host of Talk Python and Python Bytes podcasts, founder of Talk Python Training

* William Vincent, PyCharm Developer Advocate, Django Chat podcast host

Join us as we demonstrate uv’s game-changing capabilities live, explore practical migration strategies, and discuss advanced features. We’ll show you why Python developers are rapidly adopting this breakthrough tool as their go-to package manager.

August 13, 2025 01:13 PM UTC