
Planet Python

Last update: June 19, 2025 07:41 PM UTC

June 19, 2025


PyCharm

Training Your ML Models With Cadence

In the rapidly evolving domains of machine learning (ML) and artificial intelligence (AI), the tools and technologies used by developers can significantly influence the speed, efficiency, and effectiveness of their projects. Recognizing this, we introduced Cadence in PyCharm 2025.1, a plugin that merges the ease of local development with advanced cloud computing capabilities.

Why Cadence?

Cadence makes it possible to run your code on powerful cloud hardware directly from PyCharm. This integration alleviates the typical complexities and extensive setup usually associated with cloud computing. Whether you’re a solo developer experimenting with new models or part of a larger team pushing the boundaries of ML applications, Cadence ensures that your transition to powerful cloud resources is seamless and straightforward.

Serverless computing on demand

Reduce overhead with Cadence’s serverless computing options, allowing you to access and manage GPUs with transparent and predictable per-second billing. This removes the need for significant upfront investments in hardware, making advanced computing power accessible at any scale.

Run your code as is

With Cadence, your existing PyCharm projects require no modifications to fit into the cloud environment. Upload and execute your code as usual; Cadence handles all of the adjustments on the back end, ensuring your cloud session feels like an extension of your local setup.

Tailored for PyCharm users

Debug and deploy using the PyCharm interface you’re familiar with. Set breakpoints, monitor outputs, and interact with your remote environment with no additional learning curve.

Data management simplified

Say goodbye to manual data transfers. Cadence automatically synchronizes your projects’ data to the cloud, allowing you to download the results of each experiment directly in the IDE.

Reliable experimentation

Review, refine, and rerun your past experiments. Cadence provides consistent replication of results, facilitating continuous improvements.

Optimized resource allocation

Choose from a wide array of cloud settings, including configurations like 8xA100 and 8xH100, to scale your resources according to project demands. Schedule as many tasks as you need simultaneously, and Cadence will automatically check for available hosts in different regions and zones.

Ready for teams

Adopting Cadence isn’t just about improving individual productivity; it’s about enhancing team dynamics and output. Share setup configurations, results, and insights effortlessly within your team. 

Getting started with Cadence

You can try Cadence for free with a USD 30 welcome credit by installing the plugin from JetBrains Marketplace or by enabling it directly in PyCharm via Settings | Plugins | Marketplace.

To see how easy it is to start training your ML models in PyCharm, check out this tutorial video.

June 19, 2025 12:17 PM UTC

June 18, 2025


Talk Python Blog

New Theme Song: Served In A Flask

Those of you who were early listeners of Talk Python To Me might remember the amazing theme song we launched with: Developers, Developers, Developers by Smixx. Thanks to Smixx for letting us use his music for our intros.

Over the years, people have asked, “What happened to the rap song?” I took it down for a couple of reasons not worth digging into, but I have definitely missed the fun and irreverent intro to the show.

June 18, 2025 06:55 PM UTC


Real Python

Python Project: Build a Word Count Command-Line App

The word count command (wc) is a classic utility that you might use to determine the number of lines, words, and bytes in files or standard input. It’s a staple tool for anyone working with text files on Unix-like systems. But have you ever wondered how such a tool is designed and implemented?

In this practice exercise, you’ll dive into the inner workings of the Unix wc command by building a simplified version of it from scratch using Python. Not only will this coding challenge solidify your understanding of file handling and text processing, but it’ll also give you a taste of how to structure command-line utilities in Python.

By the end of this challenge, you’ll have a functional version of the wc command that can faithfully reproduce the outputs you’re accustomed to seeing in a Unix terminal. However, it won’t be an exact replica of the wc command, as you’ll omit or adapt some features for simplicity.

In this coding challenge, you’ll:

While working on this challenge, you’ll gain hands-on experience with several modules from Python’s standard library, such as pathlib for manipulating the file system and argparse for parsing command-line arguments. Familiarity with basic Python programming and file handling will be beneficial, but detailed instructions and helpful tips will guide you through each step of the process.
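As a taste of what you might end up with, here’s a minimal sketch of a wc-style script built on just pathlib and argparse. It’s one possible approach, not the challenge’s sample solution, and the function names are invented for the sketch:

import argparse
from pathlib import Path


def count(data: bytes) -> tuple[int, int, int]:
    # Decode for line and word counts; report the size in raw bytes, as wc does
    text = data.decode("utf-8", errors="replace")
    return text.count("\n"), len(text.split()), len(data)


def main() -> None:
    parser = argparse.ArgumentParser(description="A simplified wc clone")
    parser.add_argument("files", nargs="+", type=Path)
    args = parser.parse_args()
    for path in args.files:
        lines, words, nbytes = count(path.read_bytes())
        print(f"{lines:>8}{words:>8}{nbytes:>8} {path}")


if __name__ == "__main__":
    main()

Running python wc.py notes.txt would print the line, word, and byte counts in wc’s familiar right-aligned columns.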

The challenge is broken down into a number of tasks, each accompanied by clear instructions and illustrative examples. You’ll receive automated feedback on your solutions when you follow along using the accompanying materials. If you run into any issues or have questions, then don’t hesitate to ask for help in the comments section below the corresponding lesson.

Note: You can also reach out to the Real Python community on Slack or join live conversations during Office Hours, where you’ll have an opportunity to share your screen remotely.

Completing each task unlocks the next one. Once you’ve completed a task, you can compare your code with the sample solution provided in the following lesson. Remember that there’s often more than one way to solve a problem. If your solution differs slightly but meets the acceptance criteria and adheres to good programming practices, then that’s perfectly fine.

Good luck!



June 18, 2025 02:00 PM UTC


Talk Python to Me

#510: 10 Polars Tools and Techniques To Level Up Your Data Science

Are you using Polars for your data science work? Maybe you've been sticking with the tried-and-true Pandas? There are many benefits to Polars directly, of course. But you might not be aware of all the excellent tools and libraries that make Polars even better. Examples include Patito, which combines Pydantic and Polars for data validation, and polars_encryption, which adds AES encryption to selected columns. We have Christopher Trudeau back on Talk Python To Me to tell us about his list of excellent libraries to power up your Polars game, and we also talk a bit about his new Polars course.

Episode sponsors

Agntcy: talkpython.fm/agntcy
Sentry Error Monitoring, Code TALKPYTHON: talkpython.fm/sentry
Talk Python Courses: talkpython.fm/training

Links from the show

New Theme Song (Full-Length Download and backstory): talkpython.fm/blog
Polars for Power Users Course: training.talkpython.fm
Awesome Polars: github.com/ddotta/awesome-polars
Polars Visualization with Plotly: docs.pola.rs
Dataframely: github.com/Quantco/dataframely
Patito: github.com/JakobGM/patito
polars_iptools: github.com/erichutchins/polars_iptools
polars-fuzzy-match: github.com/bnmoch3/polars-fuzzy-match
Nucleo Fuzzy Matcher: github.com/helix-editor/nucleo
polars-strsim: github.com/foxcroftjn/polars-strsim
polars_encryption: github.com/zlobendog/polars_encryption
polars-xdt: github.com/pola-rs/polars-xdt
polars_ols: github.com/azmyrajab/polars_ols
Least Mean Squares Filter in Signal Processing: geeksforgeeks.org
polars-pairing: github.com/apcamargo/polars-pairing
Pairing Function: en.wikipedia.org/wiki/Pairing_function
polars_list_utils: github.com/dashdeckers/polars_list_utils
Harley Schema Helpers: tomburdge.github.io
Marimo Reactive Notebooks Episode: talkpython.fm
Marimo: marimo.io
Ahoy Narwhals Podcast Episode Links: talkpython.fm
Watch this episode on YouTube: youtube.com
Episode #510 deep-dive: talkpython.fm/510
Episode transcripts: talkpython.fm

Stay in touch with us

Subscribe to Talk Python on YouTube: talkpython.fm/youtube
Talk Python on Bluesky: @talkpython.fm at bsky.app
Talk Python on Mastodon: talkpython
Michael on Bluesky: @mkennedy.codes at bsky.app
Michael on Mastodon: mkennedy

June 18, 2025 08:00 AM UTC

June 17, 2025


PyCoder’s Weekly

Issue #686: Free-Threaded Update, GPU Programming, GitHub Actions, and More (June 17, 2025)

#686 – JUNE 17, 2025
View in Browser »



State of Free-Threaded Python

This is a blog post from the Python Language Summit 2025 giving an update on the progress of free-threaded Python. You may also be interested in the complete list of Language Summit Blogs.
PYTHON SOFTWARE FOUNDATION

GPU Programming in Pure Python

Talk Python interviews Bryce Adelstein Lelbach and they talk about using Python to harness the insane power of modern GPUs for data science and ML.
KENNEDY & LELBACH podcast

Making Friends with Agents: A Mental Model for Agentic AI


Explore a mental model to befriend your AI agent. This blog walks through designing goal-driven, tool-savvy agents that think in loops, speak your language, and bounce back from failure through durable execution.
TEMPORAL sponsor

Continuous Integration and Deployment Using GitHub Actions

Agile methodologies rely on robust DevOps systems to manage and automate common tasks in a continually changing codebase. GitHub Actions can help.
REAL PYTHON course

NumPy v2.3.0 Released

GITHUB.COM/NUMPY

Call for Applicants for a Django Fellow

DJANGO SOFTWARE FOUNDATION

Django Bugfix Releases: 5.2.3, 5.1.11, and 4.2.23

DJANGO SOFTWARE FOUNDATION

Python 3.13.5 Released

PYTHON.ORG

scikit-learn 1.7 Released

SCIKIT-LEARN.ORG

Python Jobs

Sr. Software Developer (Python, Healthcare) (USA)

Prenosis

Senior Software Engineer – Quant Investment Platform (LA or Dallas) (Los Angeles, CA, USA)

Causeway Capital Management LLC

More Python Jobs >>>

Articles & Tutorials

A dict That Can Report Which Keys Weren’t Used

When testing, you may want to make sure that all parts of a dictionary get accessed to get full coverage. This post shows a modified dict that tracks which keys got used.
PETER BENGTSSON
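The linked post has its own implementation; as a rough sketch of the idea (the class and method names here are invented for illustration), a dict subclass can record reads and report the leftovers:

class TrackingDict(dict):
    # A dict that remembers which keys were read

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._accessed = set()

    def __getitem__(self, key):
        self._accessed.add(key)
        return super().__getitem__(key)

    def get(self, key, default=None):
        self._accessed.add(key)
        return super().get(key, default)

    def unused_keys(self):
        # Keys never read via [] or .get()
        return set(self) - self._accessed


config = TrackingDict(host="localhost", port=8080, debug=True)
config["host"]
print(config.unused_keys())  # {'port', 'debug'} (order may vary)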

Better Django Management Commands

Writing Django management commands can involve a ton of boilerplate code. This article shows you how to use two libraries that could cut your management command code in half: django-click and django-typer.
REVSYS

Easy-to-Deploy, Enterprise-Ready GenAI

Check out the Intel GenAI code library for ready-to-deploy and easy-to-integrate solutions.
INTEL CORPORATION sponsor

How Can You Structure Your Python Script?

Structure your Python script like a pro. This guide shows you how to organize your code, manage dependencies with PEP 723, and handle command-line arguments.
REAL PYTHON
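As a quick illustration of the PEP 723 piece, a script can declare its own requirements in an inline metadata block that runners such as uv and pipx understand; the requests dependency below is just an example:

# /// script
# requires-python = ">=3.12"
# dependencies = [
#     "requests",
# ]
# ///
import requests

print(requests.get("https://peps.python.org/pep-0723/").status_code)

With that block in place, a runner like uv run script.py creates an environment with requests installed before executing the script.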

Quiz: How Can You Structure Your Python Script?

In this quiz, you’ll test your understanding of organizing and structuring Python scripts. You’ll revisit key concepts about best practices for writing clear, maintainable, and executable Python code.
REAL PYTHON

Wyvern’s Open Satellite Feed

Wyvern is a satellite startup that has recently launched an open data program. This article plays with that data using Python libraries such as astropy, geocoder, rich, and more.
MARKSBLOGG.COM

Pointblank: Data Validation Made Beautiful

This post introduces pointblank, a library for doing data validation. It includes chainable execution and interactive reports to see what is working in your data pipeline.
POSIT-DEV.GITHUB.IO

5 Non-LLM Software Trends to Be Excited About

Tired of reading about AI and LLMs? This post talks about other tech that is rapidly changing in the software world, including local-first applications, WebAssembly, the improvement of cross-platform tools, and more.
LEONARDO CREED

Concurrency in async/await and Threading

Want to write faster Python code? Discover the difference between async/await and threading and how concurrency works in Python with real-world examples.
CHEUK TING HO

Defining Your Own Python Function

Learn how to define your own Python function, pass data into it, and return results to write clean, reusable code in your programs.
REAL PYTHON

Quiz: Defining Your Own Python Function

REAL PYTHON

TIL: HTML 404 Errors for FastHTML

A quick “Today I Learned” post showing how to write a custom HTTP 404 handler for FastHTML.
DANIEL ROY GREENFIELD

PyData Virginia 2025 Talks

A list of the recorded talks from PyData Virginia 2025.
YOUTUBE.COM video

Projects & Code

py-shiny: Shiny for Python Web Apps

GITHUB.COM/POSIT-DEV

quarto-cli: Scientific and Technical Publishing System

GITHUB.COM/QUARTO-DEV

paramiko: Native Python SSHv2 Library

GITHUB.COM/PARAMIKO

toolz: A Functional Standard Library for Python

GITHUB.COM/PYTOOLZ

ahocorasick_rs: Check for Multiple Patterns in a Single String

GITHUB.COM/G-RESEARCH

Events

Weekly Real Python Office Hours Q&A (Virtual)

June 18, 2025
REALPYTHON.COM

PyData Bristol Meetup

June 19, 2025
MEETUP.COM

PyLadies Dublin

June 19, 2025
PYLADIES.COM

Python Nordeste 2025

June 20 to June 23, 2025
PYTHONNORDESTE.ORG

Python Coding Club for Teens (PyTahoua)

June 20 to June 23, 2025
PYTHONNIGER.ORG


Happy Pythoning!
This was PyCoder’s Weekly Issue #686.
View in Browser »


[ Subscribe to 🐍 PyCoder’s Weekly 💌 – Get the best Python news, articles, and tutorials delivered to your inbox once a week >> Click here to learn more ]

June 17, 2025 07:30 PM UTC


Adarsh Divakaran

Will AI Replace Junior Developers? I Asked Experts at PyCon US

June 17, 2025 05:42 PM UTC


Django Weblog

DSF member of the month - Elena Williams

For June 2025, we welcome Elena Williams as our DSF member of the month! ⭐

Elena in DjangoGirls Brisbane

Elena is a dedicated member of the Django community. She is part of the Code of Conduct Working Group and she is a Django Girls organizer in Australia. She has been a DSF member since July 2014.
You can learn more about Elena by visiting Elena's website and her GitHub Profile.

Let’s spend some time getting to know Elena better!

Can you tell us a little about yourself (hobbies, education, etc)

My background is that I was always interested in computers, though my parents were more creative types, my Dad was an Architect (of built structures). When I was a kid we had computers for CAD around the house before it was common. I was always into STEM subjects, but unfortunately in that era for girls to do engineering it was a bit too hostile for me, so I trained in finance instead and worked in that industry (finance and banking, MNE orgs) for nearly a decade. I kept coming back to coding and was always building computers, and was obsessed with the internet as a technology from pretty early on. Just after I discovered Django I did a Masters in Computing at ANU. To this day my main hobbies are programming/webdev (very much a person who codes for fun) and the open source community. My persistent other hobbies are hackspace activities, I like CNC and laser stuff, but will pick up any and all tools/mediums and give them a go, lately been spending time with blender and cabinetry. When I can, I like to get away to snowboard or kitesurf, and this wild Australian long distance endurance navigation sport called rogaining. Really at the moment I’m mostly focussed on being a parent (which is an awesome experience), my friends (mostly python related), my job and working on the community here in Australia. With my family we go camping/hiking more than most. I’ve also continued to be a sessional academic at ANU teaching software engineering for many years.

How did you start using Django?

Golly, I’ve been using Django forever. I’d started doing web stuff in the early ‘00s and worked in a range of languages and paradigms. I was working in a physics research institute at a high profile university in Australia doing web stuff and made friends with a bunch of the doctoral students. In around 2007, one of these students, and my good mate, Dave, randomly recommended this new framework Django and Python (and emacs also actually but that’s a different story). Basically I got into it immediately and never looked back and went on to build a career around Django (actually Dave later gave up physics and did the same thing too). I’ve been engaged with the Python and Django communities to varying degrees since about 2011 as well. To be honest when I discovered the language and the framework I really didn’t expect to still be passionate about them all these years later but I really am! Hopefully I can continue to be well into the future also.

What other frameworks do you know, and is there anything you would like to have in Django if you had magical powers?

Over the years (being a curious person) I’ve worked with many many web frameworks and technologies, the vast majority of the big ones. In recent years I’ve been spending time with FastAPI and SQLAlchemy as well as non-python technologies. Django is better though.

Not using Django as much at the moment makes me love it even more and realise how lucky we are with such a well designed and well supported framework. It’s not perfect but it’s outstanding.

Having said that: at a technical level I’d love to have “cheaper” ways (in every sense) to deploy. Even though deployment methods have changed beyond recognition several times over the years, I always thought this would get easier over time and am kind of surprised that it hasn’t.

Very specific to me is that I need Django to have stronger support for many database schemas in the same project. Honestly, this is just a specific problem I have inherited in a project at the moment, and it’ll pass eventually.

What projects are you working on now?

Over the last few years I’ve helped organise a number of events, including PyConAU, though realised I’d been taking on too many projects and trying to pull back actually! Still: Internationally I’m on DSF CoC with a great team. Nationally this year I’ve been serving on the committee of our main Australian open source foundation body, Linux Australia, as well as working in a small team trying to bring together all of the Australian python user groups under a banner we hope to call Python Australia and I’ve had a keen interest in python user groups around the world. In my home town I’ve been organising our local user groups for some time with an awesome team, as well as our fantastic local PyLadies.

For work I’m flat-chat working in a senior role on a Platform team in a small data company that provides “critical digital infrastructure” for Australia. Though my most important project of all at the moment really is my family, and I do really prioritise my friends and being healthy nowadays. I’m an avid hackerspace person and do have a couple of purportedly active projects (I’m obsessed with maps among other things) but these are relatively neglected at the moment as I just don’t have the bandwidth.

Which Django libraries are your favorite (core or 3rd party)?

I just love the ORM. We’re so spoiled in the Django community we don’t realise how mature and feature-rich the ORM is. Maybe I’m biased because I’ve been using it for so long I just “think” in Django ORM and I’ve been working away from it lately. It’s such a (comparative) pleasure to use. You can nit-pick at it but compared to anything else it’s so beautifully thought through.

The admin was the Django “killer app” in 2008 and I’d argue still is in 2025. To be some dozens of seconds away from a custom CMS backend at any time is still magical. Pony magical. It’s still as impressive as ever to show off to people. Also in the same way that Guido says python makes a great calculator: Django makes a great quick tool for really fast data munging, can’t describe how liberating it feels using it for this purpose.

Writing tests in Django is under-rated too.

There are so many amazing 3rd party libraries, too many to mention. For shout-outs I don’t think I have any projects without Debug Toolbar. The 3rd party caching libraries Memcache and Redis are both great. I’m also usually happy when I turn on Celery, and excited to see DEP-0014 on its way. Danny and Audrey’s Django Cookiecutter project is a great reference even if you don’t take the whole enchilada.

What are the top three things in Django that you like?

I’ve been lucky to generally have had a pretty great time with Django. Generally I’ve used it for projects where it was a really good fit and so it wasn’t painful. As such I like weird little quirky things about Django. Haters-can-hate but I actually really like a bunch of this controversial stuff, for example I like settings.py as a pattern for projects that aren’t out of control; I enjoy using and customising the management commands framework; I think Meta class as an approach to that type of awkward problem is neat; I’ve generally had a pretty nice time with the template language; I dig into utils and reuse them probably more often than most; ORM and the Tests obviously (it’s trivial to plugin pytest of course). Everything is a trade-off in software engineering and while I’m very biased: I just like the trade-offs that Django has chosen, they’re some of the best-in-class.

The top 3 things though? This is tough. I just like it. To nail down actual answers though:

I know you started using Django with one of its first versions. What do you think of the evolution of the framework?

This is a great question! Thanks for being interested in this history, the Django history is a nice story of having good values and persisting and this actually being successful over the long run.

For me there’s all the “back in my day” stuff that’s not obvious now, like Python not being taken seriously as a “real” programming language, let alone JavaScript, but now those tides have very much turned, and web development is considered extremely respectable and high profile, which was unimaginable when I started. Django started in Web1.0 (whatever that meant), actually grew substantially during Web2.0, and now even in the modern Web3 era is kind of establishing itself as part of the backbone of the large parts of the internet that aren’t obvious. Thibaud maintains a list of websites that he believes use Django; it’s great if you haven’t seen it.

One of the most impressive parts of the evolution has been how decisions have been made and implemented. In normal “work” you just have to make things as fast as possible and endlessly add features, consequences-be-damned. Open source gets to be fundamentally the opposite. Traditionally one of the defining characteristics of Open Source is that “time is no object”. That is, good design and implementation can be allowed the time to breathe and be excessively thought through. There is no rush or deadline. While there’s always conflict and drama, I think there has been less of it in Django than in most other projects, as design decisions have been painstakingly thrashed out and perfected in mailing lists, tickets, DEPs and forums over the months and years it takes to make them. The people inside see the drama, but we’re in the news almost never compared to most projects in the same space. The point is that hypothetically it’s possible to try to make the best possible design decisions. In practice most projects don’t do this, but I think Django has demonstrated exemplary maturity in trying to pursue this ideal, and is regularly recognised for it.

The original founding team deserve full credit for instilling this culture and each successive group of stewards deserve credit for preserving it.

There have been (and always will be) missteps. For example, CBVs are such an obviously good idea on paper, but in practice people don’t think so. On the other hand, Andrew Godwin’s implementation of migrations back in the day, which completely re-wrote South from scratch, was truly lovely, even though it was a battle to get to the point of having migrations at all. There’s the history around the db module, which pretty much everyone was too scared to touch after Malcolm died, until there were some impressive breakthroughs in it during the “under the hood” sessions not long after DjangoGirls people started coming on board.

Django consciously has decided to be extremely considered in its adoption of change and this has been a great thing. Other frameworks have generally been more cavalier, while Django has been steady, careful and reliable. The other full-feature frameworks are kind of in decline, or have hurt themselves by too-much-change-too-fast, while Django has steadily slowly grown and is the trusty go-to tool for a certain kind of job.

Now moving forward I see focus on the very subtle things that make the framework nicer to use and understand, on just making the core capabilities better, more reliable and performant, and on only very, very carefully adding features.

In an age where so much quality degradation is occurring, it inspires hope that projects like Django can persist as beacons of high quality, held together by a small group and big community of thoughtful, caring individuals. Hopefully this is something we can continue for a long time into the future also!

You are part of the Code of Conduct working group, how is it to work with the working group? Do you have space available for new members? What does it require according to you?

Code of Conduct WGs are slightly niche and exposed to a certain kind of work and responsibility. Not to mention that respecting many sensitivities and viewpoints is necessary. It also means having the guts to tell people “that’s not how it’s done here” when it needs to be said. Personally it’s a kind of work I’ve grown to be passionate about. I truly believe having a great culture is at the core of community (and really anything good) and can be a complex balancing act of competing factors and emotions. It’s certainly not the kind of thing everyone is into, but if you are, the WG is looking for more diversity; if nothing else it’s tending slightly older at the moment.

Having said that: Within all of the open source communities from local to international levels there’s always space for people who are willing to turn up and help!

Join your local community! Find the parts of community that “speak” to you. Maybe it’s starting a meetup, helping your local conference, running a DjangoGirls. Maybe it’s something engineer-related like finally adding something to an open source library that you’re into, adding some beginner docs somewhere, or engaging with Djangonaut Space. Maybe it’s something online like helping out in forum.djangoproject.com, Reddit or Discord.

As organisers we have this cheat code for finding new people to invite to help more, it’s called “looking for chair-stackers”, that is people who engage to help in the little ways, such as helping stack chairs at the end of an event or generally pack down, wipe up, carry boxes or put things away. Or online: people who go out of their way to try to understand and chip in to manage extra rules, or answer the unanswered thing that’s been sitting there for a while. Or people who just ask “can I help out with that?” when the organisers seem tired or stressed out. Having people around who help in these ways has huge value and has been the beginning of many people being involved in communities and making life-long friends and connections.

Now more than ever though, it’s so important to connect to your community. We are stronger, better and healthier when we are connected to and relied on by other people and we have others we can share our experiences with.

Particularly us computer people tend not to be as good with connecting with other people, but everyone should find their way to get out and connect! It’s sometimes hard but it’s always better.

You have organized many DjangoGirls in Australia, how did you start? Do you have any advice for someone who would like to organize a DjangoGirls event?

In 2014 I was living in Perth, Australia, where Russell Keith-Magee is based, and we had a budding Python/Django User Group. At one of the meetings news emerged about how Ola and Ola were running this thing called “DjangoGirls” at EuroPython in a few weeks. PyConAU was scheduled a couple of weeks after this. I was like, that’s a great idea, I can absolutely have a go at doing that, and emailed them immediately asking if I could copy their materials and plan. We pulled it together with an amazing bunch of people, and I think this was technically the 2nd DjangoGirls event ever. In the following years I’ve been involved in many more, including the first North American DjangoGirls. From our Perth series of events a successful organisation was spun off called SheCodes.

In the more-than-a-decade since then the world has changed so much! Particularly in the tech world. I would say specifically for DjangoGirls events, they are very region specific. My first advice for organising an event in your region is to see if there’s been one previously and reach out to the event organisers, or at least the nearest organisers – I think these days there are few places on earth that haven’t had a DjangoGirls event nearish-by. The resources on the website are actually great for getting going and the international DjangoGirls team are lovely, but also always looking for more help.

Where I live now, back in the capital, Canberra, we are very well supported for education services. We held a DjangoGirls event a couple of years ago, but for the attendees what emerged was that what we really wanted was just to connect with other technical women.

Now what has been very successful for us is an ongoing PyLadies/Women’s Software group who meet up regularly and talk about things that matter to our experience. We use the “lean-coffee” model and it’s been unexpectedly functional. This has been one of the best groups I’ve ever been in, with a range of technical women regularly sharing our weird and statistically unusual experiences together. It feeds the soul, and is strongly recommended if you don’t participate in a group like this already.

Is there anything else you’d like to say?

A final shout out to the original leaders of the Django community, for me personally Russell, Jeff, Jacob, Andrew and Baptiste in particular, but everyone who has persisted over the years in just turning up over the long haul and keeping our part of the world as beautiful as can be. My friends Dave, Matt and Jonah. Thibaud is a great president right now. Rarely is there a dedicated Django person who is not absolutely delightful and I feel both proud and honoured to be part of this community. A big thank you to everyone (especially you Sarah! And all the Sarahs, Natalias, Lillys and Olas) who help to make Django what it is.


Thank you for doing the interview, Elena!

June 17, 2025 05:09 PM UTC


Python Insider

Python 3.14.0 beta 3 is here!

It’s 3.14 beta 3!

https://www.python.org/downloads/release/python-3140b3/

This is a beta preview of Python 3.14

Python 3.14 is still in development. This release, 3.14.0b3, is the third of four planned beta releases.

Beta release previews are intended to give the wider community the opportunity to test new features and bug fixes and to prepare their projects to support the new feature release.

We strongly encourage maintainers of third-party Python projects to test with 3.14 during the beta phase and report issues found to the Python bug tracker as soon as possible. While the release is planned to be feature-complete entering the beta phase, it is possible that features may be modified or, in rare cases, deleted up until the start of the release candidate phase (Tuesday 2025-07-22). Our goal is to have no ABI changes after beta 4 and as few code changes as possible after the first release candidate. To achieve that, it will be extremely important to get as much exposure for 3.14 as possible during the beta phase.

This includes creating pre-release wheels for 3.14, as it helps other projects to do their own testing. However, we recommend that your regular production releases wait until 3.14.0rc1, to avoid the risk of ABI breaks.

Please keep in mind that this is a preview release and its use is not recommended for production environments.

Major new features of the 3.14 series, compared to 3.13

Some of the major new features and changes in Python 3.14 are:

New features

Note that PEPs 734 and 779 are exceptionally new in beta 3!

(Hey, fellow core developer, if a feature you find important is missing from this list, let Hugo know.)

For more details on the changes to Python 3.14, see What’s new in Python 3.14. The next pre-release of Python 3.14 will be the final beta, 3.14.0b4, scheduled for 2025-07-08.

Build changes

Incompatible changes, removals and new deprecations

Python install manager

The installer we offer for Windows is being replaced by our new install manager, which can be installed from the Windows Store or our FTP page. See our documentation for more information. The JSON file available for download below contains the list of all the installable packages available as part of this release, including file URLs and hashes, but is not required to install the latest release. The traditional installer will remain available throughout the 3.14 and 3.15 releases.

More resources

And now for something completely different

If you’re heading out to sea, remember the Maritime Approximation:

π mph = e knots
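The approximation is closer than it looks: one knot is about 1.15078 mph, so e knots ≈ 2.71828 × 1.15078 ≈ 3.128 mph, within about half a percent of π ≈ 3.14159 mph.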

Enjoy the new release

Thanks to all of the many volunteers who help make Python Development and these releases possible! Please consider supporting our efforts by volunteering yourself or through organisation contributions to the Python Software Foundation.

Regards from sunny Helsinki with 19 hours of daylight,

Your release team,
Hugo van Kemenade
Ned Deily
Steve Dower
Łukasz Langa

June 17, 2025 02:43 PM UTC


Real Python

Exploring Python's list Data Type With Examples

The list class is a fundamental built-in data type in Python. It has an impressive and useful set of features, allowing you to efficiently organize and manipulate heterogeneous data. Knowing how to use lists is a must-have skill for you as a Python developer. Lists have many use cases, so you’ll frequently reach for them in real-world coding.

By working through this video course, you’ll dive deep into lists and get a solid understanding of their key features. This knowledge will allow you to write more effective code by taking advantage of lists.

In this video course, you’ll learn how to:



June 17, 2025 02:00 PM UTC


Mike Driscoll

Python 201 – All About the TypedDict

Python has supported the concept of type hinting for quite a while now. However, unlike other programming languages, Python does not enforce type hints. You must use an external tool, such as Mypy, for that.

In this tutorial, you will learn all about TypedDict, a special way of adding type hinting to heterogeneous dictionaries. A heterogeneous dictionary is a dictionary whose values are not all of the same type.

But before you learn how to use the TypedDict, you should review how to type hint a regular dictionary.

Type Hinting a Regular Dictionary

A regular Python dictionary is defined as follows:

my_dictionary = {"some_key": "some_value"}

You can use any hashable type for the key, such as a string or an integer. The value of a dictionary can be any type whatsoever.

When you want to type hint a dictionary, you would use the following: dict[key_type, value_type]

Now let’s apply that to the example above:

my_dictionary: dict[str, str] = {"some_key": "some_value"}

If you are using a version of Python before 3.9, you will need to do the following instead:

from typing import Dict

my_dictionary: Dict[str, str] = {"some_key": "some_value"}

Fortunately, modern Python no longer requires that extra import.

Now you’re ready to learn how and why you might want to use the TypedDict.

Creating a TypedDict

The TypedDict was introduced to Python in 3.8. You can read the full details about it in PEP 589. You would use a TypedDict over a regular dictionary when you have a dictionary with values of different types.

Here’s an example:

my_dictionary = {"names": ["Mike", "Andrea", "John"],
                 "type": "employee",
                 "code": 123456
                }

Type hinting this type of dictionary is more complex. You can do something like this, though:

my_dictionary: dict[str, list | str | int] = {"names": ["Mike", "Andrea", "John"], "otype": "employee", "code": 123456 }

Depending on how your type checker is configured, this might work. However, if you write code that modifies the list, your type checker may complain that a string doesn’t have an append method or vice versa.
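For instance, with the union-typed annotation above, a type checker may flag a perfectly reasonable line like this (the exact wording of the message varies by tool):

my_dictionary["names"].append("Alice")
# error: Item "str" of "list[Any] | str | int" has no attribute "append"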

To make the type checker happier, you should use a TypedDict.

Here’s how you would use one with this example:

from typing import TypedDict

class MultiTypeDict(TypedDict):
    names: list
    otype: str
    code: int

my_dictionary: MultiTypeDict = {"names": ["Mike", "Andrea", "John"], "otype": "employee", "code": 123456 }

Isn’t that great? There’s just one problem. What if your dictionary’s keys have spaces in them? You cannot create class attributes with spaces!

There’s a workaround for that. Check it out in the next section.

Creating a TypedDict with Keys that Have Spaces

For this example, you will create a new dictionary with four keys, three of which contain spaces.

To make a TypedDict for this type of dictionary, you need to call the TypedDict constructor instead of subclassing it:

from typing import TypedDict

Results = TypedDict("Results",{"Animal Habitats": list,
                               "Tested": bool,
                               "Animal Name": str,
                               "Animal Location": str})

actual_results: Results = {
    "Animal Habitats": ["Asia", "N. America"],
    "Tested": False,
    "Animal Name": "Tigris",
    "Animal Location": "North Bay",
}

When you call TypedDict, you pass in the typename (what you would have named the class) and the fields the dictionary should have. You’ll note that the fields are a dictionary. This is where you will put the keys that contain spaces and those without spaces.

Give it a try and you’ll find it works great!

Wrapping Up

TypedDict is a handy tool for storing a complex dictionary. You will find that sometimes you even have these complex dictionaries inside of lists, tuples or even other dictionaries. Using the TypedDict can make type-hinting these data structures easier and prevent hard-to-detect defects from creeping in.
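For instance, nothing stops you from composing TypedDicts, since each one is just a type. Here’s a small sketch (the names are invented for the example):

from typing import TypedDict

class Employee(TypedDict):
    name: str
    code: int

class Department(TypedDict):
    title: str
    staff: list[Employee]  # a list of TypedDicts nested in another TypedDict

engineering: Department = {
    "title": "Engineering",
    "staff": [{"name": "Mike", "code": 123456}],
}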

The post Python 201 – All About the TypedDict appeared first on Mouse Vs Python.

June 17, 2025 01:25 PM UTC


Armin Ronacher

We Can Just Measure Things

This week I spent time with friends letting agents go wild to see what we could build in 24 hours. I took some notes for myself to reflect on that experience. I won't bore you with another vibecoding post, but you can read Peter's post about how that went.

As fun as it was, it was also frustrating in other, and entirely predictable, ways. It became a meme about how much I hated working with Xcode for this project. This got me thinking: this has been an unacceptable experience for a long time, but with programming agents, the pain becomes measurable.

When I first dove into programming I found the idea of RTFM quite hilarious. “Why are you asking dumb questions, just read it up.” The unfortunate reality is that the manual often doesn't exist — or is wrong. In fact, we as engineers are quite willing to subject each other to completely inadequate tooling, bad or missing documentation and ridiculous API footguns all the time. “User error” is what we used to call this; nowadays it's a “skill issue”. It puts the blame on the user and absolves the creator, at least momentarily. For APIs it can be random crashes if you use a function wrong; for programs it can be an impossible-to-navigate UI or a lack of error messages. There are many different ways in which we humans get stuck.

What agents change about this is that I can subject them to something I wouldn't really want to subject other developers to: measuring. I picked the language for my current project by running basic evals and it worked well. I learned from that that there are objectively better and worse languages when it comes to my particular problem. The choice, however, is not just about how much the AI knows about the language from the corpus of examples during training. It's also tooling, the inherent capabilities of the language, ecosystem churn and other aspects.

Using agents to measure code quality is great because agents don't judge me, but they do judge the code they are writing. Not all agents will swear, but they will express frustration with libraries when loops don't go well, or give up. That opens up an opportunity to bring some measurement not to agent performance, but to the health of a project.

We should pay more attention to how healthy engineering teams are, and that starts with the code base. Using agents we can put some numbers to it in ways we cannot with humans (or only in a very slow and expensive way). We can figure out how successful agents are in using the things we are creating, in rather objective ways, which is in many ways a proxy for how humans experience working with the code. Getting together with fresh souls to walk them through a tutorial or some tasks is laborious and expensive. Getting agents that have never seen a codebase to start using a library is repeatable, rather cheap, fast and, if set up the right way, very objective. It also takes the emotion out of it and lets you run the experiment multiple times.

Now obviously we can have debates over whether the type of code we would write with an agent is objectively beautiful, or whether the way agents execute tools creates the right type of tools. This is a debate worth having. Right at this very moment, though, what programming agents need to be successful is rather well aligned with what humans need.

So what works better than other things? For now these are basic indicators, for agents and humans alike:

When an agent struggles, so does a human. There is a lot of code and tooling out there which is objectively not good, but because of one reason or another became dominant. If you want to start paying attention to technology choices or you want to start writing your own libraries, now you can use agents to evaluate the developer experience.

Because so can your users. I can confidently say it's not just me that does not like Xcode; my agent also expresses frustration — measurably so.

June 17, 2025 12:00 AM UTC

June 16, 2025


Python Engineering at Microsoft

Python in Visual Studio Code – June 2025 Release

We’re excited to announce the June 2025 release of the Python, Pylance and Jupyter extensions for Visual Studio Code!

This release includes the following announcements:

If you’re interested, you can check the full list of improvements in our changelogs for the Python, Jupyter and Pylance extensions.

Python chat tools

The Python extension now includes the following chat tools: “Get information for a Python Environment”, “Get executable information for a Python Environment”, “Install Python Package” and “Configure Python Environment”. You can either directly reference them in your prompt by adding #getPythonEnvironmentInfo and #installPythonPackage, or agent mode will automatically call the tool as applicable based on your prompt. These tools seamlessly detect appropriate environment information, based on file or workspace context, and handle package installation with accurate environment resolution.

The “Configure Python Environment” tool ensures that the Python environment is set up correctly for the workspace. This includes creating a virtual environment if needed, and selecting it as the active Python environment for your workspace.

Tools that were previously introduced in the Python Environments extension (preview) have been migrated to the Python extension, thereby making these tools available to all users with the Python extension installed.

Language Server based terminal suggest in the Python REPL

Language server completions are now available in the terminal for interactive Python REPL sessions. This brings the same language completions you receive in the editor, now inside the terminal, making terminal interactions more efficient.

To try it out, ensure the following settings are enabled:

Create Project from a template in the Python Environments extension

The Python Environments extension (preview) now supports project creation for Python packages and basic scripts, allowing you to bypass scaffolding and get coding more quickly. Use the Python Envs: Create Project from Template command in the Command Palette to select whether you want to create a package or a script and let the command handle the rest!

For package creation, you can expect to name the package, create a virtual environment, and receive a scaffolded project which includes a tests subfolder, pyproject.toml, dev-requirements.txt, and boilerplate __main__.py and __init__.py files.
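Assuming a package named my_package, the result might be laid out roughly like this (the name is a placeholder, and the exact structure may differ between versions of the extension):

my_package/
    pyproject.toml
    dev-requirements.txt
    my_package/
        __init__.py
        __main__.py
    tests/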

For scripts, a new Python file with the name of your choice and boilerplate code will be created.

PyEnv and Poetry support in the Python Environments extension

We added support for pyenv for environment management, and poetry for both package and environment management in the Python Environments extension. This ensures you can manage pyenv and poetry environments as you normally would, in the UI contributed by the Python Environments extension. When pyenv or poetry are installed on your machine, they will appear as supported environment managers in the Python panel accessed in the Activity Bar.

Screenshot showing various environment managers in the Python environments view.

Controlled rollout of the Python Environments extension

We’re starting to roll out the Python Environments extension as an optional dependency of the Python extension, beginning with a subset of pre-release users this month. This means you may now see the Python Environments extension automatically installed alongside the Python extension, similar to the Python Debugger and Pylance extensions. This controlled rollout allows us to gather early feedback and ensure reliability before general availability. The Python Environments extension includes all the core capabilities we’ve introduced so far, including Quick Create for one-click environment setup, automatic terminal activation (via the "python-envs.terminal.autoActivationType" setting), and all supported UI for environment and package management.

You can install the preview version of the Python Environments extension from the Extension Marketplace if you would like to try it out. Please let us know if there are any issues or feature requests via our vscode-python-environments repo.

We would also like to extend special thanks to this month’s contributors:

Try out these new improvements by downloading the Python extension and the Jupyter extension from the Marketplace, or install them directly from the extensions view in Visual Studio Code (Ctrl + Shift + X or ⌘ + ⇧ + X). You can learn more about Python support in Visual Studio Code in the documentation. If you run into any problems or have suggestions, please file an issue on the Python VS Code GitHub page.

The post Python in Visual Studio Code – June 2025 Release appeared first on Microsoft for Python Developers Blog.

June 16, 2025 04:38 PM UTC


Real Python

Write Pythonic and Clean Code With namedtuple

Python’s namedtuple in the collections module allows you to create immutable sequences with named fields, providing a more readable and Pythonic way to handle tuples. You use namedtuple to access values with descriptive field names and dot notation, which improves code clarity and maintainability.

By the end of this tutorial, you’ll understand that:

  • Python’s namedtuple is a factory function that creates tuple subclasses with named fields.
  • The main difference between tuple and namedtuple is that namedtuple allows attribute access via named fields, enhancing readability.
  • The point of using namedtuple is to improve code clarity by allowing access to elements through descriptive names instead of integer indices.
  • Some alternatives to namedtuple include dictionaries, data classes, and typing.NamedTuple.

Dive deeper into creating namedtuple classes, exploring their powerful features, and writing Python code that’s easier to read and maintain.

Get Your Code: Click here to download the free sample code that shows you how to use namedtuple to write Pythonic and clean code.

Take the Quiz: Test your knowledge with our interactive “Write Pythonic and Clean Code With namedtuple” quiz. You’ll receive a score upon completion to help you track your learning progress:


Interactive Quiz

Write Pythonic and Clean Code With namedtuple

In this quiz, you'll test your understanding of Python's namedtuple() factory function from the collections module.

Getting to Know namedtuple in Python

Python’s namedtuple() is a factory function that’s available in the collections module. It allows you to create a tuple subclass with named fields. These named fields let you access the values in a given named tuple using dot notation and field names—for example, my_tuple.field_name.

Python’s namedtuple was created to improve code readability by providing a way to access values using descriptive field names instead of integer indices, which often don’t provide any context on what the values are. This feature also makes the code cleaner and more maintainable.

In contrast, accessing values by index in a regular tuple can be frustrating, hard to read, and error-prone. This is especially true if the tuple has a lot of fields and is constructed far away from where you’re using it.

Note: In this tutorial, you’ll find different terms used to refer to Python’s namedtuple, its factory function, and its instances.

To avoid confusion, here’s a summary of how each term is used throughout the tutorial:

  • namedtuple(): the factory function
  • namedtuple, namedtuple class: the tuple subclass returned by namedtuple()
  • namedtuple instance, named tuple: an instance of a specific namedtuple class

You’ll find these terms used with their corresponding meaning throughout the tutorial.

Besides providing named fields, named tuples in Python offer the following features:

You can use namedtuple instances wherever you need a tuple-like object. They offer the added benefit of accessing values using field names and dot notation, which makes your code more readable and Pythonic.

With this brief introduction to namedtuple and its general features, you’re ready to explore how to create and use them effectively in your own code.

Creating Tuple-Like Classes With the namedtuple() Function

You use namedtuple() to create an immutable, tuple-like sequence with named fields. A popular example that you’ll often find in resources about namedtuple is defining a class to represent a mathematical point.

Depending on the problem, you’ll probably want to use an immutable data structure to represent your points. Here’s how you can create a two-dimensional point using a regular tuple:

>>> # Create a 2D point as a regular tuple
>>> point = (2, 4)
>>> point
(2, 4)

>>> # Access coordinate x
>>> point[0]
2
>>> # Access coordinate y
>>> point[1]
4

>>> # Try to update a coordinate value
>>> point[0] = 100
Traceback (most recent call last):
    ...
TypeError: 'tuple' object does not support item assignment

In this example, you create an immutable, two-dimensional point using a regular tuple. This code works. You have a point with two coordinates that you can access by index. The point is immutable, so you can’t modify the coordinates. However, do you think this code is readable? Can you tell upfront what the 0 and 1 indices mean?

To improve clarity, you can use a namedtuple like in the following code. Note that you need to import the function from the collections module first:

>>> from collections import namedtuple

>>> # Create a namedtuple type, Point
>>> Point = namedtuple("Point", "x y")

>>> point = Point(2, 4)
>>> point
Point(x=2, y=4)

>>> # Access the coordinates by field name
>>> point.x
2
>>> point.y
4

>>> # Access the coordinates by index
>>> point[0]
2
>>> point[1]
4

>>> point.x = 100
Traceback (most recent call last):
    ...
AttributeError: can't set attribute

>>> issubclass(Point, tuple)
True

Now you have a Point class with two appropriately named fields, .x and .y. Your point provides a descriptive string representation by default: Point(x=2, y=4).

You can access the coordinates with dot notation and the field names, which is convenient, readable, and explicit. You can also use indices to access each coordinate’s value if you prefer.

Note: As with regular tuples, named tuples are immutable. However, the values they store don’t necessarily have to be immutable.

It’s completely valid to create a tuple or a named tuple that holds mutable values:

Python
>>> from collections import namedtuple

>>> Person = namedtuple("Person", "name children")
>>> john = Person("John Doe", ["Timmy", "Jimmy"])
>>> john
Person(name='John Doe', children=['Timmy', 'Jimmy'])
>>> id(john.children)
139695902374144

>>> john.children.append("Tina")
>>> john
Person(name='John Doe', children=['Timmy', 'Jimmy', 'Tina'])
>>> id(john.children)
139695902374144

>>> hash(john)
Traceback (most recent call last):
    ...
TypeError: unhashable type: 'list'

You can create named tuples that contain mutable objects. Then, you can modify the mutable objects in the underlying tuple. However, this doesn’t mean that you’re modifying the tuple itself. The tuple will continue being the same object.

Finally, tuples or named tuples with mutable values aren’t hashable, as you saw in the above example.

Read the full article at https://realpython.com/python-namedtuple/ »


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

June 16, 2025 02:00 PM UTC


Python Bytes

#436 Slow tests go last

Topics covered in this episode:

  • Free-threaded Python no longer “experimental” as of Python 3.14
  • typed-ffmpeg (https://github.com/livingbio/typed-ffmpeg)
  • pyleak (https://github.com/deepankarm/pyleak)
  • Optimizing Test Execution: Running live_server Tests Last with pytest
  • Extras
  • Joke

Watch on YouTube: https://www.youtube.com/watch?v=Mt7X3Q54lU4

About the show

Sponsored by PropelAuth: pythonbytes.fm/propelauth66

Connect with the hosts

  • Michael: @mkennedy@fosstodon.org / @mkennedy.codes (bsky)
  • Brian: @brianokken@fosstodon.org / @brianokken.bsky.social
  • Show: @pythonbytes@fosstodon.org / @pythonbytes.fm (bsky)

Join us on YouTube at pythonbytes.fm/live to be part of the audience. Usually Monday at 10am PT. Older video versions available there too.

Finally, if you want an artisanal, hand-crafted digest of every week of the show notes in email form? Add your name and email to our friends of the show list, we'll never share it.

Brian #1: Free-threaded Python no longer “experimental” as of Python 3.14

  • “PEP 779 ("Criteria for supported status for free-threaded Python") has been accepted, which means free-threaded Python is now a supported build!” - Hugo van Kemenade
  • PEP 779 – Criteria for supported status for free-threaded Python: https://peps.python.org/pep-0779/
  • As noted in the discussion of PEP 779, “The Steering Council (SC) approves PEP 779, with the effect of removing the “experimental” tag from the free-threaded build of Python 3.14.”
  • We are in Phase II then.
  • “We are confident that the project is on the right path, and we appreciate the continued dedication from everyone working to make free-threading ready for broader adoption across the Python community.”
  • “Keep in mind that any decision to transition to Phase III, with free-threading as the default or sole build of Python is still undecided, and dependent on many factors both within CPython itself and the community. We leave that decision for the future.”
  • How long will all this take? According to Thomas Wouters, a few years, at least: “In other words: it'll be a few years at least. It can't happen before 3.16 (because we won't have Stable ABI support until 15) and may well take longer.”

Michael #2: typed-ffmpeg

  • typed-ffmpeg offers a modern, Pythonic interface to FFmpeg, providing extensive support for complex filters with detailed typing and documentation.
  • Inspired by ffmpeg-python, this package enhances functionality by addressing common limitations, such as lack of IDE integration and comprehensive typing, while also introducing new features like JSON serialization of filter graphs and automatic FFmpeg validation.
  • Features:
      • Zero Dependencies: Built purely with the Python standard library, ensuring maximum compatibility and security.
      • User-Friendly: Simplifies the construction of filter graphs with an intuitive Pythonic interface.
      • Comprehensive FFmpeg Filter Support: Out-of-the-box support for most FFmpeg filters, with IDE auto-completion.
      • Integrated Documentation: In-line docstrings provide immediate reference for filter usage, reducing the need to consult external documentation.
      • Robust Typing: Offers static and dynamic type checking, enhancing code reliability and development experience.
      • Filter Graph Serialization: Enables saving and reloading of filter graphs in JSON format for ease of use and repeatability.
      • Graph Visualization: Leverages graphviz for visual representation, aiding in understanding and debugging.
      • Validation and Auto-correction: Assists in identifying and fixing errors within filter graphs.
      • Input and Output Options Support: Provides a more comprehensive interface for input and output options, including support for additional codecs and formats.
      • Partial Evaluation: Enhances the flexibility of filter graphs by enabling partial evaluation, allowing for modular construction and reuse.
      • Media File Analysis: Built-in support for analyzing media files using FFmpeg's ffprobe utility, providing detailed metadata extraction with both dictionary and dataclass interfaces.

Michael #3: pyleak

  • Detect leaked asyncio tasks, threads, and event loop blocking with stack trace in Python. Inspired by goleak.
  • Use as context managers or function decorators.
  • When using no_task_leaks, you get detailed stack trace information showing exactly where leaked tasks are executing and where they were created.
  • Even has great examples and a pytest plugin.

Brian #4: Optimizing Test Execution: Running live_server Tests Last with pytest

  • Tim Kamanin: https://timonweb.com/django/optimizing-test-execution-running-live_server-tests-last-with-pytest/
  • “When working with Django applications, it's common to have a mix of fast unit tests and slower end-to-end (E2E) tests that use pytest's live_server fixture and browser automation tools like Playwright or Selenium.”
  • Tim is running E2E tests last for
      • Faster feedback from quick tests
      • To not tie up resources early in the test suite.
  • He did this with
      • a custom “e2e” marker
      • a pytest_collection_modifyitems hook function to look for tests using the live_server fixture, and for them
          • automatically add the e2e marker to those tests
          • move those tests to the end
      (see the sketch after these show notes)
  • The reason for the marker is to be able to
      • Just run e2e tests with -m e2e
      • Avoid running them sometimes with -m "not e2e"
  • Cool small writeup. The technique works for any system that has some tests that are slower or resource bound based on a particular fixture or set of fixtures.

Extras

Brian:

  • Is Free-Threading Our Only Option? - Interesting discussion started by Eric Snow and recommended by John Hagen
  • Free-threaded Python on GitHub Actions - How to add FT tests to your projects, by Hugo van Kemenade

Michael:

  • New course! LLM Building Blocks in Python
  • Talk Python Deep Dives Complete: 600K Words of Talk Python Insights
  • .folders on Linux: write-up on XDG for Python devs
  • They keep pulling me back - ChatGPT Pro with o3-pro
  • Python Bytes is the #1 Python news podcast and #17 of all tech news podcasts
  • Python 3.13.4, 3.12.11, 3.11.13, 3.10.18 and 3.9.23 are now available
  • Python 3.13.5 is now available!

Joke: Naming is hard
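For reference, here's a minimal sketch of the collection hook described in Brian's #4 topic. This is an illustrative reconstruction, not Tim's exact code, and the e2e marker would also need registering in your pytest config to avoid warnings:

# conftest.py - sketch only; assumes the "e2e" marker is registered in pytest config
import pytest

def pytest_collection_modifyitems(config, items):
    """Mark tests that use the live_server fixture as e2e and move them last."""
    e2e_items, other_items = [], []
    for item in items:
        if "live_server" in getattr(item, "fixturenames", []):
            item.add_marker(pytest.mark.e2e)
            e2e_items.append(item)
        else:
            other_items.append(item)
    items[:] = other_items + e2e_items  # reorder in place: fast tests first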

June 16, 2025 08:00 AM UTC


Ned Batchelder

Math factoid of the day: 63

63 is a centered octahedral number. That means if you build an approximation of an octahedron with cubes, one size of octahedron will have 63 cubes.

In the late 1700s, René Just Haüy developed a theory about how crystals formed: successive layers of fundamental primitives in orderly arrangements. One of those arrangements was stacking cubes together to make an octahedron.

Start with one cube:

Just one lonely cube

Add six more cubes around it, one on each face. Now we have seven:

Seven cubes as a crude octahedron

Add another layer, adding a cube to touch each visible cube, making 25:

25 cubes arranged like an octahedron five cubes wide

One more layer and we have a total of 63:

63 cubes arranged like an octahedron seven cubes wide

The remaining numbers in the sequence less than 10,000 are 129, 231, 377, 575, 833, 1159, 1561, 2047, 2625, 3303, 4089, 4991, 6017, 7175, 8473, 9919.

63 also shows up in the Delannoy numbers: the number of ways to traverse a grid from the lower left corner to upper right using only steps north, east, or northeast. Here are the 63 ways of moving on a 3×3 grid:

63 different ways to traverse a 3x3 grid

(Diagram from Wikipedia)

In fact, the number of cubes in a Haüy octahedron with N layers is the same as the number of Delannoy steps on a 3×N grid!
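As a quick numeric check of that correspondence, here's a small sketch (my code, using the standard closed form for centered octahedral numbers and the usual Delannoy recurrence):

from functools import lru_cache

def centered_octahedral(n):
    """Cubes in a Haüy octahedron after n layers are added around the center."""
    return (2 * n + 1) * (2 * n * n + 2 * n + 3) // 3

@lru_cache(maxsize=None)
def delannoy(m, n):
    """Paths from (0, 0) to (m, n) using only north, east, or northeast steps."""
    if m == 0 or n == 0:
        return 1
    return delannoy(m - 1, n) + delannoy(m, n - 1) + delannoy(m - 1, n - 1)

# 1, 7, 25, 63, 129, 231, 377, ... on both sides
assert all(centered_octahedral(n) == delannoy(3, n) for n in range(50))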

Since the two ideas are both geometric and fairly simple, I would love to find a geometric explanation for the correspondence. The octahedron is three-dimensional, and the Delannoy grids have that tantalizing 3 in them. It seems like there should be a way to convert Haüy coordinates to Delannoy coordinates to show how they relate. But I haven’t found one...

•    •    •

Colophon: I made the octahedron diagrams by asking Claude to write a Python program to do it. It wasn’t a fast process because it took pushing and prodding to get the diagrams to come out the way I liked. But Claude was very competent, and I could think about the results rather than about projections or color spaces. I could dip into it for 10 minutes at a time over a number of days without having to somehow reconstruct a mental context.

This kind of casual hobby programming is perfect for AI assistance. I don’t need the code to be perfect or even good, I just want the diagrams to be nice. I don’t have the focus time to learn how to write the program, so I can leave it to an imperfect assistant.

June 16, 2025 04:00 AM UTC

June 15, 2025


Ed Crewe

Talk about Cloud Prices at PyConLT 2025


Introduction to Cloud Pricing

I am looking forward to speaking at PyConLT 2025
My talk is called Cutting the Price of Scraping Cloud Costs (video)

It's been a while (12 years!) since my last Python conference, EuroPython Florence 2012, when I spoke as a Django web developer, although I did give a Golang talk at Kubecon USA last year.

I work at EDB, the Postgres company, on our Postgres AI product, the cloud version of which runs across the main cloud providers: AWS, Azure and GCP.

The team I am in handles the identity management and billing components of the product. So whilst I am mainly a Golang micro-service developer, I have dipped my toe into Data Science, having rewritten our Cloud prices ETL using Python & Airflow - the subject of my talk in Lithuania.

Cloud pricing can be surprisingly complex ... and the price lists are not small.

The full price lists for the 3 CSPs together are almost 5 million prices - known as SKUs (Stock Keeping Unit prices)

csp x service x type x tier x region
  3 x     200 x   50 x    3 x     50 = 4.5 million

csp = AWS, Azure and GCP

service = vms, k8s, network, load balancer, storage etc.

type = e.g. storage - general purpose E2, N1 ... accelerated A1, A2 - multiplied by various property sizes

tier = T-shirt size tiers of usage, i.e. more use = cheaper rate - small, medium, large

region = us-east-1, us-west-2, af-south-1, etc.

We need to gather all the latest service SKUs that our Postgres AI may use and total them up as a cost estimate for when customers are selecting the various options for creating or adding to their installation, applying the additional pricing for our product and any private offer discounts as part of this process.

Therefore we needed to build a data pipeline to gather the SKUs and keep them current.

Previously we used a 3rd party kubecost-based provider's data, however our usage was not sufficient to justify paying for this particular cloud service once its free usage expired.

Hence we needed to rewrite our cloud pricing data pipeline. This pipeline is in Apache Airflow but it could equally be in Dagster or any other data pipeline framework.

My talk deals with the wider points around cloud pricing, refactoring a data pipeline, and pipeline framework options. But here I want to provide more detail on the data pipeline's Python code, its use of Embedded Postgres and Click, and the benefits for development and testing - some things I didn't have room for in the talk.


Outline of our use of Data Pipelines

Airflow, Dagster, etc. provide many tools for pipeline development, notably a local development mode for running up the pipeline framework locally and doing test runs.
Even with some reloading on edit, it can still be a long process to run up a pipeline and then execute the full set of steps, known as a directed acyclic graph (DAG).

One way to improve the DEVX is to encapsulate each DAG step's code as much as possible:
removing use of shared state where that is viable, and allowing individual steps to be separately and rapidly tested with fixture data, with fast stand up and tear down of temporary embedded storage.

To avoid shared state persisting across the whole pipeline, we apply extract transform load (ETL) within each step, rather than across the pipeline as a whole. This enables functional running and testing of individual steps outside the pipeline.


The Scraper Class

We need a standard scraper class to fetch the cloud prices from each CSP, so we use an abstract base class.


from abc import ABC

class BaseScraper(ABC):
    """Abstract base class for Scrapers"""

    batch = 500
    conn = None
    unit_map = {"FAIL": ""}
    root_url = ""

    def map_units(self, entry, key):
        """To standardize naming of units between CSPs"""
        return self.unit_map.get(entry.get(key, "FAIL"), entry[key])

    def scrape_sku(self):
        """Scrapes prices from CSP bulk JSON API - uses CSP specific methods"""
        pass

    def bulk_insert_rows(self, rows):
        """Bulk insert batches of rows - Note that Psycopg >= 3.1 uses pipeline mode"""
        query = """INSERT INTO api_price.infra_price VALUES
        (%(sku_id)s, %(cloud_provider)s, %(region)s, %(sku_name)s, %(end_usage_amount)s)"""
        with self.conn.cursor() as cur:
            cur.executemany(query, rows)


This has 3 common methods:

  1. map_units - mapping units to common ones across all CSPs
  2. scrape_sku - the top-level scrape method, with CSP-specific differences handled in sub-methods it calls
  3. bulk_insert_rows - the main concrete method used by all scrapers

To bulk insert 500 rows per query we use Psycopg 3 pipeline mode - so it can send batch updates again and again without waiting for a response.

The database update against local embedded Postgres is faster than the time to scrape the remote web site SKUs.


The largest part of the Extract is done at this point. Rather than loading all 5 million SKUs, as we did with the kubecost data dump, and then querying out the 120 thousand for our product, scraping the sources directly means we only need to ingest those 120k SKUs - which saves handling 97.6% of the data!


So the resultant speed is sufficient although not as performant as pg_dump loading which uses COPY.


Unfortunately Python Psycopg is significantly slower when using cursor.copy, which ruled out loading zipped-up Postgres dumps that way. Hence all the data artefact creation and loading simply uses the pg_dump utility wrapped as a Python shell command.

There is no need to use Python here when there is the tried and tested C-based pg_dump utility for it, which ensures compatibility outside our pipeline. A later version of pg_dump can always handle earlier Postgres dumps.
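As a rough sketch of that wrapping (the flags are standard pg_dump options, but the database name and exact invocation here are illustrative, not our production code):

import subprocess

def dump_prices(port: int, file: str = "price.sql") -> None:
    """Dump the price schema from the embedded Postgres as plain SQL."""
    subprocess.run(
        ["pg_dump", "--no-owner", "--schema=api_price",
         "--port", str(port), "--file", file, "postgres"],
        check=True,  # raise if pg_dump fails so the step errors visibly
    )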


We don't need to retain a long history of artefacts, since it is public data and never needs to be reverted.

This allows us a low retention level, cleaning out most of the old dumps on creation of a new one. So any storage saving on compression is negligible.

Therefore we avoid pg_dump compression, since it can be significantly slower, especially if the data already contains compressed blobs. Plain SQL COPY also allows for data inspection if required - eg grep for a SKU, when debugging why a price may be missing.


Postgres Embedded wrapped with Go

Unlike MySQL, Postgres doesn't do in-memory databases. The equivalent for a temporary or test-run database lifetime is the embedded version of Postgres, run from an auto-created temp folder of files.
Python doesn’t have a maintained wrapper for Embedded Postgres - sadly the project https://github.com/Simulmedia/pyembedpg is abandoned 😢

Hence we use the most up-to-date wrapper, which is written in Go, running the Go binary via a Python shell command.
It still lags behind by a version of Postgres, so it's on Postgres 16 rather than the latest, 17.
But for the purposes of embedded use that is irrelevant.

By using a separate temporary Postgres per step we can save a dumped SQL artefact at the end of a step and need no data dependency between steps, meaning individual step retry, in parallel, just works.
The performance of a localhost dump to socket is also superior.
By processing everything in the same (if embedded) version of Postgres as our final target database for the Cloud Price Go micro-service, we remove any SQL compatibility issues and ensure full Postgresql functionality is available.
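To give an idea of the shape of this helper module, here is a minimal sketch (the function names match the DAG code later in this post, but the wrapper binary name, flags and cleanup details are illustrative assumptions):

import shutil
import socket
import subprocess
import tempfile

def setup_pg_db(port: int = 0) -> tuple:
    """Start an embedded Postgres via the Go wrapper; return (folder, port)."""
    if not port:
        with socket.socket() as s:
            s.bind(("", 0))  # ask the OS for a free random port
            port = s.getsockname()[1]
    folder = tempfile.mkdtemp(prefix="pgembed_")
    # hypothetical invocation of the Go embedded-postgres wrapper binary
    subprocess.Popen(["embedded-postgres", "--dir", folder, "--port", str(port)])
    return folder, port

def teardown_pg_embed(folder: str) -> None:
    """Stop the embedded Postgres and remove its temp folder."""
    # stop any postgres process started for this folder (illustrative)
    subprocess.run(["pkill", "-f", folder], check=False)
    shutil.rmtree(folder, ignore_errors=True)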

The final data artefacts will be loaded to a Postgres cluster price schema micro-service running on CloudNativePG.

Use a Click wrapper with Tests

The click package provides all the functionality for our pipeline's command line interface:

> pscraper -h

Usage: pscraper [OPTIONS] COMMAND [ARGS]...

  price-scraper: python web scraping of CSP prices for api-price

Options:
  -h, --help  Show this message and exit.

Commands:
  awsscrape    Scrape prices from AWS
  azurescrape  Scrape prices from Azure
  delold       Delete old blob storage files, default all over 12 weeks old are deleted
  gcpscrape    Scrape prices from GCP - set env GCP_BILLING_KEY
  pgdump       Dump postgres file and upload to cloud storage - set env STORAGE_KEY
               > pscraper pgdump --port 5377 --file price.sql
  pgembed      Run up local embedded PG on a random port for tests
               > pscraper pgembed
  pgload       Load schema to local embedded postgres for testing
               > pscraper pgload --port 5377 --file price.sql


This caters for developing the step code entirely outside the pipeline, for development and debug.
We can run pgembed to create a local db, and pgload to add the price schema, then run individual scrapes from a pipenv pip install -e version of the price scraper package.
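The wiring behind that help output is plain Click. A minimal sketch, with just one command shown (the real commands wrap the shell utilities described above):

import click

@click.group(context_settings={"help_option_names": ["-h", "--help"]})
def pscraper():
    """price-scraper: python web scraping of CSP prices for api-price"""

@pscraper.command()
@click.option("--port", default=5377, help="Port of the local embedded Postgres")
@click.option("--file", default="price.sql", help="Plain SQL dump file to load")
def pgload(port, file):
    """Load schema to local embedded postgres for testing"""
    ...  # wrap psql / pg_dump shell commands here

if __name__ == "__main__":
    pscraper()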


For unit testing we can create a mock response object for the data scrapers that returns different fixture payloads based on the query and monkeypatch it in. This allows us to functionally test the whole scrape and data artefact creation ETL cycle as unit functional tests.

Any issues with source data changes can be replicated via a fixture for regression tests.

from unittest import TestCase

import psycopg
import requests

# fixtures (a dict of URL-fragment -> JSON payload), MockConn, and ROOT
# are defined elsewhere in the test suite


class MockResponse:
    """Fake to return fixture value of requests.get() for testing scrape parsing"""

    name = "Mock User"
    payload = {}
    content = ""
    status_code = 200
    url = "http://mock_url"

    def __init__(self, payload={}, url="http://mock_url"):
        self.url = url
        self.payload = payload
        self.content = str(payload)

    def json(self):
        return self.payload


def mock_aws_get(url, **kwargs):
    """Return the fixture JSON that matches the URL used"""
    for key, fix in fixtures.items():
        if key in url:
            return MockResponse(payload=fix, url=url)
    return MockResponse()


class TestAWSScrape(TestCase):
    """Tests for the 'pscraper awsscrape' command"""

    def setUpClass():
        """Simple monkeypatch in mock handlers for all tests in the class"""
        psycopg.connect = MockConn
        requests.get = mock_aws_get
        # confirm that requests is patched hence returns short fixture of JSON from the AWS URLs
        result = requests.get("{}/AmazonS3/current/index.json".format(ROOT))
        assert len(result.json().keys()) > 5 and len(result.content) < 2000

A simple DAG with Soda Data validation

The click commands for each DAG step are imported at the top - one for the scrape and one for embedded Postgres. The DAG just becomes a wrapper that runs them, adding Soda data validation of the scraped data ...

def scrape_azure():
   """Scrape Azure via API public json web pages"""
   from price_scraper.commands import azurescrape, pgembed
   folder, port = setup_pg_db(PORT)
   error = azurescrape.run_azure_scrape(port, HOST)
   if not error:
       error = csp_dump(port, "azure")
   if error:
       pgembed.teardown_pg_embed(folder) 
       notify_slack("azure", error)
       raise AirflowFailException(error)
  
   data_test = SodaScanOperator(
       dag=dag,
       task_id="data_test",
       data_sources=[
           {
               "data_source_name": "embedpg",
               "soda_config_path": "price-scraper/soda/configuration_azure.yml",
           }
       ],
       soda_cl_path="price-scraper/soda/price_azure_checks.yml",
   )
   data_test.execute(dict())
   pgembed.teardown_pg_embed(folder)
 


We set up a new Embedded Postgres (it takes a few seconds) and then scrape directly to it.


We then use the SodaScanOperator to check the data we have scraped. If there is no error, we dump to blob storage; otherwise we notify Slack with the error and raise it, ending the DAG.

Our Soda tests check that the number of prices, and the prices themselves, are in the ranges that they should be for each service. We also check we have the number of tiered rates that we expect. We expect over 10 starting usage rates and over 3000 specific tiered prices.

If the Soda tests pass, we dump to cloud storage and tear down the temporary Postgres. A final step aggregates together each step's data. We save the money and maintenance of running a persistent database cluster in the cloud for our pipeline.


June 15, 2025 06:00 PM UTC


PyPy

How fast can the RPython GC allocate?

While working on a paper about allocation profiling in VMProf I got curious about how quickly the RPython GC can allocate an object. I wrote a small RPython benchmark program to get an idea of the order of magnitude.

The basic idea is to just allocate an instance in a tight loop:

class A(object):
    pass

def run(loops):
    # preliminary idea, see below
    for i in range(loops):
        a = A()
        a.i = i

The RPython type inference will find out that instances of A have a single i field, which is an integer. In addition to that field, every RPython object needs one word of GC meta-information. Therefore one instance of A needs 16 bytes on a 64-bit architecture.

However, measuring like this is not good enough, because the RPython static optimizer would remove the allocation since the object isn't used. But we can confuse the escape analysis sufficiently by always keeping two instances alive at the same time:

class A(object):
    pass

def run(loops):
    a = prev = None
    for i in range(loops):
        prev = a
        a = A()
        a.i = i
    print(prev, a) # print the instances at the end

(I confirmed that the allocation isn't being removed by looking at the C code that the RPython compiler generates from this.)

This is doing a little bit more work than needed, because of the a.i = i instance attribute write. We can also (optionally) leave the field uninitialized.

def run(initialize_field, loops):
    t1 = time.time()
    if initialize_field:
        a = prev = None
        for i in range(loops):
            prev = a
            a = A()
            a.i = i
        print(prev, a) # make sure always two objects are alive
    else:
        a = prev = None
        for i in range(loops):
            prev = a
            a = A()
        print(prev, a)
    t2 = time.time()
    print(t2 - t1, 's')
    object_size_in_words = 2 # GC header, one integer field
    mem = loops * 8 * object_size_in_words / 1024.0 / 1024.0 / 1024.0
    print(mem, 'GB')
    print(mem / (t2 - t1), 'GB/s')

Then we need to add some RPython scaffolding:

def main(argv):
    loops = int(argv[1])
    with_init = bool(int(argv[2]))
    if with_init:
        print("with initialization")
    else:
        print("without initialization")
    run(with_init, loops)
    return 0

def target(*args):
    return main

To build a binary:

pypy rpython/bin/rpython targetallocatealot.py

Which will turn the RPython code into C code and use a C compiler to turn that into a binary, containing both our code above as well as the RPython garbage collector.

Then we can run it (all results again from my AMD Ryzen 7 PRO 7840U, running Ubuntu Linux 24.04.2):

$ ./targetallocatealot-c 1000000000 0
without initialization
<A object at 0x7c71ad84cf60> <A object at 0x7c71ad84cf70>
0.433825 s
14.901161 GB
34.348322 GB/s
$ ./targetallocatealot-c 1000000000 1
with initialization
<A object at 0x71b41c82cf60> <A object at 0x71b41c82cf70>
0.501856 s
14.901161 GB
29.692100 GB/s

Let's compare it with the Boehm GC:

$ pypy rpython/bin/rpython --gc=boehm --output=targetallocatealot-c-boehm targetallocatealot.py 
...
$ ./targetallocatealot-c-boehm 1000000000 0
without initialization
<A object at 0xffff8bd058a6e3af> <A object at 0xffff8bd058a6e3bf>
9.722585 s
14.901161 GB
1.532634 GB/s
$ ./targetallocatealot-c-boehm 1000000000 1
with initialization
<A object at 0xffff88e1132983af> <A object at 0xffff88e1132983bf>
9.684149 s
14.901161 GB
1.538717 GB/s

This is not a fair comparison, because the Boehm GC uses conservative stack scanning, therefore it cannot move objects, which requires much more complicated allocation.

Let's look at perf stats

We can use perf to get some statistics about the executions:

$ perf stat -e cache-references,cache-misses,cycles,instructions,branches,faults,migrations ./targetallocatealot-c 10000000000 0
without initialization
<A object at 0x7aa260e35980> <A object at 0x7aa260e35990>
4.301442 s
149.011612 GB
34.642245 GB/s

 Performance counter stats for './targetallocatealot-c 10000000000 0':

     7,244,117,828      cache-references                                                      
        23,446,661      cache-misses                     #    0.32% of all cache refs         
    21,074,240,395      cycles                                                                
   110,116,790,943      instructions                     #    5.23  insn per cycle            
    20,024,347,488      branches                                                              
             1,287      faults                                                                
                24      migrations                                                            

       4.303071693 seconds time elapsed

       4.297557000 seconds user
       0.003998000 seconds sys

$ perf stat -e cache-references,cache-misses,cycles,instructions,branches,faults,migrations ./targetallocatealot-c 10000000000 1
with initialization
<A object at 0x77ceb0235980> <A object at 0x77ceb0235990>
5.016772 s
149.011612 GB
29.702688 GB/s

 Performance counter stats for './targetallocatealot-c 10000000000 1':

     7,571,461,470      cache-references                                                      
       241,915,266      cache-misses                     #    3.20% of all cache refs         
    24,503,497,532      cycles                                                                
   130,126,387,460      instructions                     #    5.31  insn per cycle            
    20,026,280,693      branches                                                              
             1,285      faults                                                                
                21      migrations                                                            

       5.019444749 seconds time elapsed

       5.012924000 seconds user
       0.005999000 seconds sys

This is pretty cool, we can run this loop with >5 instructions per cycle. Every allocation takes 110116790943 / 10000000000 ≈ 11 instructions and 21074240395 / 10000000000 ≈ 2.1 cycles, including the loop around it.

How often does the GC run?

The RPython GC queries the L2 cache size to determine the size of the nursery. We can find out what it is by turning on PYPYLOG, selecting the proper logging categories, and printing to stdout by passing - as the output file:

$ PYPYLOG=gc-set-nursery-size,gc-hardware:- ./targetallocatealot-c 1 1
[f3e6970465723] {gc-set-nursery-size
nursery size: 270336
[f3e69704758f3] gc-set-nursery-size}
[f3e697047b9a1] {gc-hardware
L2cache = 1048576
[f3e69705ced19] gc-hardware}
[f3e69705d11b5] {gc-hardware
memtotal = 32274210816.000000
[f3e69705f4948] gc-hardware}
[f3e6970615f78] {gc-set-nursery-size
nursery size: 4194304
[f3e697061ecc0] gc-set-nursery-size}
with initialization
NULL <A object at 0x7fa7b1434020>
0.000008 s
0.000000 GB
0.001894 GB/s

So the nursery is 4 MiB. This means that when we allocate 14.9 GiB the GC needs to perform 10000000000 * 16 / 4194304 ≈ 38146 minor collections. Let's confirm that:

$ PYPYLOG=gc-minor:out ./targetallocatealot-c 10000000000 1
with initialization
w<A object at 0x7991e3835980> <A object at 0x7991e3835990>
5.315511 s
149.011612 GB
28.033356 GB/s
$ head out
[f3ee482f4cd97] {gc-minor
[f3ee482f53874] {gc-minor-walkroots
[f3ee482f54117] gc-minor-walkroots}
minor collect, total memory used: 0
number of pinned objects: 0
total size of surviving objects: 0
time taken: 0.000029
[f3ee482f67b7e] gc-minor}
[f3ee4838097c5] {gc-minor
[f3ee48380c945] {gc-minor-walkroots
$ grep "{gc-minor-walkroots" out | wc -l
38147

Each minor collection is very quick, because a minor collection is O(surviving objects), and in this program only one object survives each time (the other instance is in the process of being allocated). Also, the GC root shadow stack is only one entry deep, so walking that is super quick as well. The time the minor collections take is logged to the out file:

$ grep "time taken" out | tail
time taken: 0.000002
time taken: 0.000002
time taken: 0.000002
time taken: 0.000002
time taken: 0.000002
time taken: 0.000002
time taken: 0.000002
time taken: 0.000003
time taken: 0.000002
time taken: 0.000002
$ grep "time taken" out | grep -o "0.*" | numsum
0.0988160000000011

(This number is super approximate due to float formatting rounding.)

That means that 0.0988160000000011 / 5.315511 ≈ 2% of the time is spent in the GC.

What does the generated machine code look like?

The allocation fast path of the RPython GC is a simple bump pointer, in Python pseudo-code it would look roughly like this:

result = gc.nursery_free
# Move nursery_free pointer forward by totalsize
gc.nursery_free = result + totalsize
# Check if this allocation would exceed the nursery
if gc.nursery_free > gc.nursery_top:
    # If it does => collect the nursery and allocate afterwards
    result = collect_and_reserve(totalsize)
result.hdr = <GC flags and type id of A>

So we can disassemble the compiled binary targetallocatealot-c and try to find the equivalent logic in machine code. I'm super bad at reading machine code, but I tried to annotate what I think is the core loop (the version without initializing the i field) below:

    ...
    cb68:   mov    %rbx,%rdi 
    cb6b:   mov    %rdx,%rbx

    # initialize object header of object allocated in previous iteration
    cb6e:   movq   $0x4c8,(%rbx)

    # loop termination check
    cb75:   cmp    %rbp,%r12
    cb78:   je     ccb8

    # load nursery_free
    cb7e:   mov    0x33c13(%rip),%rdx

    # increment loop counter
    cb85:   add    $0x1,%rbp

    # add 16 (size of object) to nursery_free
    cb89:   lea    0x10(%rdx),%rax

    # compare nursery_top with new nursery_free
    cb8d:   cmp    %rax,0x33c24(%rip)

    # store new nursery_free
    cb94:   mov    %rax,0x33bfd(%rip)

    # if new nursery_free exceeds nursery_top, fall through to slow path, if not, start at top
    cb9b:   jae    cb68

    # slow path from here on:
    # save live object from last iteration to GC shadow stack
    cb9d:   mov    %rbx,-0x8(%rcx)
    cba1:   mov    %r13,%rdi
    cba4:   mov    $0x10,%esi
    # do minor collection
    cba9:   call   20800 <pypy_g_IncrementalMiniMarkGC_collect_and_reserve>
    ...

Running the benchmark as regular Python code

So far we ran this code as RPython, i.e. type inference is performed and the program is translated to a C binary. We can also run it on top of PyPy, as a regular Python3 program. However, an instance of a user-defined class in regular Python when run on PyPy is actually a much larger object, due to dynamic typing. It's at least 7 words, which is 56 bytes.

However, we can simply use int objects instead. Integers are allocated on the heap and consist of two words, one for the GC and one with the machine-word-sized integer value, if the integer fits into a signed 64-bit representation (otherwise a less compact different representation is used, which can represent arbitrarily large integers).

Therefore, we can simply use this kind of code:

import sys, time


def run(loops):
    t1 = time.time()
    a = prev = None
    for i in range(loops):
        prev = a
        a = i
    print(prev, a) # make sure always two objects are alive
    t2 = time.time()
    object_size_in_words = 2 # GC header, one integer field
    mem = loops * 8 * object_size_in_words / 1024.0 / 1024.0 / 1024.0
    print(mem, 'GB')
    print(mem / (t2 - t1), 'GB/s')

def main(argv):
    loops = int(argv[1])
    run(loops)
    return 0

if __name__ == '__main__':
    sys.exit(main(sys.argv))

In this case we can't really leave the value uninitialized though.

We can run this both with and without the JIT:

$ pypy3 allocatealot.py 1000000000
999999998 999999999
14.901161193847656 GB
17.857494904899553 GB/s
$ pypy3 --jit off allocatealot.py 1000000000
999999998 999999999
14.901161193847656 GB
0.8275382375297171 GB/s

This is obviously much less efficient than the C code; the PyPy JIT generates much less efficient machine code than GCC. Still, "only" twice as slow is kind of cool anyway.

(Running it with CPython doesn't really make sense for this measurement, since CPython ints are bigger – sys.getsizeof(5) reports 28 bytes.)

The machine code that the JIT generates

Unfortunately it's a bit of a journey to show the machine code that PyPy's JIT generates for this. First we need to run with all jit logging categories:

$ PYPYLOG=jit:out pypy3 allocatealot.py 1000000000

Then we can read the log file to find the trace IR for the loop under the logging category jit-log-opt:

+532: label(p0, p1, p6, p9, p11, i34, p13, p19, p21, p23, p25, p29, p31, i44, i35, descr=TargetToken(137358545605472))
debug_merge_point(0, 0, 'run;/home/cfbolz/projects/gitpypy/allocatealot.py:6-9~#24 FOR_ITER')

# are we at the end of the loop
+552: i45 = int_lt(i44, i35)
+555: guard_true(i45, descr=<Guard0x7ced4756a160>) [p0, p6, p9, p11, p13, p19, p21, p23, p25, p29, p31, p1, i44, i35, i34]
+561: i47 = int_add(i44, 1)
debug_merge_point(0, 0, 'run;/home/cfbolz/projects/gitpypy/allocatealot.py:6-9~#26 STORE_FAST')
debug_merge_point(0, 0, 'run;/home/cfbolz/projects/gitpypy/allocatealot.py:6-10~#28 LOAD_FAST')
debug_merge_point(0, 0, 'run;/home/cfbolz/projects/gitpypy/allocatealot.py:6-10~#30 STORE_FAST')
debug_merge_point(0, 0, 'run;/home/cfbolz/projects/gitpypy/allocatealot.py:6-11~#32 LOAD_FAST')
debug_merge_point(0, 0, 'run;/home/cfbolz/projects/gitpypy/allocatealot.py:6-11~#34 STORE_FAST')
debug_merge_point(0, 0, 'run;/home/cfbolz/projects/gitpypy/allocatealot.py:6-11~#36 JUMP_ABSOLUTE')

# update iterator object
+565: setfield_gc(p25, i47, descr=<FieldS pypy.module.__builtin__.functional.W_IntRangeIterator.inst_current 8>)
+569: guard_not_invalidated(descr=<Guard0x7ced4756a1b0>) [p0, p6, p9, p11, p19, p21, p23, p25, p29, p31, p1, i44, i34]

# check for signals
+569: i49 = getfield_raw_i(137358624889824, descr=<FieldS pypysig_long_struct_inner.c_value 0>)
+582: i51 = int_lt(i49, 0)
+586: guard_false(i51, descr=<Guard0x7ced4754db78>) [p0, p6, p9, p11, p19, p21, p23, p25, p29, p31, p1, i44, i34]
debug_merge_point(0, 0, 'run;/home/cfbolz/projects/gitpypy/allocatealot.py:6-9~#24 FOR_ITER')

# allocate the integer (allocation sunk to the end of the trace)
+592: p52 = new_with_vtable(descr=<SizeDescr 16>)
+630: setfield_gc(p52, i34, descr=<FieldS pypy.objspace.std.intobject.W_IntObject.inst_intval 8 pure>)
+634: jump(p0, p1, p6, p9, p11, i44, p52, p19, p21, p23, p25, p29, p31, i47, i35, descr=TargetToken(137358545605472))

To find the machine code address of the trace, we need to search for this line:

Loop 1 (run;/home/cfbolz/projects/gitpypy/allocatealot.py:6-9~#24 FOR_ITER) \
    has address 0x7ced473ffa0b to 0x7ced473ffbb0 (bootstrap 0x7ced473ff980)

Then we can use a script in the PyPy repo to disassemble the generated machine code:

$ pypy rpython/jit/backend/tool/viewcode.py out

This will dump all the machine code to stdout, and open a pygame-based graphviz cfg. In there we can search for the address and see this:

Graphviz based visualization of the machine code the JIT generates

Here's an annotated version with what I think this code does:

# increment the profile counter
7ced473ffb40:   48 ff 04 25 20 9e 33    incq   0x38339e20
7ced473ffb47:   38 

# check whether the loop is done
7ced473ffb48:   4c 39 fe                cmp    %r15,%rsi
7ced473ffb4b:   0f 8d 76 01 00 00       jge    0x7ced473ffcc7

# increment iteration variable
7ced473ffb51:   4c 8d 66 01             lea    0x1(%rsi),%r12

# update iterator object
7ced473ffb55:   4d 89 61 08             mov    %r12,0x8(%r9)

# check for ctrl-c/thread switch
7ced473ffb59:   49 bb e0 1b 0b 4c ed    movabs $0x7ced4c0b1be0,%r11
7ced473ffb60:   7c 00 00 
7ced473ffb63:   49 8b 0b                mov    (%r11),%rcx
7ced473ffb66:   48 83 f9 00             cmp    $0x0,%rcx
7ced473ffb6a:   0f 8c 8f 01 00 00       jl     0x7ced473ffcff

# load nursery_free pointer
7ced473ffb70:   49 8b 8b d8 30 f6 fe    mov    -0x109cf28(%r11),%rcx

# add size (16)
7ced473ffb77:   48 8d 51 10             lea    0x10(%rcx),%rdx

# compare against nursery top
7ced473ffb7b:   49 3b 93 f8 30 f6 fe    cmp    -0x109cf08(%r11),%rdx

# jump to slow path if nursery is full
7ced473ffb82:   0f 87 41 00 00 00       ja     0x7ced473ffbc9

# store new value of nursery free
7ced473ffb88:   49 89 93 d8 30 f6 fe    mov    %rdx,-0x109cf28(%r11)

# initialize GC header
7ced473ffb8f:   48 c7 01 30 11 00 00    movq   $0x1130,(%rcx)

# initialize integer field
7ced473ffb96:   48 89 41 08             mov    %rax,0x8(%rcx)
7ced473ffb9a:   48 89 f0                mov    %rsi,%rax
7ced473ffb9d:   48 89 8d 60 01 00 00    mov    %rcx,0x160(%rbp)
7ced473ffba4:   4c 89 e6                mov    %r12,%rsi
7ced473ffba7:   e9 94 ff ff ff          jmp    0x7ced473ffb40
7ced473ffbac:   0f 1f 40 00             nopl   0x0(%rax)

Conclusion

The careful design of the RPython GC's allocation fast path gives pretty good allocation rates. This technique isn't really new, it's a pretty typical way to design a GC. Apart from that, my main conclusion would be that computers are fast or something? Indeed, when we ran the same code on my colleague's two-year-old AMD, we got quite a bit worse results, so a lot of the speed seems to be due to the hard work of CPU architects.

June 15, 2025 01:48 PM UTC

June 13, 2025


Real Python

The Real Python Podcast – Episode #253: Starting With Marimo Notebooks & Python App Config Management

Looking for a guide on getting started with Marimo notebooks? How do you build a reproducible notebook for sharing or create a dashboard with interactive UI elements? Christopher Trudeau is back on the show this week, bringing another batch of PyCoder's Weekly articles and projects.


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

June 13, 2025 12:00 PM UTC


Daniel Roy Greenfeld

TIL: HTML 404 errors for FastAPI

from fastapi import FastAPI
from fastapi.responses import HTMLResponse


async def custom_404_exception_handler(request, exc):
    return HTMLResponse(
        f'<p>404 Not Found at "{request.url.path}"</p>', status_code=404
    )

# Add more HTTP exceptions as needed
HTTP_EXCEPTIONS = {404: custom_404_exception_handler}

app = FastAPI(exception_handlers=HTTP_EXCEPTIONS)


@app.get("/")
async def read_root():
    return {"Hello": "World"}

Try it out by running the app and going to a non-existent path, like /not-found. You should see a simple HTML page with a 404 message.
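A quick way to verify this programmatically (my addition, using FastAPI's test client, which requires httpx):

from fastapi.testclient import TestClient

client = TestClient(app)
response = client.get("/not-found")
assert response.status_code == 404
assert "404 Not Found" in response.text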

June 13, 2025 02:30 AM UTC

June 12, 2025


Peter Bengtsson

A Python dict that can report which keys you did not use

Demonstrates a very basic way, in Python, to know which fields of a dict you never accessed.

June 12, 2025 08:25 PM UTC


Robin Wilson

More links – June 2025

I’ve got into a bit of a habit of writing occasional posts with links to interesting things I’ve found (probably because it’s a relatively easy blog post to write). This is another of those posts – this time, written in June 2025. So, let’s get on with some links:

June 12, 2025 12:10 PM UTC


Armin Ronacher

Agentic Coding Recommendations

There is currently an explosion of people sharing their experiences with agentic coding. After my last two posts on the topic, I received quite a few questions about my own practices. So, here goes nothing.

Preface

For all intents and purposes, here's what I do: I predominantly use Claude Code with the cheaper Max subscription for $100 a month [1]. That works well for several reasons:

  • I exclusively use the cheaper Sonnet model. It's perfectly adequate for my needs, and in fact, I prefer its outputs over the more expensive Opus model.
  • I optimize my tool usage to be token efficient. I avoid screenshots and browser interactions wherever possible. More on that later.

My general workflow involves assigning a job to an agent (which effectively has full permissions) and then waiting for it to complete the task. I rarely interrupt it, unless it's a small task. Consequently, the role of the IDE — and the role of AI in the IDE — is greatly diminished; I mostly use it for final edits. This approach has even revived my usage of Vim, which lacks AI integration.

One caveat: I expect this blog post to age very poorly. The pace of innovation here is insane; what was true a month ago barely holds true today. That's why I'm sticking to concepts I believe have staying power.

If you want to see a small session of me working on an Open Source library with it, I have a recording you can watch.

The Basics

I disable all permission checks. Which basically means I run claude --dangerously-skip-permissions. More specifically I have an alias called claude-yolo set up. Now you can call that irresponsible, and there are definitely risks with it, but you can manage those risks by moving your dev env into Docker. I will however say that if you can watch it do its thing a bit, it even works surprisingly well without dockerizing. YMMV.

MCP. This is a term you cannot avoid. It basically is a standardized protocol to give agents access to more tools. Honestly: at this point I barely use it, but I do use it. The reason I barely use it is because Claude Code is very capable of just running regular tools. So MCP for me is really only needed if I need to give Claude access to something that it finds too hard to use otherwise. A good example for this is the playwright-mcp for browser automation. I use it because I haven't found anything better yet. But for instance when I want my agent to poke around in my database, it just uses whatever it finds to be available. In my case it loves to use psql and that's more than good enough.

In general I really only start using MCP if the alternative is too unreliable. That's because MCP servers themselves are sometimes not super reliable and they are an extra thing that can go wrong. I try to keep things very simple. My custom tools are normal scripts that it just runs.

Choice Of Language

I've evaluated agent performance across different languages for my workload, and if you can choose your language, I strongly recommend Go for new backend projects. Several factors strongly favor Go:

  • Context system: Go provides a capable copy-on-write data bag that explicitly flows through the code execution path, similar to contextvars in Python or .NET's execution context. Its explicit nature greatly simplifies things for AI agents. If the agent needs to pass stuff to any call site, it knows how to do it.
  • Test caching: Surprisingly crucial for efficient agentic loops. In Rust, agents sometimes fail because they misunderstand cargo test's invocation syntax. In Go, tests run straightforwardly and incrementally, significantly enhancing the agentic workflow. It does not need to figure out which tests to run; go does.
  • Go is sloppy: Rob Pike famously described Go as suitable for developers who aren't equipped to handle a complex language. Substitute “developers” with “agents,” and it perfectly captures why Go's simplicity benefits agentic coding.
  • Structural interfaces: interfaces in Go are structural. If a type has the methods an interface expects, then it conforms. This is incredibly easy for LLMs to “understand”. There is very little surprise for the agent.
  • Go has low eco-system churn: Go's entire ecosystem embraces backwards compatibility and explicit version moves. This greatly reduces the likelihood of AI generating outdated code — starkly contrasting JavaScript's fast-moving ecosystem for instance.

For comparison, Python — my initial choice — often poses significant challenges. Agents struggle with Python's magic (eg: Pytest’s fixture injection) or complex runtime challenges (eg: wrong event loop when working with async), frequently producing incorrect code that even the agentic loop has trouble resolving. Python also has practical performance problems. I don't mean that it writes slow code, I mean that the agent loop is really slow. That's because the agent loves to spawn processes and test scripts, and it can take quite a while for the interpreter to boot up and initialize the entire application.

On the frontend I settled on tailwind, react with tanstack's query and router, as well as vite. I'm not amazingly happy with it, but I found it better than the alternatives. Tailwind and vite are great, no complaints there. Tanstack's file based router does not make me super happy, in part because it likes to have dollar signs in the file names and those really like to confuse the agent. For instance it's quite common that it tries to edit $param.tsx but edits the file .tsx instead because it gets confused by shell interpolation. It's a minor thing, but a very annoying thing.

Tools, Tools, Tools

Regardless of language, effective tooling is paramount. Key rules:

  • Anything can be a tool. A shell script can be a tool, an MCP server can be a tool, a log file can be a tool. If the agent can interact or observe it, it counts.
  • Tools need to be fast. The quicker they respond (and the less useless output they produce) the better. Crashes are tolerable; hangs are problematic.
  • Tools need to be user friendly! Tools must clearly inform agents of misuse or errors to ensure forward progress.
  • Tools need to be protected against an LLM chaos monkey using them completely wrong. There is no such thing as user error or undefined behavior!
  • Tools need to provide the right debuggability and observability.

Example: I place critical tools into a Makefile. A make dev command starts my services via a process manager. I made two key changes to what I normally do:

  1. I had to protect the process manager against being spawned a second time when it's already running. I use a fork of shoreman which writes a pidfile. When spawned a second time it errors and logs “services already running” (a sketch of this guard follows below). Why does that matter? Because the agent sometimes does not know if the server is already running and otherwise happily spawns it a second time, resulting in two versions of the service bound to the same port.
  2. I am not only printing the output to my terminal, I'm also always logging it to a file. This allows the agent to read the log file to diagnose what's happening.
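The actual guard lives in the shoreman fork, which is a shell script, but the pidfile idea itself is tiny. Here's a minimal Python sketch of the same logic (file name and details assumed):

# pidfile guard sketch - the real implementation is a shoreman shell-script fork
import os
import sys

PIDFILE = ".dev.pid"

def pid_running(pid):
    """Probe whether a process with this pid is still alive."""
    try:
        os.kill(pid, 0)  # signal 0 checks existence without killing
        return True
    except OSError:
        return False

if os.path.exists(PIDFILE) and pid_running(int(open(PIDFILE).read())):
    print("error: services are already running", file=sys.stderr)
    sys.exit(1)

with open(PIDFILE, "w") as f:
    f.write(str(os.getpid()))
# ... start services here, teeing output to a log file the agent can read ...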

Here is an example agentic interaction that demonstrates this rather well:

⏺ Great! The formatting looks good. Now let me run the development server to test the
   dynamic onboarding:

⏺ Bash(make dev)
  ⎿  Error: error: services are already running
     make: *** [dev] Error 1

⏺ The development server is already running. Let me check the logs to see the frontend URL:

⏺ Bash(make tail-log)
  ⎿ 01:06:55 backend    | Starting server on localhost:8080
    01:07:39 backend    | [Running: sh -c 'cd backend && go run cmd/server/main.go serve']
    01:07:39 backend    | Starting server on localhost:8080
    08:45:34 backend    | "GET http://localhost:8080/api/users/me HTTP/1.1" from 127.0.0.1:61804 - 200
    39.252666ms
    … +96 lines (ctrl+r to expand)

⏺ Perfect! I can see the frontend is running on http://localhost:3000/ and the latest changes
   have been hot-reloaded. Let me test the dynamic onboarding functionality using playwright:

In general logging is super important. For instance my app currently has a sign-in and register flow that sends an email to the user. In debug mode (which the agent runs in), the email is just logged to stdout. This is crucial! It allows the agent to complete a full sign-in with a remote controlled browser without extra assistance. It knows that emails are being logged thanks to a CLAUDE.md instruction and it automatically consults the log for the necessary link to click.

Does it count as a tool? In my book it does.

It's All About Speed

Agentic coding's inefficiency largely arises from inference cost and suboptimal tool usage. Let me reiterate: quick, clear tool responses are vital. What we did not talk about yet is that some tools are "emergent," temporarily written by agents themselves. Quick compilation and execution significantly boost productivity of the agent. So how can we help it?

With the right instructions it must be possible for the AI to create a new tool very quickly by following existing conventions. This is necessary because you want the AI to write some new code and run it. There is a big difference in the quality and speed of the flow if that tool takes 3ms to run, versus compiling for 5 seconds and then needing another minute to boot, connect to the database and kafka broker, and emit 100 lines of nonsensical log output.

If your stuff is indeed slow, then consider vibe-coding a daemon that you can dynamically load stuff into. As an example Sentry takes too long to reload code and it takes too long to restart. To trial some agentic coding there my workaround was a module that watches a file system location and just imports and executes all python modules placed there, then writes the outputs into a log it can cat. That's not perfect, but it was a significant help for the agent to evaluate some basic code in the context of the application.

Balancing log verbosity is crucial: informative yet concise logs optimize token usage and inference speed, avoiding unnecessary costs and rate limits. If you cannot find the balance, provide some easy to turn knobs for the AI to control.

In an ideal setup you get useful log output as a natural byproduct of the agent writing code. Getting observability from the first shot of code generation beats writing code, failing to run it, and only then going back to a debug loop where debug information is added.

Stability and Copy/Paste

Stable ecosystems are what you really want. LLMs are great with Go and they love to use Flask, because those are quite stable ecosystems with little churn. The same thing is true for your codebase. The AI likes to leave all kinds of breadcrumbs lying around when writing code that can turn into confusion later. For instance I have seen the agents leave useful comments about why they chose one path over another. If you willy-nilly let the AI upgrade libraries where some of those decisions no longer make sense, you might now have the AI continue a now-outdated pattern.

In theory this should be the same for agents and humans, but the reality is that agents make upgrades so “cheap” that it's tempting to just let the AI do it and see if tests still pass. I do not find this to be a successful path at all. Be even more conservative about upgrades than before.

Likewise with AI I strongly prefer more code generation over using more dependencies. I wrote about why you should write your own code before, but the more I work with agentic coding, the more I am convinced of this.

Write Simple Code

Simple code significantly outperforms complex code in agentic contexts. I just recently wrote about ugly code and I think in the context of agents this is worth re-reading. Have the agent do “the dumbest possible thing that will work”.

  • Prefer functions with clear, descriptive, longer-than-usual names over classes.
  • Avoid inheritance and overly clever hacks.
  • Use plain SQL. I mean it. You get excellent SQL out of agents, and they can match the SQL they write against the SQL logs. That beats them min-maxing your ORM's capabilities and getting lost in the SQL output in a log.
  • Keep important checks local. You really want to make sure that permission checks are very clear to the AI and take place where the AI can see them. Hiding permission checks in another file or some config file will almost guarantee that the AI forgets to add them when adding new routes. (A sketch of what this looks like follows this list.)
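Here is a hedged sketch of that last point. The route and helper names are made up, and Flask stands in only because it's a stack LLMs know well:

from flask import Flask, abort, g

app = Flask(__name__)

def require_role(role: str) -> None:
    # Explicit, local check: the agent sees this pattern in every route
    # and copies it when adding a new one.
    if role not in getattr(g, "roles", ()):
        abort(403)

@app.get("/admin/reports")
def admin_reports():
    require_role("admin")  # visible at the call site, not hidden in config
    return {"reports": []}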

Make It Parallelizable

Agents aren't exceptionally fast individually, but parallelization boosts overall throughput. Find a way to manage shared state like the file system, databases, or Redis instances so that you can run more than one agent at a time. Either avoid shared state entirely, or find a way to quickly segment it out.
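One hedged way to do that segmentation, sketched with made-up names: give each agent checkout its own key namespace so parallel runs never collide.

import os
import redis

# Assumed convention: AGENT_ID is set per checkout or worktree.
AGENT_ID = os.environ.get("AGENT_ID", "default")
r = redis.Redis()

def key(name: str) -> str:
    # Prefix every key so two agents never read each other's state.
    return f"agent:{AGENT_ID}:{name}"

r.set(key("last-run"), "ok")
print(r.get(key("last-run")))  # b'ok'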

Your initial shared state is just the file system, and a second checkout will do. But I really don't have an amazing solution yet. There are some good initial attempts; for instance, one of the tools to watch is container-use. It's an MCP server that instructs Claude or other agents to run their experiments entirely in Docker.

Then there are tools like Cursor's background agents and Codex, which are moving this entire workflow into CI, which will be interesting. So far, I don't think this is working for me yet, but let's see again in a month.

Learn To Refactor

Agentic coding alters refactoring priorities. Agents handle tasks effectively until project complexity surpasses some manageable threshold. "Too big" here is defined by the total amount of stuff the agent has to consider. So, for instance, you can vibe-code your frontend together for a while, but eventually you reach the point where you absolutely need to tell it to build a component library. Why? Because if the total Tailwind class mess is splintered across 50 files, you will find it very hard to get the AI to do redesigns or extract components without major regressions.

An agentic workflow encourages good code maintenance and refactoring at the right moment. You don't want to do it too early and you definitely do not want to do it too late.

What Next?

Agentic coding is rapidly evolving, and my workflow today may look dramatically different tomorrow. What's clear though is that integrating agents into your development process can unlock significant productivity gains. I encourage you to keep experimenting. The tools and techniques will evolve, but the core principles — simplicity, stability, observability and smart parallelization — will remain essential.

Ultimately, the goal is not just to leverage agents to write code faster, but to write better, more maintainable, and resilient code. Already today the code looks nothing like the terrible slop from a few months ago. Stay adaptable, and happy coding!

[1] This is not an advertisement for Claude Code. It's just the agent I use at the moment. What else is there? Alternatives that are similar in their user experience include OpenCode, goose, Codex, and many others. There are also Devin and Cursor's background agents, but they work a bit differently in that they run in the cloud.

June 12, 2025 12:00 AM UTC

June 11, 2025


Trey Hunner

My PyCon US 2025 recap

I’m pretty much fully back to normal life after PyCon US 2025.

I started writing this post shortly after PyCon, got side-tracked, and now I’m finally publishing it.

My very quick recap: I spent a ton of time at PyCon chatting with folks and I really enjoyed it. As Ned wrote, it’s like summer camp.

My pre-conference (Monday and Tuesday)

I flew in a couple days early (Monday May 12) because Breeze airlines had a non-stop flight from San Diego to Pittsburgh that only ran on Monday or Friday. I listened to The Upswing as I flew in and wondered how we can make our community less individualistic.

On Tuesday, I went on a walking tour of downtown Pittsburgh. Then I ate Indian food for lunch at Sree’s, which I chose because I saw they had great vegan lunch options and weren’t open for dinner. I was eating vegan during PyCon US (as I mostly do at home) and I remembered to do some research on vegan dinner restaurants in the area this year.

On Tuesday night, as CPython core devs started showing up in the Westin lobby, I went on a walk with Anthony Shaw. Anthony asked whether I was up for “an adventure” and I said yes without asking questions. We walked across the river to Randyland (a.k.a. “the mattress factory”). It was fun, interesting, and quite odd. We were the only ones there and had plenty of time to look around at all the interesting items and art pieces that Randy had collected.

Anthony and I did dinner at Condado’s and ran into more conference friends at the bar (Kattni, Rose, and Kojo).

My tutorial on decorators (Wednesday)

Wednesday was my tutorial day.

Attendees used sticky notes during the exercise sections to note when they needed help. I wrote about this system in my blog post of tips for giving a good PyCon tutorial.

The tutorial went well and the attendees seemed to enjoy it.

The newcomer’s orientation & education summit (Thursday)

Just before the opening reception on Thursday evening, Kojo, Sumana, and I ran the newcomer’s orientation. Figuring out what exactly we want to say to help orient folks to PyCon is always a bit challenging. A first-time PyCon attendee gave me some ideas for how we could do it even better next year. I’ve taken notes and will revisit them later.

I also attended a few talks during education summit earlier in the day. Reuven Lerner gave a talk on how he recommends his students use LLMs and I was pleased to hear many suggestions that are closely aligned with what I recommend as well as a few insights I hadn’t heard before.

My lightning talk (Friday)

I gave a lightning talk on Friday evening (the first talk day). I called it Systems, gates, and rage. It seemed to go over well. Folks occasionally told me throughout the conference that they enjoyed it. I won’t spoil the topic of my lightning talk (you’ll need to watch it) but it’s a topic that I’d been thinking about for a few months.

Rodrigo gave a meta lightning talk as the first talk of the first day. I gave a meta lightning talk last year in the same slot. I hope this becomes a tradition, where the first lightning talk is someone explaining how to give a lightning talk.

The keynotes

I really enjoyed the keynotes this year.

Cory Doctorow’s opening keynote was really thought-provoking and Lynn Root’s keynote was on the importance of “play”, which is a topic I’ve written about before to my mailing list.

The Marshall Project keynote and The Carpentries keynote were also great, though I missed portions of each (beginning of one and end of another) and ended up watching the full videos online after the conference.

The hallway track

The “hallway track” is the way many PyCon attendees describe the discussions that happen organically in the hallway.

These discussions can happen at any time, including breaks, breakfast, lunch, and even during talks. I missed at least 3 talks this year because I was having a great discussion in the hallway and time got away from me.

PyCon’s venue in Pittsburgh included a great rooftop, so the rooftop track / garden track was a lovely spin on the hallway track this year.

The open spaces

The open spaces are a very important part of PyCon for me.

Like the hallway track, the open spaces are mostly (usually) about having a conversation. Unlike the hallway track, the open spaces have a set topic.

I attended a number of open spaces (and ran my own).

I enjoyed meeting folks with similar interests and having fun and productive conversations (or at least listening to interesting conversations) in those open spaces.

Dinners, games, and hotel lobby conversations

I ate at Bae Bae’s, Rosewater Grill, and APTEKA during the conference. I also ate at Condado’s at least 3 nights.

APTEKA was a fun trek for vegan PierogiCon with mostly non-vegan folks. I’m sure the pierogis weren’t authentic, but we all loved the food.

After dinner every night I went to the Westin lobby and either played the Cabo card game or chatted with folks.

On missing talks

The keynotes and lightning talks are usually my favorite parts of PyCon. I tried to watch as many live keynotes and lightning talk sessions as I could this year. The morning lightning talks were hard to make, though: I kept sleeping in late enough to miss most of them.

Ultimately, I watched very few live talks. I missed talks due to sleeping in, attending open spaces, visiting booths in the exhibit hall, getting lost in conversations in the hallway, and taking mid-day naps (in my hotel room, not during talks!).

Hynek has written about the downside of the hallway track and the importance of attending talks. I sympathize with Hynek’s argument that the hallway track is a trade-off, and that there are downsides to missing talks for both attendees and speakers. I am grateful that folks give talks and I want to support those who do, and yet I often find myself attending few talks.

Ironically, the one talk I ended up taking a nap through (again, in my hotel room, not in the talk) was Hynek’s. My brain was fried by the time of his talk and I’m glad I was able to watch it online the week after PyCon. He apparently had a completely packed room as it was!

I did make sure to show up to a talk by a conference friend who I wanted to support (Michael Dahlberg’s talk on honeybee swarms). I have also binge-watched a dozen or so talks online that I had planned to attend but missed during PyCon.

You can watch all talks online now (at least I think they’re all up now?).

Attend a Python event

I’m looking forward to PyCon coming to Long Beach next year. I’m in San Diego and it will be nice to not fly to a PyCon for once!

I highly recommend attending local Python meetups, attending regional Python conferences, and/or attending PyCon US. Most in-person Python community events are pretty lovely. The Python community generally goes out of its way to be more welcoming than most tech events.

See you in Long Beach!

June 11, 2025 11:45 PM UTC


Talk Python Blog

Deep Dives Complete: 600K Words of Talk Python Insights

It’s been a real journey. I’m thrilled to announce that every single Talk Python To Me episode now has a deep dive analysis for that show. This short post is just a bit of a celebration (and heads up!) about this new way to learn from the podcast.

What are deep dives?

A quick review: Our deep dives are a quick, detailed way to revisit the best points of an episode without scrubbing through the audio or reading the full transcripts. They include additional insights and resources beyond our episode page notes and links such as “What to Know If You’re New to Python”, “Key Definitions and Terms”, and “Learning Resources.”

June 11, 2025 09:12 PM UTC


The Python Coding Stack

Are Python Dictionaries Ordered Data Structures?

Order the boxes from smallest to largest.

Stand in a queue in the order you arrived at the shop.

You don't need me to define what the word "order" means in either of these instructions above.

In Python, some data structures are ordered. Others aren't.

So, what about dictionaries? Are they ordered?

Some History First

Let's start with Python versions before Python 3.6. The answer is clear and unambiguous: No, dictionaries in Python versions before 3.6 are definitely not ordered.

Python 3.6 was released in 2016. Therefore, we're referring to older versions of Python that are no longer supported. Still, this historical detour is relevant to what's coming later.

Let's use a dictionary to place people who join a queue (or line for those who use US English) in a shop. A dictionary is unlikely to be the best data structure for a queue of people, but we'll use it for this example:

All code blocks are available in text format at the end of this article • #1 • The code images used in this article are created using Snappify. [Affiliate link]

The values associated with each key are empty lists, ready to hold any items that these customers purchase from the shop. But we won't need these lists in this article, so they'll remain empty.

I no longer have Python versions older than 3.6 installed on my computer. However, when you display the dictionary in those older versions, you may see the items printed out in any order:

#2

You had no guarantee of the order of the items when fetching them one after the other, such as when you display the dictionary or iterate through it.

Dictionaries in Python 3.6 and 3.7 (and Later)

Python 3.6 changed how dictionaries are implemented in the main Python interpreter, CPython. This is the interpreter you're likely to be using, even if you don't know it.

As a result of this change, Python dictionaries started to maintain the insertion order of key-value pairs. Therefore, the first item you add to a dictionary will always be the first displayed or yielded in an iteration. The second item you add will always be in second place, and so on.

This was merely an implementation detail in Python 3.6 that came about because of other changes in how dictionaries are implemented behind the scenes. However, in Python 3.7, this feature was included as part of the Python language specification. Therefore, from Python 3.7 onwards, the order of insertion is guaranteed. You can rely on it!

#3

So, does that mean that Python dictionaries are now ordered data structures? Not so fast…

Dictionaries Preserve the Order of Insertion

Let's compare the dictionary you created with another one that has the same people but in a different order:

#4

The dictionaries queue and another_queue contain the same items, that is, the same key-value pairs. But they're not in the same order.

However, Python still treats these dictionaries as equal. The fact that the two dictionaries have the same key-value pairs is sufficient to make these dictionaries equal. The order is not important.

Let's compare this characteristic with the equivalent one for lists by creating two lists:

#5

These lists have the same names but in a different order. However, the order of the items is a fundamental characteristic of lists. Therefore, these lists are not considered equal. This feature is part of the definition of all sequences, such as lists, tuples, and strings.

So, even though dictionaries maintain the order of insertion since Python 3.6/3.7, the order is not itself a key characteristic of a dictionary. This is an important distinction between dictionaries and lists (and other sequences).

This is why the Python documentation and other Python resources typically use the phrase "dictionaries preserve the order of insertion" rather than saying that dictionaries are ordered.

Dictionaries are not ordered data structures in the same way sequences are.
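One quick way to see the distinction (a snippet of my own, reusing names from the numbered code blocks): sequences support positional indexing, but dictionaries don't.

queue = {"James": [], "Kate": [], "Andy": [], "Isabelle": []}
queue_list = ["James", "Kate", "Andy", "Isabelle"]

queue_list[0]   # 'James' -- position is meaningful for sequences
# queue[0]      # KeyError: 0 -- a dictionary looks up keys, not positions
list(queue)[0]  # 'James' -- you must materialise the keys first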




How about collections.OrderedDict?

There's another mapping that's derived from dictionaries that you can find in the collections module: OrderedDict.

This data type existed in Python before the changes to dictionaries in Python 3.6 and 3.7. As its name implies, it's a dictionary that's also ordered. So, is the OrderedDict data type redundant now that standard dictionaries preserve the order of insertion?

Let's recreate the queue and another_queue data structures using OrderedDict instead of standard dictionaries:

#6

Now, queue and another_queue, which are OrderedDict instances, are no longer equal even though they have the same key-value pairs. In an OrderedDict, the order matters. Recall that the order in a standard dictionary, even though it is preserved, doesn't matter: standard dictionaries with the same items but in a different order are still considered equal.

Note that I'm using a standard dictionary to create an OrderedDict for simplicity in this example. If you're still using an older version of Python (prior to 3.6), the dictionary will not maintain order, so this code will not work. Use a list of tuples instead, which is also a valid way to initialise an OrderedDict in modern versions of Python.

There are also other differences between OrderedDict and standard dictionaries. For example, OrderedDict has a move_to_end() method, and its popitem() accepts a last argument so you can pop from either end. Therefore, you may still find a use for collections.OrderedDict.
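Here's a short illustration of those two extras (a snippet of mine, not one of the article's numbered code blocks):

from collections import OrderedDict

queue = OrderedDict({"James": [], "Kate": [], "Andy": [], "Isabelle": []})

queue.move_to_end("James")   # James goes to the back of the queue
queue.popitem(last=False)    # ('Kate', []) -- pops from the front instead of the back
print(list(queue))           # ['Andy', 'Isabelle', 'James']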


Do you want to join a forum to discuss Python further with other Pythonistas? Upgrade to a paid subscription here on The Python Coding Stack to get exclusive access to The Python Coding Place's members' forum. More Python. More discussions. More fun.


And you'll also be supporting this publication. I put plenty of time and effort into crafting each article. Your support will help me keep this content coming regularly and, importantly, will help keep it free for everyone.


Final Words

Different data structures have different characteristics. That's the point of having a large selection of data structures. There isn't one data structure to rule them all. Different needs require different data structures.

Sequences are ordered. The order of items within a sequence matters. That's why you can use an index to fetch an item based on its position in a sequence. Therefore, it makes sense that sequences with the same items but in a different order are considered different.

However, the defining characteristic of a dictionary is the mapping between a key and its value. You find a value by using its key in a dictionary. The preservation of the insertion order in dictionaries is a nice-to-have feature, but it's not central to how dictionaries work.

PS: You'll need today's material in the next article I'll publish in a few days on The Python Coding Stack.

Photo by Alina Chernii: https://www.pexels.com/photo/people-waiting-and-standing-by-wall-25211989/


Code in this article uses Python 3.13

The code images used in this article are created using Snappify. [Affiliate link]

You can also support this publication by making a one-off contribution of any amount you wish.



For more Python resources, you can also visit Real Python—you may even stumble on one of my own articles or courses there!

Also, are you interested in technical writing? You’d like to make your own writing more narrative, more engaging, more memorable? Have a look at Breaking the Rules.

And you can find out more about me at stephengruppetta.com



Appendix: Code Blocks

Code Block #1
queue = {"James": [], "Kate": [], "Andy": [], "Isabelle": []}
Code Block #2
queue
# # Display order was arbitrary before Python 3.6
# {'Kate': [], 'James': [], 'Isabelle': [], 'Andy': []}
Code Block #3
queue = {"James": [], "Kate": [], "Andy": [], "Isabelle": []}
queue
# # Starting from Python 3.7, the order is guaranteed
# {'James': [], 'Kate': [], 'Andy': [], 'Isabelle': []}
Code Block #4
queue = {"James": [], "Kate": [], "Andy": [], "Isabelle": []}
another_queue = {"Kate": [], "James": [], "Isabelle": [], "Andy": []}

queue == another_queue
# True
Code Block #5
queue_list = ["James", "Kate", "Andy", "Isabelle"]
another_queue_list = ["Kate", "James", "Isabelle", "Andy"]

queue_list == another_queue_list
# False
Code Block #6
from collections import OrderedDict

queue = OrderedDict({"James": [], "Kate": [], "Andy": [], "Isabelle": []})
another_queue = OrderedDict({"Kate": [], "James": [], "Isabelle": [], "Andy": []})

queue == another_queue
# False


June 11, 2025 09:12 PM UTC