Planet Python

Last update: November 20, 2024 09:43 PM UTC

November 20, 2024


Trey Hunner

Python Black Friday & Cyber Monday sales (2024)

Ready for some Python skill-building sales?

This is my seventh annual compilation of Python learning deals.

I’m publishing this post extra early this year, so bookmark this page and set a calendar event for yourself to check back on Friday November 29.

Currently live sales

Here are Python-related sales that are live right now:

Anticipated sales

Here are sales that will be live soon:

Here are some sales I expect to see, but which haven’t been announced yet:

Even more sales

Also see Adam Johnson’s Django-related Deals for Black Friday 2024 for sales on Adam’s books, courses from the folks at Test Driven, Django templates, and various other Django-related deals.

And for non-Python/Django deals, see the Awesome Black Friday / Cyber Monday deals GitHub repository and the BlackFridayDeals.dev website.

If you know of another sale (or a likely sale) please comment below or email me.

November 20, 2024 07:00 PM UTC


Real Python

NumPy Practical Examples: Useful Techniques

The NumPy library is a Python library used for scientific computing. It provides you with a multidimensional array object for storing and analyzing data in a wide variety of ways. In this tutorial, you’ll see examples of some features NumPy provides that aren’t always highlighted in other tutorials. You’ll also get the chance to practice your new skills with various exercises.

In this tutorial, you’ll learn how to:

  • Create multidimensional arrays from data stored in files
  • Identify and remove duplicate data from a NumPy array
  • Use structured NumPy arrays to reconcile the differences between datasets
  • Analyze and chart specific parts of hierarchical data
  • Create vectorized versions of your own functions

If you’re new to NumPy, it’s a good idea to familiarize yourself with the basics of data science in Python before you start. Also, you’ll be using Matplotlib in this tutorial to create charts. While it’s not essential, getting acquainted with Matplotlib beforehand might be beneficial.

Get Your Code: Click here to download the free sample code that you’ll use to work through NumPy practical examples.

Take the Quiz: Test your knowledge with our interactive “NumPy Practical Examples: Useful Techniques” quiz. You’ll receive a score upon completion to help you track your learning progress:


Interactive Quiz

NumPy Practical Examples: Useful Techniques

This quiz will challenge your knowledge of working with NumPy arrays. You won't find all the answers in the tutorial, so you'll need to do some extra investigating. By finding all the answers, you're sure to learn some interesting things along the way.

Setting Up Your Working Environment

Before you can get started with this tutorial, you’ll need to do some initial setup. In addition to NumPy, you’ll need to install the Matplotlib library, which you’ll use to chart your data. You’ll also be using Python’s pathlib library to access your computer’s file system, but there’s no need to install pathlib because it’s part of Python’s standard library.

You might consider using a virtual environment to make sure your tutorial’s setup doesn’t interfere with anything in your existing Python environment.

Using a Jupyter Notebook within JupyterLab to run your code instead of a Python REPL is another useful option. It allows you to experiment and document your findings, as well as quickly view and edit files. The downloadable version of the code and exercise solutions are presented in Jupyter Notebook format.

The commands for setting things up on the common platforms are shown below:

Fire up a Windows PowerShell (Admin) or Terminal (Admin) prompt, depending on the version of Windows that you’re using. Now type in the following commands:

Windows PowerShell
PS> python -m venv venv\
PS> venv\Scripts\activate
(venv) PS> python -m pip install numpy matplotlib jupyterlab
(venv) PS> jupyter lab

Here you create a virtual environment named venv\, which you then activate. If the activation is successful, then the virtual environment’s name will precede your PowerShell prompt. Next, you install numpy and matplotlib into this virtual environment, followed by the optional jupyterlab. Finally, you start JupyterLab.

Note: When you activate your virtual environment, you may receive an error stating that your system can’t run the script. Modern versions of Windows don’t allow you to run scripts downloaded from the Internet as a security feature.

To fix this, you need to type the command Set-ExecutionPolicy RemoteSigned, then answer Y to the question. Your computer will now run scripts that Microsoft has verified. Once you’ve done this, the venv\Scripts\activate command should work.

On Linux or macOS, fire up a terminal and type in the following commands:

Shell
$ python -m venv venv/
$ source venv/bin/activate
(venv) $ python -m pip install numpy matplotlib jupyterlab
(venv) $ jupyter lab

Here you create a virtual environment named venv/, which you then activate. If the activation is successful, then the virtual environment’s name will precede your command prompt. Next, you install numpy and matplotlib into this virtual environment, followed by the optional jupyterlab. Finally, you start JupyterLab.

You’ll notice that your prompt is preceded by (venv). This means that anything you do from this point forward will stay in this environment and remain separate from other Python work you have elsewhere.
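If you'd like to confirm this from inside Python rather than from the prompt, one possible check is sketched below. It relies only on the standard library's sys module, and the function name in_virtual_environment is a made-up helper for illustration, not something the tutorial provides:

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("Virtual environment active:", in_virtual_environment())
print("Interpreter location:", sys.executable)
```

When run from the activated environment, the interpreter location should sit inside your venv folder.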

Now that you have everything set up, it’s time to begin the main part of your learning journey.

NumPy Example 1: Creating Multidimensional Arrays From Files

When you create a NumPy array, you create a highly optimized data structure. One of the reasons for this is that a NumPy array stores all of its elements in a contiguous area of memory. This memory management technique means that the data is stored in the same memory region, making access times fast. This is, of course, highly desirable, but an issue occurs when you need to expand your array.

Suppose you need to import multiple files into a multidimensional array. You could read them into separate arrays and then combine them using np.concatenate(). However, this would create a copy of your original array before expanding the copy with the additional data. The copying is necessary to ensure the updated array will still exist contiguously in memory since the original array may have had non-related content adjacent to it.

Constantly copying arrays each time you add new data from a file can make processing slow and is wasteful of your system’s memory. The problem becomes worse the more data you add to your array. Although this copying process is built into NumPy, you can minimize its effects with these two steps:

  1. When setting up your initial array, determine how large it needs to be before populating it. You may even consider over-estimating its size to support any future data additions. Once you know these sizes, you can create your array upfront.

  2. The second step is to populate it with the source data. This data will be slotted into your existing array without any need for it to be expanded.
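The two steps above can be sketched as follows. This is only an illustration under stated assumptions: the chunks list is a hypothetical stand-in for data read from three files, each with two rows and three columns, and the variable names are invented for the example. It contrasts growing an array with np.concatenate() against allocating it once and filling it in place:

```python
import numpy as np

# Hypothetical stand-ins for the contents of three files, each 2 rows x 3 columns.
chunks = [np.full((2, 3), fill_value=i, dtype=float) for i in range(3)]

# Approach 1: grow the array with np.concatenate(), which builds a new array
# (copying everything accumulated so far) on every iteration.
grown = np.empty((0, 2, 3))
for chunk in chunks:
    grown = np.concatenate([grown, chunk[np.newaxis]])

# Approach 2: allocate the full array once, then slot each chunk into place.
preallocated = np.zeros((len(chunks), 2, 3))
for index, chunk in enumerate(chunks):
    preallocated[index] = chunk

print(grown.shape, preallocated.shape)
print(np.array_equal(grown, preallocated))
```

Both approaches end up with the same data. The difference is that the second one never has to reallocate and copy the array as it fills up.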

Next, you’ll explore how to populate a three-dimensional NumPy array.

Populating Arrays With File Data

In this first example, you’ll use the data from three files to populate a three-dimensional array. The content of each file is shown below, and you’ll also find these files in the downloadable materials:

The first file has two rows and three columns with the following content:

CSV file1.csv
1.1, 1.2, 1.3
1.4, 1.5, 1.6

Read the full article at https://realpython.com/numpy-example/ »


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

November 20, 2024 02:00 PM UTC


Julien Tayon

The advantages of HTML as a data model over basic declarative ORM approach

Very often, backend devs don't want to write code.

For this, we use one trick: derive the HTML widgets for presentation, the database access, and the REST endpoints from ONE SOURCE of truth, and we call it the MODEL.

A tradition, and I insist it's a conservative tradition, is to use a declarative model where we make Python classes the truth of the model.

By declaring a class we implicitly declare its SQL structure, the HTML input form for human-readable interaction, and the REST endpoint to access a graph of objects which are all mapped onto the database.

Since the arrival of pydantic, this makes all the more sense when it comes to empowering a strongly typed approach in Python.

But is it the only worthy one?

I speak here as a veteran of the trenches whose job is to read a list of customer entries in an XLS file from a project manager and change the faulty values, based on reverse-engineering an HTML form, into whatever the freak the right values are supposed to be.

In this case, your job is in fact to short-circuit the web framework, to which you don't have access, and change values directly in the database.

More often than not in these real-life cases, you don't have access to the team that built the framework (too much bureaucracy to even get a question answered before the situation gets critical)... So you look at the form.

And you guess the name of the impacted table by looking at the « network tab » in the developer tools when you hit the submit button.

And you look at the names of the impacted fields in the form to guess the names of the columns in the table.

And then you use your only magical tool, which is write access to the database, to reflect the expected object with an automapper and change values.

You could do it in raw SQL, I agree, but sometimes you need to make a web query in the middle to change the value, because you have to ask a REST service for the new ID of the client.

And you see, the more I have this experience of tweaking real-life frameworks that often surprise users because of the limitations of their source of truth, the more I want the HTML to be the source of truth.

The most stoic approach to a full-stack framework: derive everything from an HTML page.

The views, the controllers, the routes, and the model, in such a faithful way that if you modify the HTML, you modify the database model, the routes, and the displayed form in real time.



What are the advantages of HTML as a declarative language?



Here, one tradition is to prefer human-readable languages such as YAML and JSON, or machine-readable ones such as XML, over HTML.

However, JSON and YAML are more limited than HTML in the data structures they can express (can you have a dict as a key in a dict in JSON? Me, I can).

And on the other hand, XML is quite a pain to read and write without mistakes.

HTML is just XML



HTML is a lax and lenient grammarless XML. No parser will raise an exception because you wrote "<br>" instead of "<br/>" (or the opposite). You can add nonexistent attributes to tags, and the parser will handle them without you having to redefine a full-fledged grammar.
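As a small illustration of that leniency, here is a sketch using Python's standard html.parser module. The TagCollector class and the nullable attribute on the sample input are made up for the example. The point is only that both spellings of the void tag and the unknown attribute go through without any exception:

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Record every start tag with its attributes, known to HTML or not."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append((tag, dict(attrs)))


collector = TagCollector()
# A void tag written both ways, plus an attribute that HTML doesn't define.
collector.feed("<br><br/><input name=age nullable=false>")
collector.close()
print(collector.seen)
```

Both "<br>" and "<br/>" are reported as a br start tag, and the invented nullable attribute arrives in the attribute dictionary like any standard one.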

HTML is an XML YOU CAN SEE.



Some tags belong to a grammar of visual widgets that non-computer people are familiar with.

If you use a FORM as a mapping to a database table, and every input inside it has a column name, you already have the inputs drawn on your screen.



Modern « remote procedure calls » are web based



Call it RPC, call it SOAP, call it REST: nowadays, web technologies account for 99% of how computer systems exchange data with each other.

When you buy something on the internet, in the end you interact with a web form or a web call. Hence, we can assert with strong conviction that 100% of web technologies can serve web pages. Thus, if you use your HTML as a model and present it, you can deduce the data model from the form without needing a new pivot language.

Proof of concept



For the sake of « fun », we are gonna imagine a backend for « agile by microblogging » (à la the former Twitter).

We are gonna assume the platform structures microblogging around where agile shines the most: not when things are done, but when things need to move on.

Things that are done will be called statements. For example: « software is delivered. Here is a factoid (a git URL, for instance) ». We will treat these as nodes in a graph, and they are supposed to be immutable states that can't be contested.

Each statement answers another statement's factoid, like a delivery statement tends to follow a story point (or at least should be led to by means of a transition).

Hence, in this application we will microblog about the transitions... like on a social network, with the members of the concerned group.
The idea of the application is to replace scrum meetings with microblogging.

Are you blocked? Do you need anything? These can be answered on the microblogging platform, and every thread is presented, archived, and used for machine learning (about what you want to hear as good news) in a data form that is convenient for large language models.

As such, we want to harvest text long enough to express emotions, but restricted to a laughably small number of characters so that finesse and ambiguity are hard to express. That's the heart of the application: harvesting comments tagged with associated emotions to ease the work of tagging for artificial intelligence.

Hear me out: this is just a stupid idea of mine to illustrate a graph-like structure described with HTML, not a real-life idea. Me, I just love to represent state machine diagrams with everything that falls into my hands.

Here is the entity relationship diagram I have in mind:


Let's see what a table declaration might look like in HTML, say for transition:


<form action=/transition  >
	<input type=number name=id />
	<input type=number name=user_group_id nullable=false reference=user_group.id />
	<textarea name=message rows=10 cols=50 nullable=false ></textarea>
	<input type=url name=factoid />
	<select name="emotion_for_group_triggered" value=neutral >
		<option value="">please select a value</option>
		<option value=positive >Positive</option>
		<option value=neutral >Neutral</option>
		<option value=negative >Negative</option>
	</select>
	<input type=number name=expected_fun_for_group />
	<input type=number name=previous_statement_id reference=statement.id nullable=false />
	<input type=number name=next_statement_id reference=statement.id />
	<unique_constraint col=next_statement_id,previous_statement_id name=unique_transition ></unique_constraint>
	<input type=checkbox name=is_exception />
</form>


Through the use of additional HTML tags and attributes, we can convey a lot of information usable for database construction/querying that is gonna be silent in the presentation (like unique_constraint). And with a little bit of JavaScript and CSS, this HTML generates the following rendering (indicating the web service endpoints as input type=submit):


This means you can now serve a landing page that serves the purpose of human interaction, describes a « curl way » of automating interaction, and gives a full model of your database.
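To make the « full model » part concrete, here is a minimal standard-library sketch of reading a table description back out of such a form. It is not the code of the application: the FormModel class is invented for the illustration, the form is a trimmed-down copy of the transition form, and it assumes that the form's action path is the table name and each input's name is a column name:

```python
from html.parser import HTMLParser

# A trimmed-down copy of the transition form; only the attributes matter here.
FORM = """
<form action=/transition>
    <input type=number name=id />
    <input type=number name=user_group_id nullable=false reference=user_group.id />
    <input type=url name=factoid />
    <input type=checkbox name=is_exception />
</form>
"""


class FormModel(HTMLParser):
    """Collect, per form, the table name and a description of each input."""

    def __init__(self):
        super().__init__()
        self.tables = {}
        self.current = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            # Assumption: the form's action path doubles as the table name.
            self.current = attrs["action"].lstrip("/")
            self.tables[self.current] = {}
        elif tag == "input" and self.current is not None:
            # Everything except the name describes the column.
            name = attrs.pop("name", None)
            if name is not None:
                self.tables[self.current][name] = attrs

    def handle_endtag(self, tag):
        if tag == "form":
            self.current = None


model = FormModel()
model.feed(FORM)
model.close()
for column, description in model.tables["transition"].items():
    print(column, description)
```

Each column comes back with its input type plus the extra attributes (nullable, reference), which is the raw material a database layer would need to build the table.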

Most startups think the data model should be obfuscated to prevent it from being copied; most free software projects think that sharing the non-valuable assets helps adoption of the technology.

And thanks to this, I can now create my own test suite that uses the HTML form to work on a doppelgänger of the real database, by parsing the HTML served by the application service (pdca.py), and launch a perfectly functioning service out of it:
import os
from html.parser import HTMLParser
from urllib.parse import parse_qsl, urlparse

import requests
from dateutil import parser
from passlib.hash import scrypt as crypto_hash # we can change the hash easily
from requests import get, post

# heavyweight
from sqlalchemy import *
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import Session
DB=os.environ.get('DB','test.db')
DB_DRIVER=os.environ.get('DB_DRIVER','sqlite')
DSN=f"{DB_DRIVER}://{DB_DRIVER == 'sqlite' and not DB.startswith('/') and '/' or ''}{DB}"
ENDPOINT="http://127.0.0.1:5000"
os.chdir("..")
os.system(f"rm {DB}")
os.system(f"DB={DB} DB_DRIVER={DB_DRIVER} python pdca.py & sleep 2")
url = lambda table : ENDPOINT + "/" + table
os.system(f"curl {url('group')}?_action=search")

def form_to_db(attrs):
    """Convert submitted form values into values suitable for the database."""
    def convert(k, v):
        # handling of input having date/time in the name
        if ("date" in k or "time" in k) and isinstance(v, str):
            return parser.parse(v)
        # handling of boolean mapping which input begins with "is_"
        if k.startswith("is_"):
            return v is True or v == "on"
        # password ?
        if "password" in k:
            return crypto_hash.hash(v)
        return v
    # empty values and private keys (leading underscore) are dropped
    return { k: convert(k, v) for k, v in attrs.items() if v and not k.startswith("_") }

post(url("user"), params = dict(id=1,  secret_password="toto", name="jul2", email="j@j.com", _action="create"), files=dict(pic_file=open("./assets/diag.png", "rb").read())).status_code
#os.system(f"curl {ENDPOINT}/user?_action=search")
#os.system(f"sqlite3 {DB} .dump")

engine = create_engine(DSN)
metadata = MetaData()


transtype_true = lambda p : (p[0],[False,True][p[1]=="true"])
def dispatch(p):
    return dict(
        nullable=transtype_true,
        unique=transtype_true,
        default=lambda p:("server_default",eval(p[1])),
    ).get(p[0], lambda *a:None)(p)

transtype_input = lambda attrs : dict(filter(lambda x :x, map(dispatch, attrs.items())))

class HTMLtoData(HTMLParser):
    def __init__(self):
        global engine, tables, metadata
        self.cols = []
        self.table = ""
        self.tables= []
        self.enum =[]
        self.engine= engine
        self.meta = metadata
        super().__init__()

    def handle_starttag(self, tag, attrs):
        global tables
        attrs = dict(attrs)
        simple_mapping = {
            "email" : UnicodeText, "url" : UnicodeText, "phone" : UnicodeText,
            "text" : UnicodeText, "checkbox" : Boolean, "date" : Date, "time" : Time,
            "datetime-local" : DateTime, "file" : Text, "password" : Text, "uuid" : Text, #UUID is postgres specific
        }

        if tag in {"select", "textarea"}:
            self.enum=[]
            self.current_col = attrs["name"]
            self.attrs= attrs
        if tag == "option":
            self.enum.append( attrs["value"] )
        if tag == "unique_constraint":
            self.cols.append( UniqueConstraint(*attrs["col"].split(','), name=attrs["name"]) )
        if tag in { "input" }:
            if attrs.get("name") == "id":
                self.cols.append( Column('id', Integer,  **( dict(primary_key = True) | transtype_input(attrs ))))
                return
            # inputs named *_id that declare a reference become foreign keys
            if (attrs.get("name") or "").endswith("_id") and "reference" in attrs:
                self.cols.append( Column(attrs["name"], Integer, ForeignKey(attrs["reference"])) )
                return

            if attrs.get("type") in simple_mapping:
                self.cols.append( 
                    Column(
                        attrs["name"], simple_mapping[attrs["type"]],
                        **transtype_input(attrs)
                    )
                )
            # inputs without a type (or with an unmapped one) are skipped
            if attrs.get("type") == "number":
                if attrs.get("step","") == "any":
                    self.cols.append( Column(attrs["name"], Float) )
                else:
                    self.cols.append( Column(attrs["name"], Integer) )
        if tag== "form":
            self.table = urlparse(attrs["action"]).path[1:]

    def handle_endtag(self, tag):
        global tables
        if tag == "select":
            # self.cols.append( Column(self.current_col,Enum(*[(k,k) for k in self.enum]), **transtype_input(self.attrs)) )

            self.cols.append( Column(self.current_col, Text, **transtype_input(self.attrs)) )
            
        if tag == "textarea":
            self.cols.append(
                Column(
                    self.current_col,
                    String(int(self.attrs["cols"])*int(self.attrs["rows"])),
                    **transtype_input(self.attrs)) 
           )
        if tag=="form":
            self.tables.append( Table(self.table, self.meta, *self.cols), )
            #tables[self.table] = self.tables[-1]

            self.cols = []
            with engine.connect() as cnx:
                self.meta.create_all(engine)
                cnx.commit()

HTMLtoData().feed(get("http://127.0.0.1:5000/").text)
os.system("pkill -f pdca.py")



#metadata.reflect(bind=engine)
Base = automap_base(metadata=metadata)

Base.prepare()

with Session(engine) as session:
    for table,values in tuple([
        ("user", form_to_db(dict( name="him", email="j2@j.com", secret_password="toto"))),
        ("group", dict(id=1, name="trolol") ),
        ("group", dict(id=2, name="serious") ),
        ("user_group", dict(id=1,user_id=1, group_id=1, secret_token="secret")),
        ("user_group", dict(id=2,user_id=1, group_id=2, secret_token="")),
        ("user_group", dict(id=3,user_id=2, group_id=1, secret_token="")),
        ("statement", dict(id=1,user_group_id=1, message="usable agile workflow", category="story" )),
        ("statement", dict(id=2,user_group_id=1, message="How do we code?", category="story_item" )),
        ("statement", dict(id=3,user_group_id=1, message="which database?", category="question")),
        ("statement", dict(id=4,user_group_id=1, message="which web framework?", category="question")),
        ("statement", dict(id=5,user_group_id=1, message="preferably less", category="answer")),
        ("statement", dict(id=6,user_group_id=1, message="How do we test?", category="story_item" )),
        ("statement", dict(id=7,user_group_id=1, message="QA framework here", category="delivery" )),
        ("statement", dict(id=8,user_group_id=1, message="test plan", category="test" )),
        ("statement", dict(id=9,user_group_id=1, message="OK", category="finish" )),
        ("statement", dict(id=10, user_group_id=1, message="PoC delivered",category="delivery")),

        ("transition", dict( user_group_id=1, previous_statement_id=1, next_statement_id=2, message="something bugs me",is_exception=True, )),
        ("transition", dict( 
            user_group_id=1, 
            previous_statement_id=2, 
            next_statement_id=4, 
            message="standup meeting feedback",is_exception=True, )),
        ("transition", dict( 
            user_group_id=1, 
            previous_statement_id=2, 
            next_statement_id=3, 
            message="standup meeting feedback",is_exception=True, )),
        ("transition", dict( user_group_id=1, previous_statement_id=2, next_statement_id=6, message="change accepted",is_exception=True, )),
        ("transition", dict( user_group_id=1, previous_statement_id=4, next_statement_id=5, message="arbitration",is_exception=True, )),
        ("transition", dict( user_group_id=1, previous_statement_id=3, next_statement_id=5, message="arbitration",is_exception=True, )),
        ("transition", dict( user_group_id=1, previous_statement_id=6, next_statement_id=7, message="R&D", )),
        ("transition", dict( user_group_id=1, previous_statement_id=7, next_statement_id=8, message="Q&A", )),
        ("transition", dict( user_group_id=1, previous_statement_id=8, next_statement_id=9, message="CI action", )),
        ("transition", dict( user_group_id=1, previous_statement_id=2, next_statement_id=10, message="situation unblocked", )),
        ("transition", dict( user_group_id=1, previous_statement_id=9, next_statement_id=10, message="situation unblocked", )),
        ]):
        session.add(getattr(Base.classes,table)(**values))
        session.commit()
os.system("python ./generate_state_diagram.py sqlite:///test.db > out.dot ;dot -Tpng out.dot > diag2.png; xdot out.dot")
s = requests.session()

os.system(f"DB={DB} DB_DRIVER={DB_DRIVER} python pdca.py & sleep 1")


print(s.post(url("group"), params=dict(_action="delete", id=3,name=1)).status_code)
print(s.post(url("grant"), params = dict(secret_password="toto", email="j@j.com",group_id=1, )).status_code)
print(s.post(url("grant"), params = dict(_redirect="/group",secret_password="toto", email="j@j.com",group_id=2, )).status_code)
print(s.cookies["Token"])
print(s.post(url("user_group"), params=dict(_action="search", user_id=1)).text)
print(s.post(url("group"), params=dict(_action="create", id=3,name=2)).text)
print(s.post(url("group"), params=dict(_action="delete", id=3)).status_code)
print(s.post(url("group"), params=dict(_action="search", )).text)
os.system("pkill -f pdca.py")
Which gives me a nice set of data to play with while I experiment with how to handle the business logic, where the core of the value is.

November 20, 2024 04:04 AM UTC


Seth Michael Larson

SEGA Genesis & Mega Drive games and ROMs from Steam


Published 2024-11-20 by Seth Larson

TL;DR: SEGA is discontinuing the "SEGA Mega Drive and Genesis Classics" on December 6th. This is an affordable way to purchase these games and ROMs compared to the original cartridges. Buy games you are interested in while you still can.

In particular, Dr. Robotnik's Mean Bean Machine is one of my favorite games. I created copy-cat games when I was first learning how to program computers. I already own this game twice over as a Genesis cartridge and in the Sonic Mega Collection for the GameCube, but neither of those formats makes it easy to get at the ROM itself so it can be played elsewhere.


So I heard you like beans.

That's where the SEGA Mega Drive and Genesis Classics comes in. This launcher provides uncompressed ROMs that are easily accessible after purchasing the game. For the below instructions, I am using Ubuntu 24.04 as my operating system. Here's what I did:

  • Download the Steam launcher for Linux.
  • Purchase Dr. Robotnik's Mean Bean Machine on Steam for $4.99 USD.
  • Download the "SEGA Mega Drive and Genesis Classics" launcher and the Dr. Robotnik's Mean Bean Machine "DLC". You don't have to launch the game through Steam.
  • Navigate to ~/.steam/steam/steamapps/common/Sega\ Classics/uncompressed\ ROMs.
  • ROM files can be found in this directory. Their file extension will be either .SGD or .68K. These can be changed to .bin to be recognized by emulators for Linux like Kega Fusion.
# How to mass-rename ROM extensions if you purchase multiple like I did:
$ for f in *.68K; do mv -- "$f" "${f%.68K}.bin"; done 
$ for f in *.SGD; do mv -- "$f" "${f%.SGD}.bin"; done
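
If you'd rather do the renaming from Python, the same idea can be sketched with pathlib. The rename_roms function is a hypothetical helper written for this illustration. It assumes no two ROMs share the same base name, since on Linux a rename silently replaces an existing file:

```python
from pathlib import Path


def rename_roms(directory, extensions=(".68K", ".SGD")):
    """Rename ROM files with the given extensions to .bin; return the new paths."""
    renamed = []
    for path in sorted(Path(directory).iterdir()):
        # Only touch regular files whose extension is one of the ROM extensions.
        if path.is_file() and path.suffix in extensions:
            renamed.append(path.rename(path.with_suffix(".bin")))
    return renamed
```

Calling rename_roms() with the path to the "uncompressed ROMs" directory mentioned above would then do the same job as the two shell loops.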

From here, you should be able to load these ROMs into any emulator. Happy gaming!

Have thoughts or questions? Let's chat over email or social:

sethmichaellarson@gmail.com
@sethmlarson@fosstodon.org

Want more articles like this one? Get notified of new posts by subscribing to the RSS feed or the email newsletter. I won't share your email or send spam, only whatever this is!

Want more content now? This blog's archive has ready-to-read articles. I also curate a list of cool URLs I find on the internet.

Find a typo? This blog is open source, pull requests are appreciated.

Thanks for reading! ♡ This work is licensed under CC BY-SA 4.0

November 20, 2024 12:00 AM UTC

November 19, 2024


PyCoder’s Weekly

Issue #656 (Nov. 19, 2024)

#656 – NOVEMBER 19, 2024

How to Debug Your Textual Application

TUI applications require a full terminal which most IDEs don’t implement. To make matters more complicated, TUIs use the same calls that many command line debuggers use, making it hard to deal with breakpoints. This article teaches you how to debug a Textual TUI program.
MIKE DRISCOLL

Dictionary Comprehensions: How and When to Use Them

In this tutorial, you’ll learn how to write dictionary comprehensions in Python. You’ll also explore the most common use cases for dictionary comprehensions and learn about some bad practices that you should avoid when using them in your code.
REAL PYTHON

What We Learned From Analyzing 20.2 Million CI Jobs

The Trunk Flaky Test public beta is open! You can now detect, quarantine, and eliminate flaky tests from your codebase. Discover insights from our analysis of 20.2 million CI jobs and see how Trunk can unblock pipelines and stop reruns. Access is free. Check out our getting started guide here →
TRUNK sponsor

Python Puzzles

A collection of Python puzzles. You are given a test file, and should write an implementation that passes the tests. All done in your browser.
GPTENGINEER.RUN

Announcing DjangoCon Europe 2025 in Dublin, Ireland

DJANGO SOFTWARE FOUNDATION

Flask 3.1 Released

PALLETSPROJECTS.COM

Quiz: Basic Input and Output in Python

REAL PYTHON

PEP 761: Deprecating PGP Signatures for CPython Artifacts (Approved)

PYTHON.ORG

Quiz: Using .__repr__() vs .__str__() in Python

REAL PYTHON

Discussions

Andrej Karpathy on Learning

Entertainment-based content may appear educational, but it is not effective for learning. To truly learn, one should seek out long-form, challenging content that requires effort and engagement. Educators should prioritize creating meaningful, in-depth content that fosters deep learning.
X.COM

Ideas: Turn shutil Into a Runnable Module

PYTHON.ORG

Articles & Tutorials

Maintaining the Foundations of Python & Cautionary Tales

How do you build a sustainable open-source project and community? What lessons can be learned from Python’s history and the current mess that the WordPress community is going through? This week on the show, we speak with Paul Everitt from JetBrains about navigating open-source funding and the start of the Python Software Foundation.
REAL PYTHON podcast

The Practical Guide to Scaling Django

Most Django scaling guides focus on theoretical maximums. But real scaling isn’t about handling hypothetical millions of users - it’s about systematically eliminating bottlenecks as you grow. Here’s how to do it right, based on patterns that work in production.
ANDREW

Build Your Own AI Assistant with Edge AI

Simplify workloads and elevate customer service. Build customized AI assistants that respond to voice prompts with powerful language and comprehension capabilities. Personalized AI assistance based on your unique needs with Intel’s OpenVINO toolkit.
INTEL CORPORATION sponsor

The Polars vs pandas Difference Nobody Is Talking About

When people compare pandas and Polars, they usually bring up topics such as lazy execution, Rust, null values, multithreading, and query optimisation. Yet there’s one innovation which people often overlook: non-elementary group-by aggregations.
MARCO GORELLI • Shared by Marco Gorelli

PyPI Introduces Digital Attestations to Strengthen Security

PyPI now supports digital attestations. This feature lets Python package maintainers verify the authenticity and integrity of their uploads with cryptographically verifiable attestations, adding an extra layer of security and trust.
SARAH GOODING • Shared by Sarah Gooding

Django’s Technical Governance Challenges, and Opportunities

On October 29th, two DSF steering council members resigned, triggering an election earlier than planned. This note explains what that means and how you can get involved.
DJANGO SOFTWARE FOUNDATION

We’ve Moved to Hetzner

This post from Michael Kennedy talks about moving Talk Python’s hosting environment from Digital Ocean to Hetzner. It details everything involved in a move like this.
TALK PYTHON

Formatting Floats Inside Python F-Strings

In this video course, you’ll learn how to use Python format specifiers within an f-string to allow you to neatly format a float to your required precision.
REAL PYTHON course

Package Compatibility With Free-Threading and Subinterpreters

This tracker tests the compatibility of the 500 most popular packages with Python 3.13’s free-threading and subinterpreter features.
PYTHON.TIPS • Shared by Vita Midori

Projects & Code

chonkie: CHONK Your Texts With Chonkie

GITHUB.COM/BHAVNICKSM

seqlogic: Sequential Logic Simulator

GITHUB.COM/CJDRAKE

venvstacks: Virtual Environment Stacks for Python

GITHUB.COM/LMSTUDIO-AI

terminal-tree: Experimental Filesystem Navigator in Textual

GITHUB.COM/WILLMCGUGAN

chdb: An in-Process OLAP SQL Engine

GITHUB.COM/CHDB-IO

Events

Weekly Real Python Office Hours Q&A (Virtual)

November 20, 2024
REALPYTHON.COM

PyData Bristol Meetup

November 21, 2024
MEETUP.COM

PyLadies Dublin

November 21, 2024
PYLADIES.COM

PyConAU 2024

November 22 to November 27, 2024
PYCON.ORG.AU

Plone Conference 2024

November 25 to December 1, 2024
PLONECONF.ORG

Code, Configure and Deploy a Market Making Bot

November 25, 2024
MEETUP.COM

PyCon Wroclaw 2024

November 30 to December 1, 2024
PYCONWROCLAW.COM


Happy Pythoning!
This was PyCoder’s Weekly Issue #656.

[ Subscribe to 🐍 PyCoder’s Weekly 💌 – Get the best Python news, articles, and tutorials delivered to your inbox once a week >> Click here to learn more ]

November 19, 2024 07:30 PM UTC


Python Insider

Python 3.14.0 alpha 2 released

Alpha 2? But Alpha 1 only just came out!

https://www.python.org/downloads/release/python-3140a2/

This is an early developer preview of Python 3.14

Major new features of the 3.14 series, compared to 3.13

Python 3.14 is still in development. This release, 3.14.0a2, is the second of seven planned alpha releases.

Alpha releases are intended to make it easier to test the current state of new features and bug fixes and to test the release process.

During the alpha phase, features may be added up until the start of the beta phase (2025-05-06) and, if necessary, may be modified or deleted up until the release candidate phase (2025-07-22). Please keep in mind that this is a preview release and its use is not recommended for production environments.

Many new features for Python 3.14 are still being planned and written. Among the major new features and changes so far:

The next pre-release of Python 3.14 will be 3.14.0a3, currently scheduled for 2024-12-17.

More resources

Enjoy the new release

Thanks to all of the many volunteers who help make Python Development and these releases possible! Please consider supporting our efforts by volunteering yourself or through organisation contributions to the Python Software Foundation.

Regards from a chilly Helsinki with snow on the way,

Your release team,
Hugo van Kemenade
Ned Deily
Steve Dower
Łukasz Langa

November 19, 2024 04:03 PM UTC


PyCharm

Code Faster with JetBrains AI in PyCharm

PyCharm 2024.3 comes with many improvements to JetBrains AI to help you code faster. I’m going to walk you through some of these updates in this blog post. 

Natural language inline AI prompt

You can now use JetBrains AI by typing straight into your editor in natural language without opening the AI Assistant tool window. If you use either IntelliJ IDEA or PyCharm, you might already be familiar with natural language AI prompts, but let me walk you through the process. 

If your caret is in the gutter, you can type your request straight into the editor and then press Tab. Here’s an example of one such request:

write a script to capture a date input from a user and print it out prefixed by a message stating that their birthday is on that date.

You can then iterate on the initial input by clicking on the purple block in the gutter or by pressing ⌘\ or Ctrl+\ and pressing Enter:

add error handling so that when a birthday is in the future, we dont accept it

You can use  ⌘\ or Ctrl+\ to keep iterating until you’re happy with the result. For example, we can use the prompt:

print out the day of the week as well as their birthday date

And then: 

change the format of day_of_week to short

This feature is available for Python, JavaScript, TypeScript, JSON, and YAML files.

Let’s look at some more examples. We can get JetBrains AI Assistant to help us generate new code with a prompt like this:

Write code that lists the latest polls, shows poll details, handles voting, updates votes, and displays poll results, ensuring only published polls are accessible.

Or add some error handling to our code:

Add edge case handling to this code

Remember, context is everything. Where you start your natural language prompt is important, as PyCharm uses the placement of your caret to figure out the context. You don’t need to prefix your query with a ? or $ if you start typing in the gutter because the context is the file, but if your caret is indented, you’ll need to start your query with the ? or $ character so PyCharm knows you’re crafting a natural language query.

In this example, we want to refactor existing code, so we need to prefix our query with the ? character:

?create a dedicated function for printing the schedule and remove the code from here

Try JetBrains AI for free

Running code in the Python console

We know that JetBrains AI can generate code for you, but now you can run that code in the Python console without leaving the AI Assistant tool window by clicking the green run arrow.

For example, let’s say you have the following prompt:

Create a python script that asks for a birthday date in standard format yyyy-MM-dd then converts it and prints it back out in a written format such as 22nd January 1991

You can now click the green run arrow on the top-right of the code snippet to run it in your Python console:

Even more features

In addition to the new functionality for natural language and code completion for PyCharm highlighted above, there are several other improvements to JetBrains AI. 

Faster code completion

We have introduced a new model for faster cloud-based completion with AI Assistant which is showing very promising results.

Faster documentation

If documentation isn’t your thing, you can now hand off writing your Python docstrings to JetBrains AI. If you type either single or double quotes to enter a docstring and then press Return, you’ll see a prompt that says Generate with AI Assistant. Click that prompt and let JetBrains AI generate the documentation for you:

Help at your fingertips

We all need a little help now and again, and we can get JetBrains AI to help us here too. We’ve added a /docs prompt to the JetBrains AI tool window. This prompt will query the PyCharm documentation to save you from switching out of the context you’re working in!

Ability to choose your LLM

For AI Chat, you can now select a different LLM from the drop-down menu in the chat window itself. There are lots of options for you to choose from:

More context in Jupyter notebooks

We’ve also improved how JetBrains AI works for data scientists. JetBrains AI now recognizes DataFrames and variables in your notebook. You can prefix your DataFrame or variable with # so that JetBrains AI considers it as part of the context. 

Summary

JetBrains AI is available inside PyCharm, right where you need it. This release brings many improvements, from writing in natural language inside the editor and running AI-generated Python snippets in the console to generating documentation. 

Remember, if you’re in the gutter, you can start typing in natural language and then press Tab to get AI Assistant to generate the code. If you’re inside a method or function, you need to prefix your natural language query with either ? or $. You can then iterate on the generated code as many times as you like as you build out your new functionality and explore further.

Try JetBrains AI for free

November 19, 2024 03:25 PM UTC


Real Python

Working With TOML and Python

TOML—Tom’s Obvious Minimal Language—is a reasonably new configuration file format that the Python community has embraced over the last couple of years. TOML plays an essential part in the Python ecosystem. Many of your favorite tools rely on TOML for configuration, and you’ll use pyproject.toml when you build and distribute your own packages.

In this video course, you’ll learn more about TOML and how you can use it. In particular, you’ll:


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

November 19, 2024 02:00 PM UTC


Mike Driscoll

How to Debug Your Textual Application

Textual is a great Python package for creating lightweight, powerful, text-based user interfaces. That means you can create a GUI in your terminal with Python without learning curses! But what happens when you encounter a problem that requires debugging your application? A TUI takes over your terminal, which means you cannot see any output from Python’s print() function.

Wait! What about your IDE? Can that help? Actually, no. When you run a TUI, you need a fully functional terminal to interact with it. PyCharm doesn’t work well with Textual. WingIDE doesn’t even have a terminal emulator. Visual Studio Code also doesn’t work out of the box, although you may be able to make it work with a custom JSON or YAML file. But what do you do if you can’t figure that out?

That is the crux of the problem and what you will learn about in this tutorial: How to debug Textual applications!

Getting Started

To get the most out of this tutorial, make sure you have installed Textual’s development tools by using the following command:

python -m pip install textual-dev --upgrade

Once you have the latest version of textual-dev installed, you may continue!

Debugging with Developer Mode

When you want to debug a Textual application, you need to open two terminal windows. On Microsoft Windows, you can open two PowerShell or two Command Prompt windows. In the first terminal, run this command:

textual console

The Textual console will listen for any Textual application running in developer mode. But first, you need some kind of application to test with. Open up your favorite Python IDE and create a new file called hello_textual.py. Then enter the following code into it:

from textual.app import App, ComposeResult
from textual.widgets import Button


class WelcomeButton(App):

    def compose(self) -> ComposeResult:
        yield Button("Exit")

    def on_button_pressed(self) -> None:
        self.mount(Button("Other"))


if __name__ == "__main__":
    app = WelcomeButton()
    app.run()

To run a Textual application, use the other terminal you opened earlier, the one that isn’t running the Textual console. Then run this command:

textual run --dev hello_textual.py

You will see the following in your terminal:

Simple Textual app

If you switch over to the other terminal, you will see a lot of output that looks something like this:

Textual Console output

Now, if you want to test that you are reaching a part of your code in Textual, you can add a print() call to your on_button_pressed() method. You can also use self.log.info(), which you can read about in the Textual documentation.

Let’s update your code to include some logging:

from textual.app import App, ComposeResult
from textual.widgets import Button


class WelcomeButton(App):

    def compose(self) -> ComposeResult:
        yield Button("Exit")
        print("The compose() method was called!")

    def on_button_pressed(self) -> None:
        self.log.info("You pressed a button")
        self.mount(Button("Other"))


if __name__ == "__main__":
    app = WelcomeButton()
    app.run()

Now, when you run this code, you can check your Textual Console for output. The print() statement should be in the Console without you doing anything other than running the code. You must click the button to get the log statement in the Console.

Here is what the log output will look like in the Console:

Logging to the Textual Console

And here is an example of what you get when you print() to the Console:

Printing output to Textual Console

There’s not much difference here, eh? Either way, you get the information you need and if you need to print out Python objects, this can be a handy debugging tool.

If you find the output in the Console to be too verbose, you can use -x or --exclude to exclude log groups. Here’s an example:

textual console -x SYSTEM -x EVENT -x DEBUG -x INFO

In this version of the Textual Console, you are suppressing SYSTEM, EVENT, DEBUG, and INFO messages.

Launch your code from earlier and you will see that the output in your Console is greatly reduced:

Textual Console with output suppressed

Now, let’s learn how to use notification as a debugging tool.

Debugging with Notification

If you like using print() statements, then you will love that Textual’s App class provides a notify() method. You can call it anywhere in your application using self.app.notify(), along with a message. If you are in your App class, you can shorten the call to simply self.notify().

Let’s take the example from earlier and update it to use the notify method instead:

from textual.app import App, ComposeResult
from textual.widgets import Button


class WelcomeButton(App):

    def compose(self) -> ComposeResult:
        yield Button("Exit")

    def on_button_pressed(self) -> None:
        self.mount(Button("Other"))
        self.notify("You pressed the button!")


if __name__ == "__main__":
    app = WelcomeButton()
    app.run()

The notify() method takes the following parameters:

Try editing the notification to use more of these features. For example, you could update the code above to use this instead:

self.notify("You pressed the button!", title="Info Message", severity="error")

Textual’s App class also provides a bell() method you can call to play the system bell. You could add this to really get the user’s attention, assuming they have the system bell enabled on their computer.

Wrapping Up

Debugging your TUI application successfully is a skill. You need to know how to find errors, and Textual’s dev mode makes this easier. While it would be great if a Python IDE had a fully functional terminal built into it, that is a very niche need. So it’s great that Textual included the tooling you need to figure out your code.

Give these tips a try, and you’ll soon be able to debug your Textual applications easily!

The post How to Debug Your Textual Application appeared first on Mouse Vs Python.

November 19, 2024 01:09 PM UTC


Ned Batchelder

Loop targets

I posted a Python tidbit about how for loops can assign to other things than simple variables, and many people were surprised or even concerned:

params = {
    "query": QUERY,
    "page_size": 100,
}

# Get page=0, page=1, page=2, ...
for params["page"] in itertools.count():
    data = requests.get(SEARCH_URL, params).json()
    if not data["results"]:
        break
    ...

This code makes successive GET requests to a URL, with a params dict as the data payload. Each request uses the same data, except the “page” item is 0, then 1, 2, and so on. It has the same effect as if we had written it:

for page_num in itertools.count():
    params["page"] = page_num
    data = requests.get(SEARCH_URL, params).json()

One reply asked if there was a new params dict in each iteration. No, loops in Python do not create a scope, and never make new variables. The loop target is assigned to exactly as if it were an assignment statement.
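To check the claim that no new dict is created, here is a small self-contained sketch. It drops the network request from the example above and stops after three pages; the query value is a placeholder chosen for illustration:

```python
import itertools

params = {
    "query": "python",  # placeholder value for illustration
    "page_size": 100,
}

seen_ids = []

# Each iteration assigns the next count to params["page"] on the same dict.
for params["page"] in itertools.count():
    seen_ids.append(id(params))
    if params["page"] == 2:
        break

# Only one dict object was ever involved, and it keeps the last page number.
same_dict = len(set(seen_ids)) == 1
print(same_dict, params)
```

After the loop ends, params still holds the last value assigned to "page", just as an ordinary loop variable keeps its final value.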

As a Python Discord helper once described it,

While loops are “if” on repeat. For loops are assignment on repeat.

A loop like for <ANYTHING> in <ITER>: will take successive values from <ITER> and do an assignment exactly as this statement would: <ANYTHING> = <VAL>. If the assignment statement is ok, then the for loop is ok.

We’re used to seeing for loops that do more than a simple assignment:

for i, thing in enumerate(things):
    ...

for x, y, z in zip(xs, ys, zs):
    ...

These work because Python can assign to a number of variables at once:

i, thing = 0, "hello"
x, y, z = 1, 2, 3

Assigning to a dict key (or an attribute, or a property setter, and so on) in a for loop is an example of Python having a few independent mechanisms that combine in uniform ways. We aren’t used to seeing exotic combinations, but you can reason through how they would behave, and you would be right.
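The same reasoning can be tried out with other assignment targets. In this sketch, the Config class and its level property are hypothetical names invented for the demonstration. The first loop assigns through a property setter, and the second assigns to a list element:

```python
class Config:
    """Hypothetical class whose property setter records every assignment."""

    def __init__(self):
        self._level = 0
        self.history = []

    @property
    def level(self):
        return self._level

    @level.setter
    def level(self, value):
        self.history.append(value)
        self._level = value


config = Config()

# Equivalent to running config.level = 10, then = 20, then = 30.
for config.level in [10, 20, 30]:
    pass

slots = [None, None]

# Equivalent to running slots[0] = "a", then "b", then "c".
for slots[0] in "abc":
    pass

print(config.level, config.history, slots)
```

The setter runs once per iteration, which is exactly what a sequence of plain assignment statements would do.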

You can assign to a dict key in an assignment statement, so you can assign to it in a for loop. You might decide it’s too unusual to use, but it is possible and it works.

November 19, 2024 10:40 AM UTC


Zato Blog

IMAP and OAuth2 Integrations with Microsoft 365


Overview

This is the first in a series of articles about automation of and integrations with Microsoft 365 cloud products using Python and Zato.

We start off with IMAP automation by showing how to create a scheduled Python service that periodically pulls latest emails from Outlook using OAuth2-based connections.

IMAP and OAuth2

Microsoft 365 requires all IMAP connections to use OAuth2. This can be challenging to configure in server-side automation and orchestration processes, so Zato offers an easy way to read and send emails without getting into low-level OAuth2 details.

Consider a common orchestration scenario - a business partner sends automated emails with attachments that need to be parsed, some information needs to be extracted and processed accordingly.

Before OAuth2, an automation process would receive from Azure administrators a dedicated IMAP account with a username and password.

Now, however, in addition to creating an IMAP account, administrators will need to create and configure a few more resources that the orchestration service will use. Note that the password to the IMAP account will never be used.

Administrators need to:

Next, administrators need to give you a few pieces of information about the app:

Additionally, you still need to receive the IMAP username (an e-mail address). It is just that you do not need its corresponding password.

In Dashboard

The first step is to create a new connection in your Zato Dashboard. This establishes an OAuth2-based connection that Zato will manage, so your Python code will not have to do anything else: the platform keeps refreshing the underlying OAuth2 tokens as needed.

Having received the configuration details from Azure administrators, you can open your Zato Dashboard and navigate to IMAP connections:

Fill out the form as below, choosing "Microsoft 365" as the server type. The other type, "Generic IMAP" is used for the classical case of IMAP with a username and password:

Change the secret and click Ping to confirm that the connection is configured correctly:

In Python

Use the code below to receive emails. Note that it merely needs to refer to a connection definition by its name and there is no need for any usage of OAuth2 here:

# -*- coding: utf-8 -*-

# Zato
from zato.server.service import Service

class MyService(Service):

    def handle(self):

        # Connect to a Microsoft 365 IMAP connection by its name ..
        conn = self.email.imap.get('My Automation').conn

        # .. get all messages matching filter criteria ("unread" by default)..
        for msg_id, msg in conn.get():

            # .. and access each of them.
            self.logger.info(msg.data)

This is everything that is needed for integrations with IMAP using Microsoft 365 although we can still go further. For instance, to create a scheduled job to periodically invoke the service, go to the Scheduler job in Dashboard:

In this case, we decide to have a job that runs once per hour:

As expected, clicking OK will suffice for the job to start in the background. It is as simple as that.

More resources

➤ Python API integration tutorial
What is an integration platform?
Python Integration platform as a Service (iPaaS)
What is an Enterprise Service Bus (ESB)? What is SOA?

November 19, 2024 08:00 AM UTC

November 18, 2024


PyCharm

JetBrains AI Assistant 2024.3 is here! A highlight of this release is the flexibility to choose your preferred chat model. Select between Google Gemini, OpenAI, or local models to tailor interactions for a more customized experience. 

This update also brings advanced code completion for all major programming languages, improved context management, and the ability to generate inline prompts directly within the editor.

More control over your chat experience: Choose between Gemini, OpenAI, and local models 

You can now select your preferred AI chat model, choosing from cloud model providers like Google Gemini and OpenAI, or connect to local models. This expanded selection allows you to customize the AI chat’s responses to your specific workflow, offering a more adaptable and personalized experience. 

Google’s Gemini models now available

The lineup of LLMs used by JetBrains AI now includes Gemini 1.5 Pro 002 and Flash 002. These models are designed to deliver advanced reasoning capabilities and optimized performance for a wide range of tasks. The Pro version excels in complex applications, while Flash is tailored for high-volume, low-latency scenarios. Now, AI Assistant users can leverage the power of Gemini models alongside our in-house Mellum and OpenAI options.

Local model support via Ollama

In addition to cloud-based models, you can now connect the AI chat to local models available through Ollama. This is particularly useful for users who need more control over their AI models, offering enhanced privacy, flexibility, and the ability to run models on local hardware.

To add an Ollama model to the chat you need to enable Ollama support in AI Assistant’s settings and configure the connection to your Ollama instance. 

Improved context management

In this update, we’ve made context handling in AI Assistant more transparent and intuitive. A revamped UI lets you view and manage every element included as context, providing full visibility and control. The open file and any selected code within it are now automatically added to the context, and you can easily add or remove files as needed, customizing the context to fit your workflow. Additionally, you can attach project-wide instructions to guide AI Assistant’s responses throughout your codebase.

Cloud code completion with broader language support

JetBrains has released its own large language model (LLM), Mellum, specifically designed to enhance cloud-based code completion for developers. This new model, specialized for coding tasks, has expanded support for several new languages, including JavaScript, TypeScript, HTML, C#, C, C++, Go, PHP, Scala, and Ruby. Now, the code completion experience is unified across JetBrains IDEs, offering syntax highlighting for suggested code, the flexibility to accept suggestions token by token or line by line, and overall reduced latency.

Local code completion enhancements: Multi-line support for Python and contextual improvements

Local code completion has significantly improved, now offering multi-line suggestions for Python. Additionally, optimizations have been made across other programming languages. For Kotlin, retrieval-augmented generation (RAG) enables the model to pull information from multiple project files, ensuring the most relevant suggestions. The support for JavaScript, TypeScript, and CSS has also seen enhancements to their existing RAG functionality. Furthermore, local code completion has been introduced for HTML.

These improvements mean that suggestions appear faster across all languages, creating a more seamless coding experience. Best of all, local code completion is included for free in your IDE, allowing you to start utilizing these powerful features immediately.

Streamlined in-editor experience with inline AI prompts

The new inline AI prompt feature in AI Assistant introduces a direct way to enter your prompts right in the editor. Just start typing your request in natural language and the AI Assistant will recognize it and generate a suggestion. Inline AI prompts are context-aware, automatically including related files and symbols for more accurate code generation. This feature supports Java, Kotlin, Scala, Groovy, JavaScript, TypeScript, Python, JSON, YAML, PHP, Ruby, and Go file formats, and is available to all AI Assistant users.

We also improved the visibility of changes applied. There is now a purple mark in the gutter next to lines changed by AI Assistant, so you can easily see what has been updated.

Make multiple file-wide updates easily

AI Assistant now offers file-wide code generation, enabling streamlined edits across an entire file. This functionality allows for modifications across multiple code sections, including adding necessary imports, updating references, and defining missing declarations. Currently available for Java and Kotlin, it is triggered by the Generate Code action when no specific selection is made in the editor, offering a seamless experience for broad, file-wide adjustments.

Get instant answers about IDE features and settings in AI Chat

Say goodbye to searching through settings or documentation! With the new /docs command, you can now access documentation-based answers directly in the AI chat. Simply ask AI Assistant about a feature, and it will provide interactive step-by-step guidance.

AI-powered quick-fix for faster error resolution

When a JetBrains IDE inspection flags a problem – whether it’s a syntax error, missing import, or something else – it suggests a quick-fix directly within the editor. With the latest update, Fix with AI takes this a step further. This new capability uses AI context awareness to suggest fixes that are more precise and applicable to your specific coding context, making it faster and easier to resolve coding problems without any manual input.

Explore AI Assistant and share your feedback

Explore these updates and let AI Assistant streamline your development workflow even further. As always, we look forward to hearing your feedback. You can also tell us about your experience via the Share your feedback link in the AI Assistant tool window or by submitting feature requests or bug reports in YouTrack.

Happy developing!

November 18, 2024 07:41 PM UTC


Python Morsels

Python's pathlib module

Python's pathlib module is the tool to use for working with file paths. See pathlib quick reference tables and examples.

Table of contents

  1. A pathlib cheat sheet
  2. The open function accepts Path objects
  3. Why use a pathlib.Path instead of a string?
  4. The basics: constructing paths with pathlib
  5. Joining paths
  6. Current working directory
  7. Absolute paths
  8. Splitting up paths with pathlib
  9. Listing files in a directory
  10. Reading and writing a whole file
  11. Many common operations are even easier
  12. No need to worry about normalizing paths
  13. Built-in cross-platform compatibility
  14. A pathlib conversion cheat sheet
  15. What about things pathlib can't do?
  16. Should strings ever represent file paths?
  17. Use pathlib for readable cross-platform code

A pathlib cheat sheet

Below is a cheat sheet table of common pathlib.Path operations.

The variables used in the table are defined here:

>>> from pathlib import Path
>>> path = Path("/home/trey/proj/readme.md")
>>> relative = Path("readme.md")
>>> base = Path("/home/trey/proj")
>>> new = Path("/home/trey/proj/sub")
>>> home = Path("/home/")
>>> target = path.with_suffix(".txt")  # .md -> .txt
>>> pattern = "*.md"
>>> name = "sub/f.txt"
| Path-related task | pathlib approach | Example |
|---|---|---|
| Read all file contents | path.read_text() | 'Line 1\nLine 2\n' |
| Write file contents | path.write_text('new') | Writes new to file |
| Get absolute path | relative.resolve() | Path('/home/trey/proj/readme.md') |
| Get the filename | path.name | 'readme.md' |
| Get parent directory | path.parent | Path('/home/trey/proj') |
| Get file extension | path.suffix | '.md' |
| Get suffix-free name | path.stem | 'readme' |
| Ancestor-relative path | path.relative_to(base) | Path('readme.md') |
| Verify path is a file | path.is_file() | True |
| Verify path is directory | path.is_dir() | False |
| Make new directory | new.mkdir() | Makes new directory |
| Get current directory | Path.cwd() | Path('/home/trey/proj') |
| Get home directory | Path.home() | Path('/home/trey') |
| Get ancestor paths | path.parents | [Path('/home/trey/proj'), ...] |
| List files/directories | home.iterdir() | [Path('/home/trey')] |
| Find files by pattern | base.glob(pattern) | [Path('/home/trey/proj/readme.md')] |
| Find files recursively | base.rglob(pattern) | [Path('/home/trey/proj/readme.md')] |
| Join path parts | base / name | Path('/home/trey/proj/sub/f.txt') |
| Get file size (bytes) | path.stat().st_size | 14 |
| Walk the file tree | base.walk() | Iterable of (path, subdirs, files) |
| Rename path | path.rename(target) | Path object for new path |
| Remove file | path.unlink() | |

Note that iterdir, glob, rglob, and walk all return iterators. The examples above show lists for convenience.
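A few of these operations can be tried safely in a throwaway directory. This is a minimal sketch: the temporary directory stands in for /home/trey/proj, and the file contents are invented for the example:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)            # stands in for /home/trey/proj
    path = base / "readme.md"   # join path parts with the / operator

    path.write_text("Line 1\nLine 2\n")
    contents = path.read_text()

    parts = (path.name, path.suffix, path.stem)
    relative = path.relative_to(base)

    # glob() returns a lazy iterator, not a list, so loop over it
    # or pass it to list() or sorted() to see the results.
    matches = base.glob("*.md")
    is_lazy = not isinstance(matches, list)
    found = sorted(p.name for p in matches)

print(parts, relative, found)
```

The directory and everything in it are removed when the with block exits, so the sketch leaves nothing behind.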

The open function accepts Path objects

What does Python's open function …

Read the full article: https://www.pythonmorsels.com/pathlib-module/

November 18, 2024 05:00 PM UTC


Real Python

Interacting With Python

There are multiple ways of interacting with Python, and each can be useful for different scenarios. You can quickly explore functionality in Python’s interactive mode using the built-in Read-Eval-Print Loop (REPL), or you can write larger applications to a script file using an editor or Integrated Development Environment (IDE).

In this tutorial, you’ll learn how to:

  • Use Python interactively by typing code directly into the interpreter
  • Execute code contained in a script file from the command line
  • Work within a Python Integrated Development Environment (IDE)
  • Assess additional options, such as the Jupyter Notebook and online interpreters

Before working through this tutorial, make sure that you have a functioning Python installation at hand. Once you’re set up with that, it’s time to write some Python code!

Get Your Code: Click here to get the free sample code that you’ll use to learn about interacting with Python.

Take the Quiz: Test your knowledge with our interactive “Interacting With Python” quiz. You’ll receive a score upon completion to help you track your learning progress:


Interactive Quiz

Interacting With Python

In this quiz, you'll test your understanding of the different ways of interacting with Python. By working through this quiz, you'll revisit key concepts related to Python interaction in interactive mode using the REPL, through Python script files, and within IDEs and code editors.

Hello, World!

There’s a long-standing custom in computer programming that the first code written in a newly installed language is a short program that displays the text Hello, World! to the console.

In Python, running a “Hello, World!” program only takes a single line of code:

print("Hello, World!")

Here, print() displays the text inside the quotes, Hello, World!, on your screen. In this tutorial, you’ll explore several ways to execute this code.

Running Python in Interactive Mode

The quickest way to start interacting with Python is in a Read-Eval-Print Loop (REPL) environment. This means starting up the interpreter and typing commands to it directly.

When you interact with Python in this way, the interpreter will:

  • Read the command you enter
  • Evaluate and execute the command
  • Print the output (if any) to the console
  • Loop back and repeat the process

The interactive session continues like this until you instruct the interpreter to stop. Using Python in this interactive mode is a great way to test short snippets of Python code and get more familiar with the language.
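The four steps can be sketched as a toy function. This is only an illustration of the read, evaluate, print, loop cycle and not how the real interpreter is implemented. The name tiny_repl is made up, and the input is a fixed list of strings instead of the keyboard:

```python
def tiny_repl(lines):
    """Toy read-eval-print loop over a list of source lines."""
    namespace = {}
    outputs = []
    for line in lines:                      # Loop, and Read the next command
        try:
            result = eval(line, namespace)  # Evaluate an expression
        except SyntaxError:
            exec(line, namespace)           # Statements such as x = 2
            result = None
        if result is not None:
            outputs.append(repr(result))    # Print the output, if any
    return outputs


session = ["x = 2", "x + 3", "'Hello, World!'"]
outputs = tiny_repl(session)
print(outputs)
```

As in the real REPL, an assignment produces no output, while an expression is echoed back using its repr().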

On Windows, when you install Python using an installer, the Start menu shows a program group labeled Python 3.x. The label may vary depending on the particular installation you chose. Click on that item to start the Python interpreter.

Alternatively, you can open your Command Prompt or PowerShell application and type the py command to launch it:

Windows PowerShell
PS> py

On Linux or macOS, open your Terminal application and type python3 to start the Python interpreter from the command line:

Shell
$ python3

If you’re unfamiliar with this application, then you can use your operating system’s search function to find it.

After pressing Enter, you should see a response from the Python interpreter similar to the one below:

Python
Python 3.13.0 (main, Oct 14 2024, 10:34:31) [Clang 15.0.0 (clang-1500.3.9.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>

If you’re not seeing the >>> prompt, then you’re not talking to the Python interpreter. This could be because Python is either not installed or not in the path of your terminal window session.

Note: If you need additional help to get to this point, then you can check out the How to Install Python on Your System: A Guide tutorial.

If you’re seeing the prompt, then you’re off and running! With these next steps, you’ll execute the statement that displays "Hello, World!" to the console:

  1. Ensure that Python displays the >>> prompt, and that you position your cursor after it.
  2. Type the command print("Hello, World!") exactly as shown.
  3. Press the Enter key.

Read the full article at https://realpython.com/interacting-with-python/ »



November 18, 2024 02:00 PM UTC


Go Deh

There's the easy way...

 

Best seen on a larger than landscape phone

Someone blogged about a particular problem:

From: https://theweeklychallenge.org/blog/perl-weekly-challenge-294/#TASK1

Given an unsorted array of integers, `ints`
Write a script to return the length of the longest consecutive elements sequence.
Return -1 if none found. *The algorithm must run in O(n) time.*

The solution they blogged used a sort, which meant it could not run in O(n) time, but the problem looked good so I gave it some thought.

Sets! Membership tests on Python sets are O(1) on average, so sets are good for looking things up.

What if, when looking at the input numbers one at a time, you also looked for other ints in the input that would extend the int you have to form a longer range? Keep tabs on the longest range so far, and if you remove ints from the pool as they form ranges, then when the pool is empty you should know the longest range.

I added the printout of the longest range too.

My code

def consec_seq(ints) -> tuple[int, int, int]:
    "Extract longest_seq_length, its_min, its_max"
    pool = set(ints)
    longest, longest_mn, longest_mx = 0, 1, 0
    while pool:
        this = start = pool.pop()
        ln = 1
        # check down
        while (this:=(this - 1)) in pool:
            ln += 1
            pool.remove(this)
        mn = this + 1
        # check up
        this = start
        while (this:=(this + 1)) in pool:
            ln += 1
            pool.remove(this)
        mx = this - 1
        # check longest
        if ln > longest:
            longest, longest_mn, longest_mx = ln, mn, mx

    return longest,longest_mn,longest_mx

def _test():
    for ints in[(),
            (69,),
            (-20, 78, 79, 1, 100),
            (10, 4, 20, 1, 3, 2),
            (0, 6, 1, 8, 5, 2, 4, 3, 0, 7),
            (10, 30, 20),
            (2,4,3,1,0, 10,12,11,8,9),  # two runs of five
            (10,12,11,8,9, 2,4,3,1,0),  # two runs of five - reversed
            (2,4,3,1,0,-1, 10,12,11,8,9),  # runs of 6 and 5
            (2,4,3,1,0, 10,12,11,8,9,7),   # runs of 5 and 6
            ]:
        print(f"Input {ints = }")
        longest, longest_mn, longest_mx = consec_seq(ints)

        if longest <2:
            print("  -1")
        else:
            print(f"  The/A longest sequence has {longest} elements {longest_mn}..{longest_mx}")


# %%
if __name__ == '__main__':
    _test()

Sample output

Input ints = ()
  -1
Input ints = (69,)
  -1
Input ints = (-20, 78, 79, 1, 100)
  The/A longest sequence has 2 elements 78..79
Input ints = (10, 4, 20, 1, 3, 2)
  The/A longest sequence has 4 elements 1..4
Input ints = (0, 6, 1, 8, 5, 2, 4, 3, 0, 7)
  The/A longest sequence has 9 elements 0..8
Input ints = (10, 30, 20)
  -1
Input ints = (2, 4, 3, 1, 0, 10, 12, 11, 8, 9)
  The/A longest sequence has 5 elements 0..4
Input ints = (10, 12, 11, 8, 9, 2, 4, 3, 1, 0)
  The/A longest sequence has 5 elements 0..4
Input ints = (2, 4, 3, 1, 0, -1, 10, 12, 11, 8, 9)
  The/A longest sequence has 6 elements -1..4
Input ints = (2, 4, 3, 1, 0, 10, 12, 11, 8, 9, 7)
  The/A longest sequence has 6 elements 7..12

Another Algorithm

What if you kept and extended ranges until you had amassed all of them, then chose the longest? I need to keep the hash lookup; dict key lookup should also be O(1). What to look up? Look up ints that would extend a range!

If you have an existing (integer) range, say 1..3 inclusive of end points, then finding 0 would extend the range to 0..3, and finding one more than the range maximum, 4, would extend the original range to 1..4.

So if you have ranges, then they could be extended by finding rangemin - 1 or rangemax + 1. I call them extends.

If you find that the next int from the input ints is also an extends value, then you need to find the range that it extends (by lookup) so you can modify that range. Use a dict to map extends to their ranges; checking whether an int is in the extends dict keys should also take O(1) time.

I took that sketch of an algorithm and started to code. It took two evenings to finally get something that worked, and I had to work out several details that were trying. The main problem was coalescing ranges: if you have ranges 1..2 and 4..5, what happens when you see a 3? The result is the single range 1..5. It took particular test cases and extensive debugging to work out that the extends2ranges mapping should map each extend to potentially more than one range, and that you need to combine ranges if two of them are present for any extend value being hit.


So for 1..2 the extends being looked for are 0 and 3. For 4..5 the extends being looked for are 3, again, and 6. The extends2ranges data structure for just this should look like:

{0: [[1, 2]],
 3: [[1, 2], [4, 5]],
 6: [[4, 5]]}
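
As a small illustration of that mapping, the sketch below builds it directly from a list of already-known ranges. The helper name build_extends2ranges is hypothetical and is not part of the solution that follows; it only shows how each range registers itself under the two values that would extend it.

```python
from collections import defaultdict


def build_extends2ranges(ranges):
    """Map each 'extend' value (min - 1 or max + 1) to the ranges it would extend."""
    extends2ranges = defaultdict(list)
    for mnmx in ranges:
        mn, mx = mnmx
        extends2ranges[mn - 1].append(mnmx)  # value that extends the range downward
        extends2ranges[mx + 1].append(mnmx)  # value that extends the range upward
    return dict(extends2ranges)


mapping = build_extends2ranges([[1, 2], [4, 5]])
print(mapping)
```

The key 3 ends up holding both ranges, which is exactly the case where seeing a 3 has to join 1..2 and 4..5 into 1..5.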

 

The Code #2

from collections import defaultdict


def combine_ranges(min1, max1, min2, max2):
    "Combine two overlapping ranges return the new range as [min, max], and a set of limits unused in the result"
    assert (min1 <= max1 and min2 <= max2          # Well formed
            and ( min1 <= max2 and min2 <= max1 )) # and ranges touch or overlap
    range_limits = set([min1, max1, min2, max2])
    new_mnmx = [min(range_limits), max(range_limits)]
    unused_limits = range_limits - set(new_mnmx)

    return new_mnmx, unused_limits

def consec_seq2(ints) -> tuple[int, int, int]:
    "Extract longest_seq_length, its_min, its_max"
    if not ints:
        return -1, 1, -1
    seen = set()  # numbers seen so far
    extends2ranges = defaultdict(list)  # map extends to its ranges
    for this in ints:
        if this in seen:
            continue
        else:
            seen.add(this)

        if this not in extends2ranges:
            # Start new range
            mnmx = [this, this]    # Range of one int
            # add in the extend points
            extends2ranges[this + 1].append(mnmx)
            extends2ranges[this - 1].append(mnmx)
        else:
            # Extend an existing range
            ranges = extends2ranges[this]  # The range(s) that could be extended by this
            if len(ranges) == 2:
                # this joins the two ranges
                extend_and_join_ranges(extends2ranges, this, ranges)
            else:
                # extend one range, copied
                extend_and_join_ranges(extends2ranges, this, [ranges[0], ranges[0].copy()])

    all_ranges = sum(extends2ranges.values(), start=[])
    longest_mn, longest_mx = max(all_ranges, key=lambda mnmx: mnmx[1] - mnmx[0])

    return (longest_mx - longest_mn + 1), longest_mn, longest_mx

def extend_and_join_ranges(extends2ranges, this, ranges):
    mnmx, mnmx2 = ranges
    mnmx_orig, mnmx2_orig = mnmx.copy(), mnmx2.copy() # keep copy of originals
    mn, mx = mnmx
    mn2, mx2 = mnmx2
    if this == mn - 1:
        mnmx[0] = mn = this  # Extend lower limit of the range
    if this == mn2 - 1:
        mnmx2[0] = mn2 = this  # Extend lower limit of the range
    if this == mx + 1:
        mnmx[1] = mx = this  # Extend upper limit of the range
    if this == mx2 + 1:
        mnmx2[1] = mx2 = this  # Extend upper limit of the range
    new_mnmx, _unused_limits = combine_ranges(mn, mx, mn2, mx2)

    remove_merged_from_extends(extends2ranges, this, mnmx, mnmx2)
    add_combined_range_to_extends(extends2ranges, new_mnmx)


def add_combined_range_to_extends(extends2ranges, new_mnmx):
    "Add the extends of the combined range to extends2ranges"
    new_mn, new_mx = new_mnmx
    for extend in (new_mn - 1, new_mx + 1):
        r = extends2ranges[extend]  # ranges at new limit extension
        if new_mnmx not in r:
            r.append(new_mnmx)

def remove_merged_from_extends(extends2ranges, this, mnmx, mnmx2):
    "Remove original ranges that were merged from extends"
    for lohi in (mnmx, mnmx2):
        lo, hi = lohi
        for extend in (lo - 1, hi + 1):
            if extend in extends2ranges:
                r = extends2ranges[extend]
                for r_old in (mnmx, mnmx2):
                    if r_old in r:
                        r.remove(r_old)
                if not r:
                    del extends2ranges[extend]
    # remove joining extend, this
    del extends2ranges[this]


def _test():
    for ints in[
            (),
            (69,),
            (-20, 78, 79, 1, 100),
            (4, 1, 3, 2),
            (10, 4, 20, 1, 3, 2),
            (0, 6, 1, 8, 5, 2, 4, 3, 0, 7),
            (10, 30, 20),
            (2,4,3,1,0, 10,12,11,8,9),  # two runs of five
            (10,12,11,8,9, 2,4,3,1,0),  # two runs of five - reversed
            (2,4,3,1,0,-1, 10,12,11,8,9),  # runs of 6 and 5
            (2,4,3,1,0, 10,12,11,8,9,7),   # runs of 5 and 6
            ]:
        print(f"Input {ints = }")
        longest, longest_mn, longest_mx = consec_seq2(ints)

        if longest <2:
            print("  -1")
        else:
            print(f"  The/A longest sequence has {longest} elements {longest_mn}..{longest_mx}")


# %%
if __name__ == '__main__':
    _test()


Its Output

Input ints = ()
  -1
Input ints = (69,)
  -1
Input ints = (-20, 78, 79, 1, 100)
  The/A longest sequence has 2 elements 78..79
Input ints = (4, 1, 3, 2)
  The/A longest sequence has 4 elements 1..4
Input ints = (10, 4, 20, 1, 3, 2)
  The/A longest sequence has 4 elements 1..4
Input ints = (0, 6, 1, 8, 5, 2, 4, 3, 0, 7)
  The/A longest sequence has 9 elements 0..8
Input ints = (10, 30, 20)
  -1
Input ints = (2, 4, 3, 1, 0, 10, 12, 11, 8, 9)
  The/A longest sequence has 5 elements 0..4
Input ints = (10, 12, 11, 8, 9, 2, 4, 3, 1, 0)
  The/A longest sequence has 5 elements 8..12
Input ints = (2, 4, 3, 1, 0, -1, 10, 12, 11, 8, 9)
  The/A longest sequence has 6 elements -1..4
Input ints = (2, 4, 3, 1, 0, 10, 12, 11, 8, 9, 7)
  The/A longest sequence has 6 elements 7..12


This second algorithm gives correct results but is harder to develop and explain. It's a testament to my stubbornness as I thought there was a solution there, and debugging was me flexing my skills to keep them honed.
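
One way to gain some confidence in either implementation is to compare its reported length against a simple sort-based reference. The sketch below is my own assumption of such a reference, not code from the post or from the blog it mentions; because it sorts, it runs in O(n log n) and so would not itself satisfy the challenge. Like the test harnesses above, it treats runs shorter than two elements as "none found".

```python
def longest_run_sorted(ints):
    """Length of the longest run of consecutive integers, or -1 if shorter than 2."""
    best = current = 0
    previous = None
    for value in sorted(set(ints)):      # sorting makes this O(n log n), not O(n)
        if previous is not None and value == previous + 1:
            current += 1                 # value continues the current run
        else:
            current = 1                  # value starts a new run
        best = max(best, current)
        previous = value
    return best if best >= 2 else -1


for ints in [(), (69,), (-20, 78, 79, 1, 100), (10, 4, 20, 1, 3, 2),
             (0, 6, 1, 8, 5, 2, 4, 3, 0, 7), (10, 30, 20)]:
    print(ints, "->", longest_run_sorted(ints))
```

The lengths it reports for these inputs agree with the sample output of both algorithms above.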


END.


November 18, 2024 10:11 AM UTC


Python Software Foundation

Help power Python and join in the PSF year-end fundraiser & membership drive!

Join the PSF 2024 Fundraiser & Membership Drive

The Python Software Foundation (PSF) is the charitable organization behind Python, dedicated to advancing, supporting, and protecting the Python programming language and the community that sustains it. That mission and cause are more than just words we believe in. Our tiny but mighty team works hard to deliver the projects and services that allow Python to be the thriving, independent, community-driven language it is today. Some of what the PSF does includes producing PyCon US, hosting the Python Package Index (PyPI), awarding grants to Python initiatives worldwide, maintaining critical community infrastructure, and more.

To build the future of Python and sustain the thriving community that its users deserve, we need your help. By backing the PSF, you’re investing in Python’s growth and health, and your contributions directly impact the language's future. Is your community, work, or hobby powered by Python? Join this year’s drive and power Python’s future with us by donating or becoming a Supporting Member today.

There are three ways to join in:

  1. Save on PyCharm! JetBrains is once again supporting the PSF by providing a 30% discount on PyCharm and ALL proceeds will go to the PSF! You can take advantage of this discount by clicking the button on the PyCharm promotion page and the discount will be automatically applied when you check out. The promotion will only be available through November 30th, 2024, so make sure to grab the deal today!

  2. Donate to the PSF! Your donation is a direct way to support and power the future of the Python programming language and community you love. Every dollar makes a difference.

  3. Become a Supporting Member! When you sign up as a Supporting Member of the PSF, you become a part of the PSF, are eligible to vote in PSF elections and help us sustain what we do with your annual support. You can sign up as a Supporting Member at the usual annual rate ($99 USD), or you can take advantage of our sliding scale option (starting at $25 USD)! We don't want cost to be a barrier to you being a part of the PSF or to your voice helping direct our future. Every PSF member makes the Python community stronger!

  4. Your donations:


      Highlights from 2024:

       
      Our thanks!

      Every dollar you contribute to the PSF helps power Python, makes an impact, and tells us you value Python and the work we do. Python and the PSF are built on the amazing generosity and energy of all our amazing community members out there who step up and give back.

      We appreciate you and we’re so excited to see where we can go together in the year to come!

      November 18, 2024 09:56 AM UTC


      Python Bytes

      #410 Entering the Django core

      Topics covered in this episode:

        • Thoughts on Django’s Core
        • futurepool
        • Don't return named tuples in new APIs
        • Ziglang: Migrating from AWS to Self-Hosting
        • Extras
        • Joke

      Watch on YouTube: https://www.youtube.com/watch?v=j-q31u9G3Ds

      About the show

      Sponsored by us! Support our work through:

        • Our courses at Talk Python Training: https://training.talkpython.fm/?featured_on=pythonbytes
        • The Complete pytest Course: https://courses.pythontest.com/p/the-complete-pytest-course?featured_on=pythonbytes
        • Patreon Supporters: https://www.patreon.com/pythonbytes

      Connect with the hosts

        • Michael: @mkennedy@fosstodon.org / @mkennedy.codes
        • Brian: @brianokken@fosstodon.org / @brianokken.bsky.social
        • Show: @pythonbytes@fosstodon.org / @pythonbytes.bsky.social

      Join us on YouTube at pythonbytes.fm/live (https://pythonbytes.fm/stream/live) to be part of the audience. Usually Monday at 10am PT. Older video versions available there too.

      Finally, want an artisanal, hand-crafted digest of every week of the show notes in email form? Add your name and email to our friends of the show list (https://pythonbytes.fm/friends-of-the-show); we'll never share it.

      Brian #1: Thoughts on Django’s Core (https://buttondown.com/carlton/archive/thoughts-on-djangos-core/?featured_on=pythonbytes)

        • Carlton Gibson
        • Great discussion on
            • Django and Core vs Plugins
            • Sustainability with limited people
            • Keeping core small
            • The release cycle
            • embrace plugins vs endorsing plugins.

      Michael #2: futurepool (https://pypi.org/project/futurepool/?featured_on=pythonbytes)

        • via Pat Decker
        • Takes the concept of multiprocessing Pool to the async/await world.
        • Create a pool then delegate the work:

              async with FuturePool(2) as fp:
                  result = await fp.map(async_pool_fn, range(10))

        • I would LOVE to see something like this in a broader background asyncio worker pool concept.
        • But that concept doesn’t exist in asyncio in Python and that’s a failing of the framework IMO.

      Brian #3: Don't return named tuples in new APIs (https://snarky.ca/dont-use-named-tuples-in-new-apis/?featured_on=pythonbytes)

        • Brett Cannon
        • First off, I’m grateful for any post that talks about APIs and the API is a module, class, or package API and not a Web/REST API. The term API existed long before the internet.
        • “e.g., get_mouse_position() very likely has a two-item tuple of X and Y coordinates of the screen”
        • “it actually makes your API more complex for both you and your users to use. For you, it doubles the data access API surface for your return type as you have to now support index-based and attribute-based data access forever (or until you choose to break your users and change your return type so it doesn't support both approaches)”
        • “… you probably don't want people doing with your return type, like slicing, iterating over all the items …”
        • Alternatives
            • class
            • dataclass
            • dictionary
            • TypedDict
            • SimpleNamespace
        • “My key point in all of this is to prefer readability and ergonomics over brevity in your code. That means avoiding named tuples except where you are expanding to tweaking an existing API where the named tuple improves over the plain tuple that's already being used.”

      Michael #4: Ziglang: Migrating from AWS to Self-Hosting (https://ziglang.org/news/migrate-to-self-hosting/?featured_on=pythonbytes)

        • The Rust Foundation for example, reports that they spent $404,400 on infrastructure costs in 2023.
        • Zig lang has decided to use a single big cloud machine + mirrors

      Extras

      Brian:

        • Changing the Python Test community
            • Was started to answer questions for Test & Code listeners years ago.
            • Primarily pytest questions
            • Used to be Slack. Then moved to Podia forum.
            • Now I’m trying to work out a Discord solution that is both sustainable and usable.

      Michael:

        • PWang Bsky essay (https://bsky.app/profile/wang.social/post/3lb346uyzdc2r?featured_on=pythonbytes)
        • Building A Business From Python Expertise - Michael Kennedy on Work Item Podcast (https://theworkitem.com/blog/building-a-business-from-python-expertise-michael-kennedy/?featured_on=pythonbytes)
        • Subscribe to package releases, just put .atom on the end of their releases URL, for example:
            • github.com/mikeckennedy/jinja_partials/releases ← add .atom for RSS
        • pytest-bdd 8.0.0 (https://pypi.org/project/pytest-bdd/8.0.0/#data) was just released via Jamie Thomson
            • The big feature (in Jamie’s opinion) is the addition of data tables https://github.com/pytest-dev/pytest-bdd/blob/master/CHANGES.rst#800---2024-11-14

      Joke: Breaking: JavaScript Developer Commits to Framework for Record-Breaking 3 Weeks (https://devhumor.com/media/breaking-javascript-developer-commits-to-framework-for-record-breaking-3-weeks?featured_on=pythonbytes)

      November 18, 2024 08:00 AM UTC


      James Bennett

      Introducing DjangoVer

      Version numbering is hard, and there are lots of popular schemes out there for how to do it. Today I want to talk about a system I’ve settled on for my own Django-related packages, and which I’m calling “DjangoVer”, because it ties the version number of a Django-related package to the latest Django version that package supports.

      But one quick note to start with: this is not really “introducing” the idea of DjangoVer, because I know I’ve used the name a few times already in other places. I’m also not the person who invented this, and I don’t know for certain who did — I’ve seen several packages which appear to follow some form of DjangoVer and took inspiration from them in defining my own take on it.

      Django’s version scheme: an overview

      The basic idea of DjangoVer is that the version number of a Django-related package should tell you which version of Django you can use it with. Which probably doesn’t help much if you don’t know how Django releases are numbered, so let’s start there. In brief:

      This has been in effect since Django 2.0 was released, and the feature releases have been: 2.0, 2.1, 2.2 (LTS); 3.0, 3.1, 3.2 (LTS); 4.0, 4.1, 4.2 (LTS); 5.0, 5.1. Django 5.2 (LTS) is expected in April 2025, and then eight months later (if nothing is changed) will come Django 6.0.

      I’ll talk more about SemVer in a bit, but it’s worth being crystal clear that Django does not follow Semantic Versioning, and the MAJOR number is not a signal about API compatibility. Instead, API compatibility runs LTS-to-LTS, with a simple principle: if your code runs on a Django LTS release and raises no deprecation warnings, it will run unmodified on the next LTS release. So, for example, if you have an application that runs without deprecation warnings on Django 4.2 LTS, it will run unmodified on Django 5.2 LTS (though at that point it might begin raising new deprecation warnings, and you’d need to clear them before it would be safe to upgrade any further).
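
To check the "raises no deprecation warnings" condition in practice, one option is to record warnings while exercising your code. The sketch below uses only the standard library warnings module; collect_deprecations, legacy_helper, and modern_helper are hypothetical names invented for this example, and a real project would exercise its own code (for instance through its test suite) instead.

```python
import warnings


def collect_deprecations(func, *args, **kwargs):
    """Call func and return any deprecation-style warnings it triggered."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure no warning is suppressed
        func(*args, **kwargs)
    deprecation_types = (DeprecationWarning, PendingDeprecationWarning)
    return [w for w in caught if issubclass(w.category, deprecation_types)]


def legacy_helper():
    # Hypothetical function standing in for code that uses a deprecated API.
    warnings.warn("legacy_helper() is deprecated", DeprecationWarning, stacklevel=2)


def modern_helper():
    # Hypothetical function that triggers no warnings.
    return "ok"


print(len(collect_deprecations(legacy_helper)))  # one warning recorded
print(len(collect_deprecations(modern_helper)))  # none recorded
```

Code that comes back with an empty list on one LTS release is, by the principle above, the code expected to run unmodified on the next LTS.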

      DjangoVer, defined

      In DjangoVer, a Django-related package has a version number of the form DJANGO_MAJOR.DJANGO_FEATURE.PACKAGE_VERSION, where DJANGO_MAJOR and DJANGO_FEATURE indicate the most recent feature release series of Django supported by the package, and PACKAGE_VERSION begins at zero and increments by one with each release of the package supporting that feature release of Django.
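
As a rough sketch of that definition, the code below splits a version string into its three parts. The DjangoVer class, the parse_djangover function, and the example version "5.1.2" are all assumptions made for illustration; they are not part of any published tool, and the parser is intentionally strict about the three-part form described above.

```python
import re
from dataclasses import dataclass

_DJANGOVER_PATTERN = re.compile(r"(\d+)\.(\d+)\.(\d+)", re.ASCII)


@dataclass(frozen=True)
class DjangoVer:
    django_major: int     # DJANGO_MAJOR
    django_feature: int   # DJANGO_FEATURE
    package_version: int  # PACKAGE_VERSION, starting at zero per Django series

    @property
    def django_series(self) -> str:
        """The newest Django feature release series the package supports."""
        return f"{self.django_major}.{self.django_feature}"


def parse_djangover(version: str) -> DjangoVer:
    """Parse a DJANGO_MAJOR.DJANGO_FEATURE.PACKAGE_VERSION string."""
    match = _DJANGOVER_PATTERN.fullmatch(version)
    if match is None:
        raise ValueError(f"not a three-part DjangoVer string: {version!r}")
    return DjangoVer(*(int(part) for part in match.groups()))


release = parse_djangover("5.1.2")
print(release.django_series)    # newest supported Django series
print(release.package_version)  # third release of the package for that series
```

Under this reading, "5.1.2" would be the third release (counting from zero) of a package whose newest supported Django series is 5.1.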

      Since the version number only indicates the newest Django feature release supported, a package using DjangoVer should also use Python package classifiers to indicate the full range of its Django support (such as Framework :: Django :: 5.1 to indicate support for Django 5.1 — see examples on PyPI).

      But while Django takes care to maintain compatibility from one LTS to the next, I do not think DjangoVer packages need to do that; they can use the simpler approach of issuing deprecation warnings for two releases, and then making the breaking change. One of the stated reasons for Django’s LTS-to-LTS compatibility policy is to help third-party packages have an easier time supporting Django releases that people are actually likely to use; otherwise, Django itself generally just follows the “deprecate for two releases, then remove it” pattern. No matter what compatibility policy is chosen, however, it should be documented clearly, since DjangoVer explicitly does not attempt to provide any information about API stability/compatibility in the version number.

      That’s a bit wordy, so let’s try an example:

      Why another version system?

      Some of you probably didn’t even read this far before rushing to instantly post the XKCD “Standards” comic as a reply. Thank you in advance for letting the rest of us know we don’t need to bother listening to or engaging with you. For everyone else: here’s why I think in this case adding yet another “standard” is actually a good idea.

      The elephant in the room here is Semantic Versioning (“SemVer”). Others have written about some of the problems with SemVer, but I’ll add my own two cents here: “compatibility” is far too complex and nebulous a concept to be usefully encoded in a simple value like a version number. And if you want my really cynical take, the actual point of SemVer in practice is to protect developers of software from users, by providing endless loopholes and ways to say “sure, this change broke your code, but that doesn’t count as a breaking change”. It’ll turn out that the developer had a different interpretation of the documentation than you did, or that the API contract was “underspecified” and now has been “clarified”, or they’ll just throw their hands up, yell “Hyrum’s Law” and say they can’t possibly be expected to preserve that behavior.

      A lot of this is rooted in the belief that changes, and especially breaking changes, are inherently bad and shameful, and that if you introduce them you’re a bad developer who should be ashamed. Which is, frankly, bullshit. Useful software almost always evolves and changes over time, and it’s unrealistic to expect it not to. I wrote about this a few years back in the context of the Python 2/3 transition:

      Though there is one thing I think gets overlooked a lot: usually, the anti-Python-3 argument is presented as the desire of a particular company, or project, or person, to stand still and buck the trend of the world to be ever-changing.

      But really they’re asking for the inverse of that. Rather than being a fixed point in a constantly-changing world, what they really seem to want is to be the only ones still moving in a world that has become static around them. If only the Python team would stop fiddling with the language! If only the maintainers of popular frameworks would stop evolving their APIs! Then we could finally stop worrying about our dependencies and get on with our real work! Of course, it’s logically impossible for each one of those entities to be the sole mover in a static world, but pointing that out doesn’t always go well.

      But that’s a rant for another day and another full post all its own. For now it’s enough to just say I don’t believe SemVer can ever deliver on what it promises. So where does that leave us?

      Well, if the version number can’t tell you whether it’s safe to upgrade from one version to another, perhaps it can still tell you something useful. And for me, when I’m evaluating a piece of third-party software for possible use, one of the most important things I want to know is: is someone actually maintaining this? There are lots of potential signals to look for, but some version schemes — like CalVer — can encode this into the version number. Want to know if the software’s maintained? With CalVer you can guess a package’s maintenance status, with pretty good accuracy, from a glance at the version number.
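
The "glance at the version number" check can be sketched in a few lines. CalVer layouts vary, so this example assumes one specific layout, a leading four-digit year such as 2024.11.0; the function names, the one-year threshold, and the sample version strings are all made up for illustration.

```python
import re
from datetime import date


def calver_year(version):
    """Return the leading four-digit year of a version string, or None."""
    match = re.match(r"(\d{4})(?:\.|$)", version)
    return int(match.group(1)) if match else None


def looks_recent(version, today, max_age_years=1):
    """Guess whether a year-first CalVer release is recent; None if we cannot tell."""
    year = calver_year(version)
    if year is None:
        return None
    return today.year - year <= max_age_years


today = date(2024, 11, 18)
print(looks_recent("2024.11.0", today))  # released this year
print(looks_recent("2019.3.1", today))   # several years old
print(looks_recent("5.1.0", today))      # not year-first CalVer, so no guess
```

This is only a heuristic: a recent version number suggests someone is paying attention, but it says nothing about compatibility.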

      Over the course of this year I’ve been transitioning all my personal non-Django packages to CalVer for precisely this reason. Compatibility, again, is something I think can’t possibly be encoded into a version number, but “someone’s keeping an eye on this” can be. Even if I’m not adding features to something, Python itself does a new version every year and I’ll push a new release to explicitly mark compatibility (as I did recently for the release of Python 3.13). That’ll bump the version number and let anyone who takes a quick glance at it know I’m still there and paying attention to the package.

      For packages meant to be used with Django, though, the version number can usefully encode another piece of information: not just “is someone maintaining this”, but “can I use this with my Django installation”. And that is what DjangoVer is about: telling you at a glance the maintenance and Django compatibility status of a package.

      DjangoVer in practice

      All of my own personal Django-related packages are now using DjangoVer, and say so in their documentation. If I start any new Django-related projects they’ll do the same thing.

      A quick scroll through PyPI turns up other packages doing something that looks similar; django-cockroachdb and django-snowflake, for example, versioned their Django 5.1 packages as “5.1”, and explicitly say in their READMEs to install a package version corresponding to the Django version you use (they also have a maintainer in common, who I suspect of having been an early inventor of what I’m now calling “DjangoVer”).

      If you maintain a Django-related package, I’d encourage you to at least think about adopting some form of DjangoVer, too. I won’t say it’s the best, period, because something better could always come along, but in terms of information that can be usefully encoded into the version number, I think DjangoVer is the best option I’ve seen for Django-related packages.

      November 18, 2024 02:04 AM UTC


      Armin Ronacher

      Playground Wisdom: Threads Beat Async/Await

      It's been a few years since I wrote about my challenges with async/await-based systems and how they just seem to not support back pressure well. A few years later, I do not think that this problem has subsided much, but my thinking and understanding have perhaps evolved a bit. I'm now convinced that async/await is, in fact, a bad abstraction for most languages, and that we should be aiming for something better instead, which I believe to be threads.

      In this post, I'm also going to rehash many arguments from very clever people that came before me. Nothing here is new, I just hope to bring it to a new group of readers. In particular, you should really consider these highly influential pieces:

      Your Child Loves Actor Frameworks

      As programmers, we are so used to how things work that we make some implicit assumptions that really cloud our ability to think freely. Let me present you with a piece of code that demonstrates this:

      def move_mouse():
          while mouse.x < 200:
              mouse.x += 5
              sleep(10)
      
      def move_cat():
          while cat.x < 200:
              cat.x += 10
              sleep(10)
      
      move_mouse()
      move_cat()
      

      Read that code and then answer this question: do the mouse and cat move at the same time, or one after another? I guarantee you that 10 out of 10 programmers will correctly state that they move one after another. It makes sense because we know Python and the concept of threads, scheduling and whatnot. But if you speak to a group of children familiar with Scratch, they are likely to conclude that mouse and cat move simultaneously.

      The reason is that if you are exposed to programming via Scratch you are exposed to a primitive form of actor programming. The cat and the mouse are both actors. In fact, the UI makes this pretty damn clear, just that the actors are called “sprites”. You attach logic to a sprite on the screen and all these pieces of logic run at the same time. Mind-blowing. You can even send messages from sprite to sprite.
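
To make the contrast concrete, here is a minimal Python sketch of the "Scratch reading" of the code above: each sprite gets its own thread, so both loops run at the same time. The `Sprite` class, the `move` helper and the `log` list are hypothetical stand-ins for the `mouse` and `cat` objects, and the sleep is assumed to be 10 milliseconds.

```python
import threading
import time
from dataclasses import dataclass


@dataclass
class Sprite:
    # Hypothetical stand-in for the article's `mouse` and `cat` objects.
    name: str
    x: int = 0


def move(sprite: Sprite, step: int, log: list) -> None:
    # Same loop shape as in the article, with a blocking 10 ms sleep.
    while sprite.x < 200:
        sprite.x += step
        log.append(sprite.name)
        time.sleep(0.01)


mouse, cat = Sprite("mouse"), Sprite("cat")
log = []

# One thread per sprite: each loop blocks in sleep() without stopping the other.
threads = [
    threading.Thread(target=move, args=(mouse, 5, log)),
    threading.Thread(target=move, args=(cat, 10, log)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(mouse.x, cat.x)  # both sprites reached x == 200
```

The bodies of the two loops are unchanged from the sequential version; only the way they are started differs.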

      The reason I want you to think about this for a moment is that I think this is rather profound. Scratch is a very, very simple system and it's intended to teach programming to young kids. Yet the model it promotes is an actor system! If you were to foray into programming via a traditional book on Python, C# or some other language, it's quite likely that you will only learn about threads at the very end. Not just that, it will likely make it sound really complex and scary. Worse, you will probably only learn about actor patterns in some advanced book that will bombard you with all the complexities of large scale applications.

      There is something else though you should keep in mind: Scratch will not talk about threads, it will not talk about monads, it will not talk about async/await, it will not talk about schedulers. As far as you are concerned as a programmer, it's an imperative (though colorful and visual) language with some basic “syntax” support for message passing. Concurrency comes naturally. A child can program it. It's not something to be afraid of.

      Imperative Programming Is Not Inferior

      The second thing I want you to take away is that imperative languages are not inferior to functional ones.

      While probably most of us are using imperative programming languages to solve problems, I think we all have been exposed to the notion that it's inferior and not particularly pure. There is this world of functional programming, with monads and other things. This world has these nice things involving composition, logic and maths and fancy looking theorems. If you program in that, you're almost transcending to a higher plane and looking down on the folks who are stitching together if statements and for loops, making side effects everywhere, and doing highly inappropriate things with IO.

      Okay, maybe it's not quite as bad, but I don't think I'm completely wrong with those vibes. And look, I get it. I feel happy chaining together lambdas in Rust and JavaScript. But we should also be aware that these constructs are, in many languages, bolted on. Go, for instance, gets away without most of this, and that does not make it an inferior language!

      So what you should keep in mind here is that there are different paradigms, and mentally you should try to stop thinking for a moment that functional programming has all its stuff figured out, and imperative programming does not.

      Instead, I want to talk about how functional languages and imperative languages are dealing with “waiting”.

      The first thing I want to go back to is the example from above. Both of the functions (for the cat and the mouse) can be seen as separate threads of execution. When the code calls sleep(10) there's clearly an expectation by the programmer that the computer will temporarily pause the execution and continue later. I don't want to bore you with monads, so as my “functional” programming language, I will use JavaScript and promises. I think that's an abstraction that most readers will be sufficiently familiar with:

      function moveMouseBlocking() {
        while (mouse.x < 200) {
          mouse.x += 5;
          sleep(10);  // a blocking sleep
        }
      }
      
      function moveMouseAsync() {
        return new Promise((resolve) => {
          function iterate() {
            if (mouse.x < 200) {
              mouse.x += 5;
              sleep(10).then(iterate);  // non blocking sleep
            } else {
              resolve();
            }
          }
          iterate();
        });
      }
      

      You can immediately see a challenge here: it's very hard to translate the blocking example into a non blocking example because all of a sudden we need to find a way to express our loop (or really any control flow). We need to manually decompose it into a form of recursive function calling and we need the help of a scheduler and executor here to do the waiting.

      This style obviously eventually became annoying enough to deal with that async/await was introduced to mostly restore the sanity of the old code. So it now can look more like this:

      async function moveMouseAsync() {
        while (mouse.x < 200) {
          mouse.x += 5;
          await sleep(10);
        }
      }
      

      Behind the scenes though, nothing has really changed, and in particular, when you call that function, you just get an object that encompasses the “composition of the computation”. That object is a promise which will eventually hold the resulting value. In fact, in some languages like C#, the compiler will really just transpile this into chained function calls. With the promise in hand, you can await the result, or register a callback with then which gets invoked if this thing ever runs to completion.

      For a programmer, I think async/await is clearly understood as some sort of neat abstraction — an abstraction over promises and callbacks. However strictly speaking, it's just worse than where we started out, because in terms of expressiveness, we have lost an important affordance: we cannot freely suspend.

      In the original blocking code, when we invoked sleep we suspended for 10 milliseconds implicitly; we cannot do the same with the async call. Here we have to “await” the sleep operation. This is the crucial aspect of why we're having these “colored functions”. Only an async function can call another async function, as you cannot await in a sync function.
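
The same coloring shows up in Python. The following is a small sketch with hypothetical function names: a sync function that calls an async one only gets a coroutine object back, because it has no way to `await` it short of starting an event loop itself.

```python
import asyncio
import inspect


async def sleep_then_answer() -> int:
    await asyncio.sleep(0.01)
    return 42


def sync_caller():
    # A sync function cannot `await`; calling the async function only
    # creates a coroutine object, it does not run the body.
    return sleep_then_answer()


async def async_caller() -> int:
    # An async function can await another async function.
    return await sleep_then_answer()


pending = sync_caller()
is_coroutine = inspect.iscoroutine(pending)
pending.close()  # discard it to avoid a "never awaited" warning

answer = asyncio.run(async_caller())
print(is_coroutine, answer)  # True 42
```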

      Halting Problems

      The above example shows another problem that async/await causes: what if we never resolve? A normal function call eventually returns, the stack unwinds, and we're ready to receive the result. In an async world, someone has to call resolve at the very end. What if that is never called? Now in theory, that does not seem all that different from someone calling sleep() with a large number to suspend for a very long time, or waiting on a pipe that never gets data sent into it. But it is different! In one case, we keep the call stack and everything that relates to it alive; in another case, we just have a promise and are waiting for independent garbage collection with everything already unwound.

      Contract wise, there is absolutely nothing that says one has to call resolve. As we know from theory the halting problem is undecidable so it's going to be actually impossible to know if someone will call resolve or not.

      That sounds pedantic, but it's very important because promises/futures and async/await are making something strictly worse than not having them. Let's consider a JavaScript promise to be the most canonical example of what this looks like. A promise is created by an anonymous function, that is invoked to eventually call resolve. Take this example:

      let neverSettle = new Promise((resolve) => {
        // this function ends, but we never called resolve
      });
      

      Let me clarify first that this is not a JavaScript specific problem, but it's nice to show it this way. This is a completely legal thing! It's a promise, that never resolves. That is not a bug! The anonymous function in the promise itself will return, the stack will unwind, and we are left with a “pending” promise that will eventually get garbage collected. That is a bit of a problem because since it will never resolve, you can also never await it.

      Think of the following example, which demonstrates this problem a bit. In practice you might want to reduce how many things can work at once, so let's imagine a system that can handle up to 10 things that run concurrently. So we might want to use a semaphore to give out 10 tokens so up to 10 things can run at once; otherwise, it applies back pressure. So the code looks like this:

      const semaphore = new Semaphore(10);
      
      async function execute(f) {
        let token = await semaphore.acquire();
        try {
          await f();
        } finally {
          await semaphore.release(token);
        }
      }
      

      But now we have a problem. What if the function passed to the execute function returns neverSettle? Well, clearly we will never release the semaphore token. This is strictly worse compared to blocking functions! The closest equivalent would be a stupid function that calls a very long running sleep. But it's different! In one case, we keep the call stack and everything that relates to it alive; in the other case we just have a promise that will eventually get garbage collected, and we will never see it again. In the promise case, we have effectively decided that the stack is not useful.

      There are ways to fix this, like making promise finalization available so we can get informed if a promise gets garbage collected etc. However I want to point out that as per contract, what this promise is doing is completely acceptable and we have just caused a new problem, one that we did not have before.

      And if you think Python does not have that problem, it does too. Just await Future() and you will be waiting until the heat death of the universe (or really when you shut down your interpreter).
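
As a rough Python counterpart to the semaphore example, the sketch below awaits a future that nobody resolves while holding the only token. The names `never_settle` and `execute` mirror the JavaScript above and are otherwise hypothetical; a timeout is used only so that the demonstration terminates.

```python
import asyncio


async def never_settle():
    # A future that nobody will ever resolve.
    await asyncio.get_running_loop().create_future()


async def execute(semaphore, f):
    async with semaphore:  # acquire the token, release it on exit
        await f()


async def main():
    semaphore = asyncio.Semaphore(1)
    task = asyncio.create_task(execute(semaphore, never_settle))
    await asyncio.sleep(0.01)  # let the task grab the only token

    try:
        # The token is never handed back, so this can only time out.
        await asyncio.wait_for(semaphore.acquire(), timeout=0.05)
        acquired = True
    except asyncio.TimeoutError:
        acquired = False

    task.cancel()  # clean up so the demo can exit
    await asyncio.gather(task, return_exceptions=True)
    return acquired


second_acquire_succeeded = asyncio.run(main())
print(second_acquire_succeeded)  # False
```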

      The promise that sits there unresolved has no call stack. But that problem also comes back in other ways, even if you use it correctly. The decomposed functions calling functions via the scheduler flow means that now you need extra affordances to stitch these async calls together into full call stacks. This all creates extra problems that did not exist before. Call stacks are really, really important. They help with debugging and are also crucial for profiling.

      Blocking is an Abstraction

      Okay, so we know there is at least some challenge with the promise model. What other abstractions are there? I will make the argument that a function being able to “suspend” a thread of execution is a bloody great capability and abstraction. Think of it for a moment: no matter where I am, I can say I need to wait for something and continue later where I left off. This is particularly crucial to apply back-pressure if you decide you need it later. The biggest footgun in Python asyncio remains that write is non blocking. That function will stay problematic forever and you need to follow up with await s.drain() to avoid buffer bloat.
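
The write/drain pairing can be sketched as follows. `send_all` is a hypothetical helper that would work with a real `asyncio.StreamWriter`; `FakeWriter` is an illustrative stand-in (not part of asyncio) so the example runs without a network connection.

```python
import asyncio


async def send_all(writer, chunks):
    for chunk in chunks:
        writer.write(chunk)   # never blocks: only appends to a buffer
        await writer.drain()  # suspends while the buffer is over its limit


class FakeWriter:
    """Hypothetical stand-in for asyncio.StreamWriter, for illustration only."""

    def __init__(self):
        self.buffered = 0
        self.max_buffered = 0

    def write(self, data):
        self.buffered += len(data)
        self.max_buffered = max(self.max_buffered, self.buffered)

    async def drain(self):
        await asyncio.sleep(0)  # pretend the peer consumed the data
        self.buffered = 0


writer = FakeWriter()
asyncio.run(send_all(writer, [b"x" * 1024] * 100))
print(writer.max_buffered)  # 1024: never more than one chunk is buffered
```

Dropping the `await writer.drain()` line in this sketch would let all 100 chunks pile up in the buffer before anything is consumed.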

      In particular it's an important abstraction because in the real world we are constantly faced with things in fact not being async all the time, and some of the things we think might not block will in fact block. Just like Python did not think that write should be able to block when it was designed. I want to give you a colorful example of this. Why is the following code blocking, and where does it block?

      def decode_object(idx):
          header = indexes[idx]
          object_buf = buffer[header.start:header.start + header.size]
          return brotli.decompress(object_buf)
      

      It's a bit of a trick question, but not really. The reason it's blocking is because memory access can be blocking! You might not think of it this way, but there are many reasons why just touching a memory region can take time. The most obvious one is memory-mapped files. If you're touching a page that hasn't been loaded yet, the operating system will have to shovel it into memory before returning back to you. There is no “await touching this memory” expression, because if there were, we would have to await everywhere. That might sound petty but blocking memory reads were at the source of a series of incidents at Sentry [1].
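
A small sketch of the memory-mapped case, using a throwaway temporary file as an assumption for illustration: the slice below looks like ordinary memory access, yet it is the point where the operating system may have to load a page from disk.

```python
import mmap
import os
import tempfile

# Write a small file to map; a real workload would map something far larger.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"header" + b"\0" * 4096 + b"payload")
    path = f.name

with open(path, "rb") as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as buffer:
        start = len(buffer) - 7
        # Looks like a plain slice, but touching a page that is not resident
        # makes the OS load it first; the calling thread waits meanwhile.
        tail = buffer[start:start + 7]

os.remove(path)
print(tail)  # b'payload'
```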

      The trade-off that async/await makes today rests on the idea that not everything needs to block or suspend. The reality, however, has shown me that many more things really want to suspend, and if a random memory access is a case for suspending, then is the abstraction worth anything?

      So maybe allowing any function call to block and suspend really was the right abstraction to begin with.

      But then we need to talk about spawning threads next, because a single thread is not worth much. The one affordance that an async/await system gives you that you don't have otherwise is actually telling two things to run concurrently. You get that by starting the async operation and deferring the awaiting to later. This is where I will have to concede that async/await has something going for it. It moves the reality of concurrent execution right into the language. The reason concurrency comes so naturally to a Scratch programmer is that it's right there, so async/await serves a very similar purpose here.

      In a traditional imperative language based on threads, the act of spawning a thread is usually hidden behind an (often convoluted) standard library function. More annoyingly, threads very much feel bolted on and completely inadequate for even the most basic of operations. Because not only do we want to spawn threads, we want to join on them, and we want to send values across thread boundaries (including errors!). We want to wait for either a task to be done, or a keyboard input, messages being passed etc.

      Classic Threading

      So let's focus on threads for a second. As said before, what we are looking for is the ability for any function to yield / suspend. That's what threads allow us to do!

      When I am talking about “threads” here, I'm not necessarily referring to a specific kind of implementation of threads. Think of the example of promises from above for a moment: we had the concept of “sleeping”, but we did not really say how that is implemented. There is clearly some underlying scheduler that can enable that, but how that takes place is outside the scope of the language. Threads can be like that. They could be real OS threads, they could be virtual and be implemented with fibers or coroutines. At the end of the day, we don't necessarily have to care about it as developers if the language gets it right.

      The reason this matters is that when I talk about “suspending” or “continuing somewhere else,” immediately the thought of coroutines and fibers comes to mind. That's because many languages that support them give you those capabilities. But it's good to step back for a second and just think about general affordances that we want, and not how they are implemented.

      We need a way to say: run this concurrently, but don't wait for it to return, we want to wait later (or never!). Basically, the equivalent, in some languages, of calling an async function but not awaiting it. In other words: to schedule a function call. And that is, in essence, just what spawning a thread is. If we think about Scratch: one of the reasons concurrency comes naturally there is because it's really well integrated, and a core affordance of the language. There is a real programming language that works very much the same: Go with its goroutines. There is syntax for it!

      So now we can spawn, and that thing runs. But now we have more problems to solve: synchronization, waiting, message passing and all that jazz are not solved. Even Scratch has answers to that! So clearly there is something else missing to make this work. And what even does that spawn call return?
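
One possible answer to what a spawn call returns, sketched with Python's standard library rather than presented as the design argued for here: `ThreadPoolExecutor.submit` schedules the call and hands back a future that can be waited on later. `slow_square` is a hypothetical example function.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_square(n: int) -> int:
    time.sleep(0.05)  # any blocking call would do
    return n * n


with ThreadPoolExecutor(max_workers=2) as pool:
    # submit() schedules the call and returns immediately with a Future.
    future = pool.submit(slow_square, 7)
    done_right_away = future.done()
    result = future.result()  # wait later, only when the value is needed

print(result)  # 49
```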

      A Detour: What is Async Even

      There is an irony in async/await and that irony is that it exists in multiple languages, it looks completely the same on the surface, but works completely differently under the hood. Not only that, the origin stories of async/await in different languages are not even the same.

      I mentioned earlier that code that can arbitrarily block is an abstraction of sorts. That abstraction for many applications really only makes sense if the CPU time while you're blocking can be used in other useful ways. On the one hand, because the computer would be pretty bored if it was only doing things in sequence; on the other hand, because we might need things to run in parallel. At times as programmers we need two things to make progress simultaneously before we can continue. Enter creating more threads. But if threads are so great, why all that talk about the coroutines and promises that underpin so much of async/await in different languages?

      I think this is the point where the story actually becomes confusing quickly. For instance JavaScript has entirely different challenges than Python, C# or Rust. Yet somehow all those languages ended up with a form of async/await.

      Let's start with JavaScript. JavaScript is a single threaded language where a function scope cannot yield. There is no affordance in the language to do that and threads do not exist. So before async/await, the best you could do is different forms of callback hell. The first iteration of improving that experience was adding promises. async/await only became sugar for that afterward. The reason that JavaScript did not have much choice here is that promises were the only thing that could be accomplished without language changes, and async/await is something that can be implemented as a transpilation step. So really, there are no threads in JavaScript. But here is an interesting thing that happens: JavaScript on the language level has the concept of concurrency. If you call setTimeout, you tell the runtime to schedule a function to be called later. This is crucial! In particular it also means that a promise, once created, will be scheduled automatically. Even if you forget about it, it will run!

      Python on the other hand had a completely different origin story. In the days before async/await, Python already had threads: real, operating system level threads. What it did not have however was the ability for multiple of those threads to run in parallel. The reason for this is obviously the GIL (Global Interpreter Lock). However that “just” makes things not scale to more than one core, so let's ignore that for a second. Because it had threads, it also rather early had people experiment with implementing virtual threads in Python. Back in the day (and to some extent today) the cost of an OS level thread was pretty high, so virtual threads were seen as a fast way to spawn more of these concurrent things. There were two ways in which Python got virtual threads. One was the Stackless Python project, which was an alternative implementation of Python (or rather, many patches for CPython) that implemented what's called a “stackless VM” (basically a VM that does not maintain a C stack). In short, what that enabled is implementing something that Stackless called “tasklets”, which were functions that could be suspended and resumed. Stackless did not have a bright future because the stackless nature meant that you could not have interleaving Python -> C -> Python calls and suspend with them on the stack.

      There was a second attempt in Python called “greenlet”. The way greenlet worked was implementing coroutines in a custom extension module. It is pretty gnarly in its implementation, but it does allow for cooperative multi tasking. However, like stackless, that did not win out. Instead, what actually happened is that the generator system that Python had for years was gradually upgraded into a coroutine system with syntax support, and the async system was built on top of that.

      One of the consequences of this is that it requires syntax support to suspend from a coroutine. This meant that you cannot implement a function like sleep that, when called, yields to a scheduler. You need to await it (or in earlier times you could use yield from). So we ended up with async/await because of how coroutines work in Python under the hood. The motivation for this was that it was seen as a positive thing that you know when something suspends.
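
The generator heritage can be illustrated with a toy round-robin scheduler. This is a minimal sketch, not how asyncio is implemented; `counter` and `run_round_robin` are hypothetical names. The point is that the suspension has to be spelled out with `yield` inside the generator itself, since an ordinary function called from it could not suspend on its behalf.

```python
from collections import deque


def counter(name, n, log):
    for i in range(n):
        log.append(f"{name}:{i}")
        yield  # explicit suspension point: hand control back to the scheduler


def run_round_robin(tasks):
    # Minimal hypothetical scheduler: resume each generator in turn.
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # this task is finished, drop it
        queue.append(task)


log = []
run_round_robin([counter("a", 2, log), counter("b", 2, log)])
print(log)  # ['a:0', 'b:0', 'a:1', 'b:1']
```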

      One interesting consequence of the Python coroutine model is that, at least at the coroutine level, it can transcend OS level threads. I could make a coroutine on one thread, ship it off to another, and continue it there. In practice, that does not work because once hooked up with the IO system, it cannot travel to another event loop on another thread any more. But you can already see that fundamentally it does something quite different to JavaScript. It can travel between threads at least in theory; there are threads; there is syntax to yield. A coroutine in Python will also start out not running, unlike in JavaScript where it's effectively always scheduled. This is also in part because the scheduler in Python can be swapped out, and there are competing and incompatible implementations.
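
That a Python coroutine starts out not running can be checked directly. In this small sketch (the `work` function and `log` list are hypothetical), calling the async function has no effect until some scheduler drives the resulting coroutine object.

```python
import asyncio

log = []


async def work():
    log.append("ran")


coro = work()           # creates a coroutine object, runs nothing
ran_before = list(log)  # still empty at this point
asyncio.run(coro)       # only now does an event loop drive it
ran_after = list(log)

print(ran_before, ran_after)  # [] ['ran']
```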

      Lastly let's talk about C#. Here the origin story is once again entirely different. C# has real threads. Not only does it have real threads, it also has per-object locks and absolutely no problems with dealing with multiple threads running in parallel. But that does not mean that it does not have other issues. The reality is that threads alone are just not enough. You need to synchronize and talk between threads quite often and sometimes you just need to wait. For instance you need to wait for user input. You still want to do something, while you're stuck there processing that input. So over time .NET introduced “tasks” which are an abstraction over async operations. They are part of the .NET threading system and the way you interact with them is that you write your code in there, you can suspend from tasks with syntax. .NET will run the task on the current thread, and if you do some blocking you stay blocked. This is in that sense, quite different from JavaScript where while no new “thread” is created, you pend the execution in the scheduler. The reason it works this way in .NET is that some of the motivation of this system was to allow UI triggered code to access the main UI thread without blocking it. But the consequence again is, that if you block for real, you just screwed something up. That however is also why at least at one point what C# did was just to splice functions into chained closures whenever it hit an await. It just decomposes one logical piece of code into many separate functions.

      I really don't want to go into Rust, but Rust's async system is probably the weirdest of them all because it's polling-based. In short: unless you actively “wait” for a task to complete, it will not make progress. So the purpose of a scheduler there is to make sure that a task actually can make progress. Why did Rust end up with async/await? Primarily because they wanted something that works without a runtime and a scheduler, and because of the limitations of the borrow checker and memory model.

      Of all those languages, I think the argument for async/await is the strongest for Rust and JavaScript. Rust because it's a systems language and they wanted a design that works with a limited runtime. JavaScript to me also makes sense because the language does not have real threads, so the only alternative to async/await is callbacks. But for C# the argument seems much weaker. Even the problem of having to force code to run on the UI thread could be solved just by having a scheduling policy for virtual threads. The worst offender here in my mind is Python. async/await has ended up with a really complex system where the language now has coroutines and real threads, different synchronization primitives for each and async tasks that end up being pinned to one OS thread. The language even has different futures in the standard library for threads and async tasks!
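
The two kinds of futures can be seen side by side in a short sketch. `blocking_work` is a hypothetical function; `asyncio.wrap_future` is the standard library bridge that turns the thread-side future into one an event loop can await.

```python
import asyncio
import concurrent.futures


def blocking_work() -> str:
    return "done"


async def main():
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        # Future number one: the thread pool's own future type.
        thread_future = pool.submit(blocking_work)
        # Future number two: an asyncio future wrapping the first one.
        async_future = asyncio.wrap_future(thread_future)
        value = await async_future
    return type(thread_future), type(async_future), value


thread_type, async_type, value = asyncio.run(main())
print(thread_type is concurrent.futures.Future)        # True
print(issubclass(async_type, concurrent.futures.Future))  # False
```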

      The reason I wanted you to understand all this is that all these different languages share the same syntax, yet what you can do with it is completely different. What they all have in common is that async functions can only be called by async functions (or the scheduler).

      What Async Isn't

      Over the years I heard a lot of arguments about why for instance Python ended up with async/await and some of the arguments presented don't hold up to scrutiny from my perspective. One argument that I have heard repeatedly is that if you control when you suspend, you don't need to deal with locking or synchronization. While there is some truth to that (you don't randomly suspend), you still end up with having to lock. There is still concurrency so you need to still protect all your stuff. In Python this is particularly frustrating because not only do you have colored functions, you also have colored locks. There are locks for threads and there are locks for async code, and they are different.
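
A minimal sketch of why locking is still needed in async code: any `await` between a read and a write is a window in which other tasks can run. The function names and the shared `state` dictionary are hypothetical, and the lock used is `asyncio.Lock`, the async-colored counterpart of `threading.Lock`.

```python
import asyncio


async def increment_unsafe(state):
    value = state["counter"]
    await asyncio.sleep(0)  # suspension point between read and write
    state["counter"] = value + 1


async def increment_locked(state, lock):
    async with lock:  # asyncio.Lock, not threading.Lock
        value = state["counter"]
        await asyncio.sleep(0)
        state["counter"] = value + 1


async def main():
    unsafe = {"counter": 0}
    await asyncio.gather(*(increment_unsafe(unsafe) for _ in range(10)))

    safe = {"counter": 0}
    lock = asyncio.Lock()
    await asyncio.gather(*(increment_locked(safe, lock) for _ in range(10)))
    return unsafe["counter"], safe["counter"]


unsafe_total, safe_total = asyncio.run(main())
print(unsafe_total, safe_total)  # 1 10
```

Without the lock, all ten tasks read 0 before any of them writes, so nine updates are lost.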

      There is a very good reason why I showed the example above of the semaphore: semaphores are real in async programming. They are very often needed to protect a system from taking on too much work. In fact, one of the core challenges that many async/await-based programs suffer from is bloating buffers because there is an inability to exert back pressure (I once again point you to my post on that). Why can they not? Because unless an API is async, it is forced to buffer or fail. What it cannot do, is block.
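
For contrast, here is a sketch of back pressure in a blocking world, using a bounded `queue.Queue` between two threads. The sizes, timings and names are arbitrary assumptions; what matters is that `put` simply blocks the producer when the buffer is full, so nothing has to be buffered without limit or rejected.

```python
import queue
import threading
import time

work = queue.Queue(maxsize=3)  # at most 3 items may be buffered
max_seen = 0
consumed = []


def producer():
    global max_seen
    for i in range(20):
        work.put(i)  # blocks while the queue is full: back pressure
        max_seen = max(max_seen, work.qsize())
    work.put(None)  # sentinel: no more work


def consumer():
    while True:
        item = work.get()
        if item is None:
            break
        time.sleep(0.002)  # simulate slow processing
        consumed.append(item)


threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(consumed), max_seen <= 3)  # 20 True
```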

      Async also does not magically solve the issues with the GIL in Python. It does not magically make real threads appear in JavaScript, and it does not solve issues when random code starts blocking (and remember, even memory access can block) or when you very slowly calculate a large Fibonacci number.
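
The Fibonacci case is easy to reproduce. In this sketch (all names hypothetical), `compute` never awaits while it calculates, so the `ticker` task cannot run in between, even though both were started concurrently.

```python
import asyncio


def fib(n: int) -> int:
    # Deliberately slow, CPU-bound, and without any suspension point.
    return n if n < 2 else fib(n - 1) + fib(n - 2)


async def ticker(events):
    for i in range(3):
        events.append(f"tick {i}")
        await asyncio.sleep(0)  # gives other tasks a chance to run


async def compute(events):
    events.append("compute start")
    fib(18)  # the event loop is stuck here until this returns
    events.append("compute end")


async def main():
    events = []
    await asyncio.gather(ticker(events), compute(events))
    return events


events = asyncio.run(main())
print(events)
```

The recorded order is `tick 0`, `compute start`, `compute end`, `tick 1`, `tick 2`: no tick lands between the start and the end of the computation.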

      Threads are the Answer, Not Coroutines

      I already alluded to this above a few times, but when we think about being able to “suspend” from an arbitrary point in time, we often immediately think of coroutines as programmers. For good reasons: coroutines are amazing, they are fun, and every programming language should have them!

      Coroutines are an important building block, and if any future language designer is looking at this post: you should put them in.

      But coroutines should be very lightweight, and they can be abused in ways that make it very hard to follow what's going on. Lua, for instance, gives you coroutines, but it does not give you the necessary structure to do something with them easily. You will end up building your own scheduler, your own threading system, etc.

      So what we really want is where we started out with: threads! Good old threads!

      The irony in all of this is that the language that I think actually got this right is modern Java. Project Loom in Java has coroutines and all the bells and whistles under the hood, but what it exposes to the developer is good old threads. There are virtual threads, which are mounted on carrier OS threads, and these virtual threads can travel from thread to thread. If you end up issuing a blocking call on a virtual thread, it yields to the scheduler.

      Now I happen to think that threads alone are not good enough! Threads require synchronization, they require communication primitives etc. Scratch has message passing! So there is more that needs to be built to make them work well.

      I want to follow up in another blog post about what is needed to make threads easier to work with. Because what async/await clearly innovated is bringing some of these core capabilities closer to the user of the language, and often modern async/await code looks easier to read than traditional code using threads.

      Structured Concurrency and Channels

      Lastly I do want to say something nice about async/await and celebrate the innovations that it has brought up. I believe that this language feature singlehandedly drove some crucial innovation in concurrent programming by making it widely accessible. In particular it moved many developers from a basic “single thread per request” model to breaking down tasks into smaller chunks, even in languages like Python. For me, the biggest innovation here goes to Trio, which introduced the concept of structured concurrency via its nursery. That concept has eventually found a home even in asyncio with the TaskGroup API and is finding its way into Java.

      I recommend you read Nathaniel J. Smith's Notes on structured concurrency, or: Go statement considered harmful for a much better introduction. However if you are unfamiliar with it, here is my attempt at explaining it:

      • There is a clear start and end of work: every thread or task has a clear beginning and end, which makes it easier to follow what each thread is doing. All threads spawned in the context of a thread, are known to that thread. Think of it like creating a small team to work on a task: they start together, finish together, and then report back.
      • Threads don't outlive their parent: if for whatever reason the parent is done before the children threads, it automatically awaits before returning.
      • Errors propagate and cause cancellations: If something goes wrong in one thread, the error is passed back to the parent. But more importantly, it also automatically causes other child threads to cancel. Cancellations are a core of the system!

      I believe that structured concurrency needs to become a thing in a threaded world. Threads must know their parents and children. Threads also need to find convenient ways to pass their success values back. Lastly context should flow from thread to thread implicitly through context locals.
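
As a very rough approximation of these properties with plain threads, the sketch below uses `ThreadPoolExecutor` as a scope: leaving the `with` block joins every child, and errors surface in the parent when results are read. It is only an assumption-laden illustration; in particular it does not cancel sibling threads when one fails, which is the part that structured concurrency adds on top. All function names are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor

finished = []


def child(name: str, delay: float) -> str:
    time.sleep(delay)
    finished.append(name)
    return name


def failing_child():
    raise ValueError("child failed")


def parent():
    error = None
    # The `with` block is the scope: leaving it joins every child thread.
    with ThreadPoolExecutor() as scope:
        a = scope.submit(child, "a", 0.02)
        b = scope.submit(child, "b", 0.01)
        c = scope.submit(failing_child)
    # All children are done here; errors surface when results are read.
    try:
        c.result()
    except ValueError as exc:
        error = str(exc)
    return a.result(), b.result(), error


results = parent()
print(results)  # ('a', 'b', 'child failed')
```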

      The second part is that async/await made it much more apparent that tasks / threads need to talk with each other. In particular the concept of channels and selecting on channels became more prevalent. This is an essential building block which I think can be further improved upon. As food for thought: if you have structured concurrency, in principle each thread's return value really can be represented as a buffered channel attached to the thread, holding up to a single value (successful return value or error) that you can select on.
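
The food for thought above can be sketched with a one-slot `queue.Queue` standing in for the buffered channel. `spawn` is a hypothetical helper, and Python's standard library has no way to select across several queues, so only the "single value or error" part is shown.

```python
import queue
import threading


def spawn(fn, *args):
    # Hypothetical helper: the returned queue acts as a one-slot channel
    # that will hold either the return value or the raised error.
    channel = queue.Queue(maxsize=1)

    def run():
        try:
            channel.put(("ok", fn(*args)))
        except Exception as exc:
            channel.put(("error", exc))

    threading.Thread(target=run, daemon=True).start()
    return channel


ok_kind, ok_value = spawn(pow, 2, 10).get()
err_kind, err_value = spawn(int, "not a number").get()

print(ok_kind, ok_value)               # ok 1024
print(err_kind, type(err_value).__name__)  # error ValueError
```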

      Today, although no language has perfected this model, thanks to many years of experimentation, the solution seems clearer than ever, with structured concurrency at its core.

      Conclusion

      I hope I was able to demonstrate to you that async/await has been a mixed bag. It brought some relief from callback hell, but it also saddled us with new issues like colored functions and new back-pressure challenges, and introduced entirely new problems such as promises that can just sit around forever without resolving. It has also taken away a lot of utility that call stacks brought, in particular for debugging and profiling. These aren't minor hiccups; they're real obstacles that get in the way of the straightforward, intuitive concurrency we should be aiming for.

      If we take a step back, it seems pretty clear to me that we have veered off course by adopting async/await in languages that have real threads. Innovations like Java's Project Loom feel like the right fit here. Virtual threads can yield when they need to, switch contexts when blocked, and even work with message-passing systems that make concurrency feel natural. If we free ourselves from the idea that the functional, promise system has figured out all the problems we can look at threads properly again.

      However at the same time async/await has moved concurrent programming to the forefront and has resulted in real innovation. Making concurrency a core feature of the language (via syntax even!) is a good thing. Maybe the increased adoption and people struggling with it, was what made structured concurrency a real thing in the Python async/await world.

      Future language design should rethink concurrency once more: Instead of adopting async/await, new languages should model themselves more like Java's Project Loom but with more user friendly primitives. But like Scratch, it should give programmers really good APIs that make concurrency natural. I don't think actor frameworks are the right fit, but a combination of structured concurrency, channels, syntax support for spawning/joining/selecting will go a long way. Watch this space for a future blog post about some things I found to work better than others.

      [1] Sentry works with large debug information files such as PDB or DWARF. These files can be gigabytes in size, and we memory map terabytes of preprocessed files into memory during processing. That memory mapped files can block is hardly a surprise, but what we learned in the process is that, thanks to containerization and memory limits, you can easily navigate yourself into a situation where you spend much more time on page faults than you expected and the system crawls to a halt.

      November 18, 2024 12:00 AM UTC

      November 17, 2024


      Django Weblog

      2025 DSF Board Election Results

      The 2025 DSF Board Election has closed, and the following candidates have been elected:

      They will each serve a two-year term.

      The directors elected for the 2024 DSF Board, Jacob, Sarah, and Thibaud, are continuing with one year left to serve on the board.

      Therefore, the combined 2025 DSF Board of Directors are:

      Congratulations to our winners, and a huge thank you to our departing board members Çağıl Uluşahin Sonmez, Chaim Kirby, Kátia Yoshime Nakamura, and Katie McLaughlin.

      Thank you again to everyone who nominated themselves. Even if you were not successful, you gave our community the chance to make their voices heard in who they wanted to represent them.

      November 17, 2024 11:56 PM UTC


      Paolo Melchiorre

      Thoughts on my election as a DSF board member

      My thoughts on my election as a member of the Django Software Foundation (DSF) board of directors.

      November 17, 2024 11:00 PM UTC


      Real Python

      Using the Python zip() Function for Parallel Iteration

      Python’s zip() function combines elements from multiple iterables. Calling zip() generates an iterator that yields tuples, each containing elements from the input iterables. This function is essential for tasks like parallel iteration and dictionary creation, offering an efficient way to handle multiple sequences in Python programming.

      By the end of this tutorial, you’ll understand that:

      • zip() in Python aggregates elements from multiple iterables into tuples, facilitating parallel iteration.
      • dict(zip()) creates dictionaries by pairing keys and values from two sequences.
      • zip() is lazy in Python, meaning it returns an iterator instead of a list.
      • There’s no unzip() function in Python, but the same zip() function can reverse the process using the unpacking operator *.
      • Alternatives to zip() include itertools.zip_longest() for handling iterables of unequal lengths.

      In this tutorial, you’ll explore how to use zip() for parallel iteration. You’ll also learn how to handle iterables of unequal lengths and discover the convenience of using zip() with dictionaries. Whether you’re working with lists, tuples, or other data structures, understanding zip() will enhance your coding skills and streamline your Python projects.
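
      As a quick preview of the points listed above, here is a minimal sketch of building a dictionary with dict(zip()), reversing a zip with the unpacking operator *, and padding unequal inputs with itertools.zip_longest(). The sample data and variable names are made up for illustration.

```python
from itertools import zip_longest

# Hypothetical sample data, chosen only for illustration.
fields = ["name", "language", "year"]
values = ["Ada", "Python", 2024]

# dict(zip()) pairs each key with the value at the same position.
record = dict(zip(fields, values))

# "Unzipping": unpack the pairs with * and zip them again.
pairs = [(1, "a"), (2, "b"), (3, "c")]
numbers, letters = zip(*pairs)

# zip_longest() keeps going until the longest input is exhausted
# and fills the gaps with fillvalue.
padded = list(zip_longest([1, 2, 3], "ab", fillvalue=None))

print(record)            # {'name': 'Ada', 'language': 'Python', 'year': 2024}
print(numbers, letters)  # (1, 2, 3) ('a', 'b', 'c')
print(padded)            # [(1, 'a'), (2, 'b'), (3, None)]
```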

      Free Bonus: 5 Thoughts On Python Mastery, a free course for Python developers that shows you the roadmap and the mindset you’ll need to take your Python skills to the next level.

      Understanding the Python zip() Function

      zip() is available in the built-in namespace. If you use dir() to inspect __builtins__, then you’ll see zip() at the end of the list:

      Python
      >>> dir(__builtins__)
      ['ArithmeticError', 'AssertionError', 'AttributeError', ..., 'zip']
      

      You can see that 'zip' is the last entry in the list of available objects.

      According to the official documentation, Python’s zip() function behaves as follows:

      Returns an iterator of tuples, where the i-th tuple contains the i-th element from each of the argument sequences or iterables. The iterator stops when the shortest input iterable is exhausted. With a single iterable argument, it returns an iterator of 1-tuples. With no arguments, it returns an empty iterator. (Source)
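
      Two of the cases in this definition are easy to check directly. The following sketch, using made-up inputs, shows the 1-tuples produced from a single iterable and the way iteration stops at the shortest input:

```python
# With a single iterable, each item is wrapped in a 1-tuple.
singles = list(zip("abc"))

# With inputs of different lengths, iteration stops at the shortest one.
truncated = list(zip(range(5), "xy", [True, False, True]))

print(singles)    # [('a',), ('b',), ('c',)]
print(truncated)  # [(0, 'x', True), (1, 'y', False)]
```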

      You’ll unpack this definition throughout the rest of the tutorial. As you work through the code examples, you’ll see that Python zip operations work just like the physical zipper on a bag or pair of jeans. Interlocking pairs of teeth on both sides of the zipper are pulled together to close an opening. In fact, this visual analogy is perfect for understanding zip(), since the function was named after physical zippers!

      Using zip() in Python

      The signature of Python’s zip() function is zip(*iterables, strict=False). You’ll learn more about strict later. The function takes in iterables as arguments and returns an iterator. This iterator generates a series of tuples containing elements from each iterable. zip() can accept any type of iterable, such as files, lists, tuples, dictionaries, sets, and so on.

      Passing n Arguments

      If you use zip() with n arguments, then the function will return an iterator that generates tuples of length n. To see this in action, take a look at the following code block:

      Python
      >>> numbers = [1, 2, 3]
      >>> letters = ["a", "b", "c"]
      >>> zipped = zip(numbers, letters)
      >>> zipped  # Holds an iterator object
      <zip object at 0x7fa4831153c8>
      
      >>> type(zipped)
      <class 'zip'>
      
      >>> list(zipped)
      [(1, 'a'), (2, 'b'), (3, 'c')]
      

      Here, you use zip(numbers, letters) to create an iterator that produces tuples of the form (x, y). In this case, the x values are taken from numbers and the y values are taken from letters. Notice how the Python zip() function returns an iterator. To retrieve the final list object, you need to use list() to consume the iterator.
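
      Because zip() returns an iterator, it can only be consumed once. A small sketch of what that means in practice, reusing the lists from the example above:

```python
numbers = [1, 2, 3]
letters = ["a", "b", "c"]

zipped = zip(numbers, letters)
first_pass = list(zipped)   # consumes the iterator
second_pass = list(zipped)  # nothing left to yield

# In a loop, you can unpack each tuple directly without building a list.
lines = [f"{number}: {letter}" for number, letter in zip(numbers, letters)]

print(first_pass)   # [(1, 'a'), (2, 'b'), (3, 'c')]
print(second_pass)  # []
print(lines)        # ['1: a', '2: b', '3: c']
```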

      If you’re working with sequences like lists, tuples, or strings, then your iterables are guaranteed to be evaluated from left to right. This means that the resulting list of tuples will take the form [(numbers[0], letters[0]), (numbers[1], letters[1]),..., (numbers[n], letters[n])]. However, for other types of iterables (like sets), you might see some weird results:

      Python
      >>> s1 = {2, 3, 1}
      >>> s2 = {"b", "a", "c"}
      >>> list(zip(s1, s2))
      [(1, 'a'), (2, 'c'), (3, 'b')]
      

      In this example, s1 and s2 are set objects, which don’t keep their elements in any particular order. This means that the tuples returned by zip() will pair up elements in an arbitrary order that you shouldn’t rely on. If you’re going to use the Python zip() function with unordered iterables like sets, then this is something to keep in mind.
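
      If you need a predictable pairing from sets, one option is to impose an order before zipping. This is only a sketch of one possible approach, and it assumes the elements of each set can be compared with one another:

```python
s1 = {2, 3, 1}
s2 = {"b", "a", "c"}

# Sorting each set first gives a deterministic, repeatable pairing.
pairs = list(zip(sorted(s1), sorted(s2)))

print(pairs)  # [(1, 'a'), (2, 'b'), (3, 'c')]
```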

      Passing No Arguments

      You can call zip() with no arguments as well. In this case, you’ll simply get an empty iterator:

      Python
      >>> zipped = zip()
      >>> zipped
      <zip object at 0x7f196294a488>
      
      >>> list(zipped)
      []
      

      Here, you call zip() with no arguments, so your zipped variable holds an empty iterator. If you consume the iterator with list(), then you’ll see an empty list as well.

      Read the full article at https://realpython.com/python-zip-function/ »


      [ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

      November 17, 2024 02:00 PM UTC


      Test and Code

      223: Writing Stuff Down is a Super Power

      Taking notes well can help you listen better, remember things, show respect, be more accountable, and free up mind space to solve problems.

      This episode discusses

      • the benefits of writing things down
      • preparing for a meeting
      • taking notes in meetings
      • reviewing notes for action items, todo items, things to follow up on, etc.
      • taking notes to allow for better focus
      • writing well structured emails
      • writing blog posts and books

      Learn pytest

      • pytest is the number one test framework for Python.
      • Learn the basics super fast with Hello, pytest! (https://courses.pythontest.com/hello-pytest)
      • Then later you can become a pytest expert with The Complete pytest Course (https://courses.pythontest.com/the-complete-pytest-course)
      • Both courses are at courses.pythontest.com (https://courses.pythontest.com/)

      November 17, 2024 01:55 AM UTC

      November 16, 2024


      Real Python

      Using the len() Function in Python

      The len() function in Python is a powerful and efficient tool used to determine the number of items in objects, such as sequences or collections. You can use len() with various data types, including strings, lists, dictionaries, and third-party types like NumPy arrays and pandas DataFrames. Understanding how len() works with different data types helps you write more efficient and concise Python code.

      Using len() in Python is straightforward for built-in types, but you can extend it to your custom classes by implementing the .__len__() method. This allows you to customize what length means for your objects. For example, with pandas DataFrames, len() returns the number of rows. Mastering len() not only enhances your grasp of Python’s data structures but also empowers you to craft more robust and adaptable programs.
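
      To make the .__len__() idea concrete, here is a minimal sketch of a custom class that supports len(). The Playlist class and its attribute names are hypothetical and exist only for this illustration.

```python
class Playlist:
    """Hypothetical container used only to illustrate .__len__()."""

    def __init__(self, tracks):
        self._tracks = list(tracks)

    def __len__(self):
        # len(playlist) calls this method; it must return a non-negative int.
        return len(self._tracks)


playlist = Playlist(["Intro", "Verse", "Outro"])
print(len(playlist))  # 3
```

      Here, the class decides that its length is the number of tracks, but you could define length differently if that made more sense for your object.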

      By the end of this tutorial, you’ll understand that:

      • The len() function in Python returns the number of items in an object, such as strings, lists, or dictionaries.
      • To get the length of a string in Python, you use len() with the string as an argument, like len("example").
      • To find the length of a list in Python, you pass the list to len(), like len([1, 2, 3]).
      • The len() function operates in constant time, O(1), as it accesses a length attribute in most cases.

      In this tutorial, you’ll learn when to use the len() Python function and how to use it effectively. You’ll discover which built-in data types are valid arguments for len() and which ones you can’t use. You’ll also learn how to use len() with third-party types like ndarray in NumPy and DataFrame in pandas, and with your own classes.

      Free Bonus: Click here to get a Python Cheat Sheet and learn the basics of Python 3, like working with data types, dictionaries, lists, and Python functions.

      Getting Started With Python’s len()

      The function len() is one of Python’s built-in functions. It returns the length of an object. For example, it can return the number of items in a list. You can use the function with many different data types. However, not all data types are valid arguments for len().
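
      When an object doesn't support len(), Python raises a TypeError. The sketch below wraps that behavior in a small helper; safe_len() is a made-up name for this example, not a built-in.

```python
def safe_len(obj):
    """Return len(obj), or None if obj doesn't support len()."""
    try:
        return len(obj)
    except TypeError:
        return None


print(safe_len([1, 2, 3]))              # 3
print(safe_len(42))                     # None: integers have no length
print(safe_len(x for x in range(3)))    # None: generators have no length
```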

      You can start by looking at the help for this function:

      Python
      >>> help(len)
      Help on built-in function len in module builtins:
      len(obj, /)
          Return the number of items in a container.
      

      The function takes an object as an argument and returns the length of that object. The documentation for len() goes a bit further:

      Return the length (the number of items) of an object. The argument may be a sequence (such as a string, bytes, tuple, list, or range) or a collection (such as a dictionary, set, or frozen set). (Source)

      When you use built-in data types and many third-party types with len(), the function doesn’t need to iterate through the data structure. The length of a container object is stored as an attribute of the object. The value of this attribute is modified each time items are added to or removed from the data structure, and len() returns the value of the length attribute. This ensures that len() works efficiently.
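
      The same bookkeeping idea can be sketched in pure Python. The following hypothetical CountedStack class isn't how the built-in types are implemented internally; it only illustrates keeping a counter up to date so that len() never has to walk the data structure.

```python
class CountedStack:
    """Hypothetical linked stack that tracks its own size."""

    def __init__(self):
        self._head = None  # (value, next_node) pairs, or None when empty
        self._size = 0

    def push(self, value):
        self._head = (value, self._head)
        self._size += 1  # update the stored length on every insertion

    def pop(self):
        if self._head is None:
            raise IndexError("pop from empty stack")
        value, self._head = self._head
        self._size -= 1  # update the stored length on every removal
        return value

    def __len__(self):
        # No traversal needed: return the stored counter.
        return self._size


stack = CountedStack()
for item in ("a", "b", "c"):
    stack.push(item)
stack.pop()
print(len(stack))  # 2
```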

      In the following sections, you’ll learn about how to use len() with sequences and collections. You’ll also learn about some data types that you cannot use as arguments for the len() Python function.

      Using len() With Built-in Sequences

      A sequence is a container with ordered items. Lists, tuples, and strings are three of the basic built-in sequences in Python. You can find the length of a sequence by calling len():

      Python
      >>> greeting = "Good Day!"
      >>> len(greeting)
      9
      
      >>> office_days = ["Tuesday", "Thursday", "Friday"]
      >>> len(office_days)
      3
      
      >>> london_coordinates = (51.50722, -0.1275)
      >>> len(london_coordinates)
      2
      

      When finding the length of the string greeting, the list office_days, and the tuple london_coordinates, you use len() in the same manner. All three data types are valid arguments for len().

      The function len() always returns an integer as it’s counting the number of items in the object that you pass to it. The function returns 0 if the argument is an empty sequence:

      Python
      >>> len("")
      0
      >>> len([])
      0
      >>> len(())
      0
      

      In the examples above, you find the length of an empty string, an empty list, and an empty tuple. The function returns 0 in each case.

      A range object is also a sequence that you can create using range(). A range object doesn’t store all the values but generates them when they’re needed. However, you can still find the length of a range object using len():

      Python
      >>> len(range(1, 20, 2))
      10
      

      This range of numbers includes the integers from 1 to 19 with increments of 2. The length of a range object can be determined from the start, stop, and step values.
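
      To see how a length can follow from start, stop, and step alone, here is a sketch of the arithmetic. The range_length() helper is a hypothetical name used for illustration, not the code Python itself runs.

```python
def range_length(start, stop, step):
    """Compute how many values range(start, stop, step) would produce."""
    if step == 0:
        raise ValueError("step must not be zero")
    if step > 0:
        span = stop - start
    else:
        # Mirror a descending range into an equivalent ascending one.
        span = start - stop
        step = -step
    # Ceiling division of span by step, clamped at zero for empty ranges.
    return max(0, (span + step - 1) // step)


print(range_length(1, 20, 2))   # 10
print(range_length(10, 0, -3))  # 4
```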

      In this section, you’ve used the len() Python function with strings, lists, tuples, and range objects. However, you can also use the function with any other built-in sequence.

      Read the full article at https://realpython.com/len-python-function/ »



      November 16, 2024 02:00 PM UTC

      November 15, 2024


      Real Python

      The Real Python Podcast – Episode #228: Maintaining the Foundations of Python & Cautionary Tales

      How do you build a sustainable open-source project and community? What lessons can be learned from Python's history and the current mess that the WordPress community is going through? This week on the show, we speak with Paul Everitt from JetBrains about navigating open-source funding and the start of the Python Software Foundation.



      November 15, 2024 12:00 PM UTC