
Planet Python

Last update: July 15, 2024 10:44 AM UTC

July 15, 2024

Kushal Das

Disable this Firefox preference to save privacy

If you are on the latest Firefox 128 (the version shipped in Fedora 40), you should uncheck the following preference to disable Privacy-Preserving Attribution. Firefox added this experimental feature and turned it on by default for everyone, which should not be the case.

Uncheck the box

You can find it in the preferences window.

July 15, 2024 08:55 AM UTC

Zato Blog

Network packet brokers and automation in Python

What is a network packet broker and what does it do?

Packet brokers are crucial for network engineers, providing a clear, detailed view of network traffic, aiding in efficient issue identification and resolution.

But what is a network packet broker (NPB) really? Why are they needed? And how can one be automated in Python?

➤ Read this article about network packet brokers and their automation in Python to find out more.

More resources

Click here to read more about using Python and Zato in telecommunications
➤ Python API integration tutorial
What is an integration platform?

July 15, 2024 04:43 AM UTC

July 14, 2024

Real Python

Quiz: Python Type Checking

In this quiz, you’ll test your understanding of Python Type Checking.

By working through this quiz, you’ll revisit type annotations and type hints, adding static types to code, running a static type checker, and enforcing types at runtime.

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

July 14, 2024 12:00 PM UTC

July 12, 2024


Finding Simple Rewrite Rules for the JIT with Z3

In June I was at the PLDI conference in Copenhagen to present a paper I co-authored with Max Bernstein. I also finally met John Regehr, whom I'd been talking to on social media for ages but had never met in person. John has been working on compiler correctness and better techniques for building compilers and optimizers for a very long time. The blog post Finding JIT Optimizer Bugs using SMT Solvers and Fuzzing was heavily inspired by this work. We talked a lot about his and his group's work on using Z3 for superoptimization and for finding missing optimizations. I have applied some of the things John told me about to the traces of PyPy's JIT, and wanted to blog about that. However, my draft felt quite hard to understand. Therefore I have now written this current post, to at least try to provide a somewhat gentler on-ramp to the topic.

In this post we will use the Python-API to Z3 to find local peephole rewrite rules for the operations in the intermediate representation of PyPy's tracing JIT. The code for this is simple enough that we can go through all of it.

The PyPy JIT produces traces of machine level instructions, which are optimized and then turned into machine code. The optimizer uses a number of approaches to make the traces more efficient. For integer operations it applies a number of arithmetic simplification rules, for example int_add(x, 0) -> x. When implementing these rules in the JIT there are two problems: How do we know that the rules are correct? And how do we know that we haven't forgotten any rules? We'll try to answer both of these questions, the first one in particular.

We'll be using Z3, a satisfiability modulo theories (SMT) solver which has good bitvector support and, most importantly, an excellent Python API. We can use the solver to reason about bitvectors, which are how we will model machine integers.

To find rewrite rules, we will consider the binary operations (i.e. those taking two arguments) in PyPy traces that take and produce integers. The completely general form op(x, y) is not simplifiable on its own. But if either x == y or one of the arguments is a constant, we can potentially simplify the operation into a simpler form. The possible results for a simplifiable binary operation are the variable x, or a (potentially different) constant. We'll ignore constant-folding where both arguments of the binary operation are constants. This leaves the following patterns as possibilities:

op(x, x) -> x
op(x, x) -> c1
op(x, c1) -> x
op(c1, x) -> x
op(x, c1) -> c2
op(c1, x) -> c2

Our approach will be to take every single supported binary integer operation, instantiate all of these patterns, and try to ask Z3 whether the resulting simplification is valid for all values of x.

Quick intro to the Z3 Python-API

Here's a terminal session showing the use of the Z3 Python API:

>>>> import z3
>>>> # construct a Z3 bitvector variable of width 8, with name x:
>>>> x = z3.BitVec('x', 8)
>>>> # construct a more complicated formula by using operator overloading:
>>>> x + x
x + x
>>>> x + 1
x + 1

Z3 checks the "satisfiability" of a formula. This means that it tries to find an example set of concrete values for the variables that occur in a formula, such that the formula becomes true. Examples:

>>>> solver = z3.Solver()
>>>> solver.check(x * x == 3)
unsat
>>>> # meaning no x fulfils this property
>>>> solver.check(x * x == 9)
sat
>>>> model = solver.model()
>>>> model
[x = 253]
>>>> model[x].as_signed_long()
-3
>>>> # 253 is the same as -3 in two's complement arithmetic with 8 bits
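The two's-complement reinterpretation in that last comment can be checked in plain Python (an aside, not part of the Z3 session):

```python
def as_signed(value, width=8):
    # reinterpret an unsigned bit pattern as a two's-complement signed integer
    if value >= 1 << (width - 1):
        return value - (1 << width)
    return value

print(as_signed(253))  # -3
```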

In order to use Z3 to prove something, we can ask Z3 to find counterexamples for the statement, meaning concrete values that would make the negation of the statement true:

>>>> solver.check(z3.Not(x ^ -1 == ~x))
unsat

The result unsat means that we just proved that x ^ -1 == ~x is true for all x, because there is no value for x that makes not (x ^ -1 == ~x) true (this works because -1 has all the bits set).
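The same identity can also be confirmed exhaustively for 8-bit values in plain Python, without Z3 (just to build confidence in what the solver is telling us):

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1  # 0xff: keep results within 8 bits

# XOR with -1 (all bits set) flips every bit, which is exactly what ~x does
assert all((x ^ (MASK & -1)) == (~x & MASK) for x in range(1 << WIDTH))
print("x ^ -1 == ~x holds for all 256 8-bit values")
```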

If we try to prove something incorrect in this way, the following happens:

>>>> solver.check(z3.Not(x ^ -1 == x))
sat

sat shows that x ^ -1 == x is (unsurprisingly) not always true, and we can ask for a counterexample:

>>>> solver.model()
[x = 0]

This way of proving things works because the check calls try to solve an (implicit) "exists" quantifier over all the Z3 variables used in the formula. check will either return z3.unsat, which means that no concrete values make the formula true; or z3.sat, which means that you can get some concrete values that make the formula true by calling solver.model().

In math terms we prove things using check by De Morgan's rules for quantifiers:

$$ \lnot \exists x: \lnot f(x) \implies \forall x: f(x) $$

Now that we've seen the basics of using the Z3 API on a few small examples, we'll use it in a bigger program.

Encoding the integer operations of RPython's JIT into Z3 formulas

Now we'll use the API to reason about the integer operations of the PyPy JIT intermediate representation (IR). The binary integer operations are:

opnames2 = [
    "int_add", "int_sub", "int_mul",
    "int_and", "int_or", "int_xor",
    "int_eq", "int_ne",
    "int_lt", "int_le", "int_gt", "int_ge",
    "uint_lt", "uint_le", "uint_gt", "uint_ge",
    "int_lshift", "int_rshift", "uint_rshift",
    "uint_mul_high", "int_pydiv", "int_pymod",
]

There's not much special about the integer operations. Like in LLVM, most of them are signedness-independent: int_add, int_sub, int_mul, ... work correctly for unsigned integers but also for two's-complement signed integers. The exceptions are order comparisons like int_lt etc., for which we have unsigned variants uint_lt etc. All operations that produce a boolean result return a full-width integer 0 or 1 (the PyPy JIT supports only word-sized integers in its intermediate representation).
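The signedness-independence of the wrapping operations is easy to check exhaustively for a small width in plain Python (an 8-bit sketch, not JIT code):

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1

def as_signed(value):
    # reinterpret an unsigned bit pattern as two's-complement signed
    return value - (1 << WIDTH) if value >= 1 << (WIDTH - 1) else value

for a in range(1 << WIDTH):
    for b in range(1 << WIDTH):
        bits = (a + b) & MASK  # the bit pattern an add instruction produces
        # correct result under the unsigned interpretation...
        assert bits == (a + b) % (1 << WIDTH)
        # ...and under the signed interpretation (modulo wrap-around)
        assert as_signed(bits) == as_signed((as_signed(a) + as_signed(b)) & MASK)
```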

In order to reason about the IR operations, some ground work:

import z3

INTEGER_WIDTH = 64  # the JIT works with word-sized integers
TRUEBV = z3.BitVecVal(1, INTEGER_WIDTH)
FALSEBV = z3.BitVecVal(0, INTEGER_WIDTH)

solver = z3.Solver()
solver.set("timeout", 10000) # milliseconds, ie 10s
xvar = z3.BitVec('x', INTEGER_WIDTH)
constvar = z3.BitVec('const', INTEGER_WIDTH)
constvar2 = z3.BitVec('const2', INTEGER_WIDTH)

And here's a function to turn an integer IR operation of PyPy's JIT into Z3 formulas:

def z3_expression(opname, arg0, arg1=None):
    """ computes a tuple of (result, valid_if) of Z3 formulas. `result` is the
    formula representing the result of the operation, given argument formulas
    arg0 and arg1. `valid_if` is a pre-condition that must be true for the
    result to be meaningful. """
    result = None
    valid_if = True # the precondition is mostly True, with few exceptions
    if opname == "int_add":
        result = arg0 + arg1
    elif opname == "int_sub":
        result = arg0 - arg1
    elif opname == "int_mul":
        result = arg0 * arg1
    elif opname == "int_and":
        result = arg0 & arg1
    elif opname == "int_or":
        result = arg0 | arg1
    elif opname == "int_xor":
        result = arg0 ^ arg1
    elif opname == "int_eq":
        result = cond(arg0 == arg1)
    elif opname == "int_ne":
        result = cond(arg0 != arg1)
    elif opname == "int_lt":
        result = cond(arg0 < arg1)
    elif opname == "int_le":
        result = cond(arg0 <= arg1)
    elif opname == "int_gt":
        result = cond(arg0 > arg1)
    elif opname == "int_ge":
        result = cond(arg0 >= arg1)
    elif opname == "uint_lt":
        result = cond(z3.ULT(arg0, arg1))
    elif opname == "uint_le":
        result = cond(z3.ULE(arg0, arg1))
    elif opname == "uint_gt":
        result = cond(z3.UGT(arg0, arg1))
    elif opname == "uint_ge":
        result = cond(z3.UGE(arg0, arg1))
    elif opname == "int_lshift":
        result = arg0 << arg1
        valid_if = z3.And(arg1 >= 0, arg1 < INTEGER_WIDTH)
    elif opname == "int_rshift":
        result = arg0 >> arg1
        valid_if = z3.And(arg1 >= 0, arg1 < INTEGER_WIDTH)
    elif opname == "uint_rshift":
        result = z3.LShR(arg0, arg1)
        valid_if = z3.And(arg1 >= 0, arg1 < INTEGER_WIDTH)
    elif opname == "uint_mul_high":
        # zero-extend args to 2*INTEGER_WIDTH bit, then multiply and extract
        # highest INTEGER_WIDTH bits
        zarg0 = z3.ZeroExt(INTEGER_WIDTH, arg0)
        zarg1 = z3.ZeroExt(INTEGER_WIDTH, arg1)
        result = z3.Extract(INTEGER_WIDTH * 2 - 1, INTEGER_WIDTH, zarg0 * zarg1)
    elif opname == "int_pydiv":
        valid_if = arg1 != 0
        r = arg0 / arg1
        psubx = r * arg1 - arg0
        result = r + (z3.If(arg1 < 0, psubx, -psubx) >> (INTEGER_WIDTH - 1))
    elif opname == "int_pymod":
        valid_if = arg1 != 0
        r = arg0 % arg1
        result = r + (arg1 & z3.If(arg1 < 0, -r, r) >> (INTEGER_WIDTH - 1))
    elif opname == "int_is_true":
        result = cond(arg0 != FALSEBV)
    elif opname == "int_is_zero":
        result = cond(arg0 == FALSEBV)
    elif opname == "int_neg":
        result = -arg0
    elif opname == "int_invert":
        result = ~arg0
    else:
        assert 0, "unknown operation " + opname
    return result, valid_if

def cond(z3expr):
    """ helper function to turn a Z3 boolean result z3expr into a 1 or 0
    bitvector, using z3.If """
    return z3.If(z3expr, TRUEBV, FALSEBV)

We map the semantics of a PyPy JIT operation to Z3 with the z3_expression function. It takes the name of a JIT operation and its two (or one) arguments, and turns them into a pair of Z3 formulas, result and valid_if. The resulting formulas are constructed with the operator overloading of Z3 variables/formulas.

The first element, result, represents the result of performing the operation. valid_if is a condition that needs to be True in order for the result of the operation to be defined. E.g. int_pydiv(a, b) is only valid if b != 0. Most operations are always valid, so they return True as that condition (we'll ignore valid_if for a bit, but it will become more relevant further down in the post).
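As a sanity check, the branchless int_pydiv formula can be replayed in plain Python over 8-bit values and compared against Python's floor division (a sketch at a toy width; the JIT of course works with full word-sized integers):

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1

def as_signed(value):
    # reinterpret an unsigned bit pattern as two's-complement signed
    return value - (1 << WIDTH) if value >= 1 << (WIDTH - 1) else value

def trunc_div(a, b):
    # signed division truncating towards zero, like Z3's bitvector division
    q = abs(a) // abs(b)
    return -q if (a < 0) != (b < 0) else q

def pydiv(a, b):
    # mirrors the int_pydiv formula: correct truncating division to floor
    # division by subtracting 1 when the signs differ and there is a remainder
    r = trunc_div(a, b)
    psubx = r * b - a
    correction = psubx if b < 0 else -psubx
    # arithmetic right shift by WIDTH - 1 yields 0 or -1
    shifted = as_signed(correction & MASK) >> (WIDTH - 1)
    return as_signed((r + shifted) & MASK)

# compare against Python's floor division for every 8-bit pair
for a in range(-128, 128):
    for b in range(-128, 128):
        if b != 0:
            assert pydiv(a, b) == as_signed((a // b) & MASK)
```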

We can define a helper function to prove things by finding counterexamples:

def prove(cond):
    """ Try to prove a condition cond by searching for counterexamples of its negation. """
    z3res = solver.check(z3.Not(cond))
    if z3res == z3.unsat:
        return True
    elif z3res == z3.unknown: # eg on timeout
        return False
    elif z3res == z3.sat:
        return False
    assert 0, "should be unreachable"

Finding rewrite rules

Now we can start finding our first rewrite rules, following the first pattern op(x, x) -> x. We do this by iterating over all the supported binary operation names, getting the z3 expression for op(x, x) and then asking Z3 to prove op(x, x) == x.

for opname in opnames2:
    result, valid_if = z3_expression(opname, xvar, xvar)
    if prove(result == xvar):
        print(f"{opname}(x, x) -> x, {result}")

This yields the simplifications:

int_and(x, x) -> x
int_or(x, x) -> x

Synthesizing constants

Supporting the next patterns is harder: op(x, x) == c1, op(x, c1) == x, and op(c1, x) == x. We don't know which constants to pick to try to get Z3 to prove the equality. We could iterate over common constants like 0, 1, MAXINT, etc, or even over all the 256 values for a bitvector of length 8. However, we will instead ask Z3 to find the constants for us too.

This can be done by using quantifiers, in this case z3.ForAll. The query we pose to Z3 is "does there exist a constant c1 such that for all x the following is true: op(x, c1) == x?" Note that the constant c1 is not necessarily unique; there could be many of them. We generate several matching constants, and add the condition that they must differ from the previously found ones to the second and further queries.
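For intuition, the exists/forall structure of this query can be brute-forced for tiny 8-bit bitvectors in plain Python (a toy analogue of what Z3 does symbolically at full width):

```python
WIDTH = 8
N = 1 << WIDTH

def find_identity_constants(op):
    # "exists c1 such that for all x: op(x, c1) == x", by exhaustive search
    return [c for c in range(N)
            if all(op(x, c) % N == x for x in range(N))]

print(find_identity_constants(lambda x, c: x + c))  # [0]
print(find_identity_constants(lambda x, c: x & c))  # [255], i.e. the bit pattern of -1
print(find_identity_constants(lambda x, c: x * c))  # [1]
```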

We can express this in a helper function:

def find_constant(z3expr, number_of_results=5):
    condition = z3.ForAll(
        [xvar],
        z3expr,
    )
    for i in range(number_of_results):
        checkres = solver.check(condition)
        if checkres == z3.sat:
            # if a solver check succeeds, we can ask for a model, which is
            # concrete values for the variables constvar
            model = solver.model()
            const = model[constvar].as_signed_long()
            yield const
            # make sure we don't generate the same constant again on the
            # next call
            condition = z3.And(constvar != const, condition)
        else:
            # no (more) constants found
            break

We can use this new function for the three mentioned patterns:

# try to find constants for op(x, x) == c
for opname in opnames2:
    result, valid_if = z3_expression(opname, xvar, xvar)
    for const in find_constant(result == constvar):
        print(f"{opname}(x, x) -> {const}")
# try to find constants for op(x, c) == x and op(c, x) == x
for opname in opnames2:
    result, valid_if = z3_expression(opname, xvar, constvar)
    for const in find_constant(result == xvar):
        print(f"{opname}(x, {const}) -> x")
    result, valid_if = z3_expression(opname, constvar, xvar)
    for const in find_constant(result == xvar):
        print(f"{opname}({const}, x) -> x")
# this code is not quite correct, we'll correct it later

Together this yields the following new simplifications:

# careful, these are not all correct!
int_sub(x, x) -> 0
int_xor(x, x) -> 0
int_eq(x, x) -> 1
int_ne(x, x) -> 0
int_lt(x, x) -> 0
int_le(x, x) -> 1
int_gt(x, x) -> 0
int_ge(x, x) -> 1
uint_lt(x, x) -> 0
uint_le(x, x) -> 1
uint_gt(x, x) -> 0
uint_ge(x, x) -> 1
uint_rshift(x, x) -> 0
int_pymod(x, x) -> 0
int_add(x, 0) -> x
int_add(0, x) -> x
int_sub(x, 0) -> x
int_mul(x, 1) -> x
int_mul(1, x) -> x
int_and(x, -1) -> x
int_and(-1, x) -> x
int_or(x, 0) -> x
int_or(0, x) -> x
int_xor(x, 0) -> x
int_xor(0, x) -> x
int_lshift(x, 0) -> x
int_rshift(x, 0) -> x
uint_rshift(x, 0) -> x
int_pydiv(x, 1) -> x
int_pymod(x, 0) -> x

Most of these look good at first glance, but the last one reveals a problem: we've been ignoring the valid_if expression up to now. We can stop doing that by changing the code like this, which adds z3.And(valid_if, ...) to the argument of the calls to find_constant:

# try to find constants for op(x, x) == c, op(x, c) == x and op(c, x) == x
for opname in opnames2:
    result, valid_if = z3_expression(opname, xvar, xvar)
    for const in find_constant(z3.And(valid_if, result == constvar)):
        print(f"{opname}(x, x) -> {const}")
# try to find constants for op(x, c) == x and op(c, x) == x
for opname in opnames2:
    result, valid_if = z3_expression(opname, xvar, constvar)
    for const in find_constant(z3.And(result == xvar, valid_if)):
        print(f"{opname}(x, {const}) -> x")
    result, valid_if = z3_expression(opname, constvar, xvar)
    for const in find_constant(z3.And(result == xvar, valid_if)):
        print(f"{opname}({const}, x) -> x")

And we get this list instead:

int_sub(x, x) -> 0
int_xor(x, x) -> 0
int_eq(x, x) -> 1
int_ne(x, x) -> 0
int_lt(x, x) -> 0
int_le(x, x) -> 1
int_gt(x, x) -> 0
int_ge(x, x) -> 1
uint_lt(x, x) -> 0
uint_le(x, x) -> 1
uint_gt(x, x) -> 0
uint_ge(x, x) -> 1
int_add(x, 0) -> x
int_add(0, x) -> x
int_sub(x, 0) -> x
int_mul(x, 1) -> x
int_mul(1, x) -> x
int_and(x, -1) -> x
int_and(-1, x) -> x
int_or(x, 0) -> x
int_or(0, x) -> x
int_xor(x, 0) -> x
int_xor(0, x) -> x
int_lshift(x, 0) -> x
int_rshift(x, 0) -> x
uint_rshift(x, 0) -> x
int_pydiv(x, 1) -> x

Synthesizing two constants

For the patterns op(x, c1) == c2 and op(c1, x) == c2 we need to synthesize two constants. We can again write a helper method for that:

def find_2consts(z3expr, number_of_results=5):
    condition = z3.ForAll(
        [xvar],
        z3expr,
    )
    for i in range(number_of_results):
        checkres = solver.check(condition)
        if checkres == z3.sat:
            model = solver.model()
            const = model[constvar].as_signed_long()
            const2 = model[constvar2].as_signed_long()
            yield const, const2
            condition = z3.And(z3.Or(constvar != const, constvar2 != const2), condition)
        else:
            # no (more) constant pairs found
            break

And then use it like this:

for opname in opnames2:
    # try to find constants c1, c2 such that op(c1, x) -> c2
    result, valid_if = z3_expression(opname, constvar, xvar)
    consts = find_2consts(z3.And(valid_if, result == constvar2))
    for const, const2 in consts:
        print(f"{opname}({const}, x) -> {const2}")
    # try to find constants c1, c2 such that op(x, c1) -> c2
    result, valid_if = z3_expression(opname, xvar, constvar)
    consts = find_2consts(z3.And(valid_if, result == constvar2))
    for const, const2 in consts:
        print(f"{opname}(x, {const}) -> {const2}")

Which yields some straightforward simplifications:

int_mul(0, x) -> 0
int_mul(x, 0) -> 0
int_and(0, x) -> 0
int_and(x, 0) -> 0
uint_lt(x, 0) -> 0
uint_le(0, x) -> 1
uint_gt(0, x) -> 0
uint_ge(x, 0) -> 1
int_lshift(0, x) -> 0
int_rshift(0, x) -> 0
uint_rshift(0, x) -> 0
uint_mul_high(0, x) -> 0
uint_mul_high(1, x) -> 0
uint_mul_high(x, 0) -> 0
uint_mul_high(x, 1) -> 0
int_pymod(x, 1) -> 0
int_pymod(x, -1) -> 0

A few require a bit more thinking:

int_or(-1, x) -> -1
int_or(x, -1) -> -1

These are true because in two's complement, -1 has all bits set.

The following ones require recognizing that -9223372036854775808 == -2**63 is the most negative signed 64-bit integer, and 9223372036854775807 == 2 ** 63 - 1 is the most positive one:

int_lt(9223372036854775807, x) -> 0
int_lt(x, -9223372036854775808) -> 0
int_le(-9223372036854775808, x) -> 1
int_le(x, 9223372036854775807) -> 1
int_gt(-9223372036854775808, x) -> 0
int_gt(x, 9223372036854775807) -> 0
int_ge(9223372036854775807, x) -> 1
int_ge(x, -9223372036854775808) -> 1

The following ones are true because the bitpattern for -1 is the largest unsigned number:

uint_lt(-1, x) -> 0
uint_le(x, -1) -> 1
uint_gt(x, -1) -> 0
uint_ge(-1, x) -> 1
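Both of these observations are quick to confirm in plain Python:

```python
WIDTH = 64
MASK = (1 << WIDTH) - 1

MAXINT = (1 << (WIDTH - 1)) - 1   # most positive signed 64-bit integer
MININT = -(1 << (WIDTH - 1))      # most negative signed 64-bit integer
assert MAXINT == 9223372036854775807
assert MININT == -9223372036854775808

# the bit pattern of -1 is all ones, i.e. the largest unsigned 64-bit value
assert -1 & MASK == MASK
```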

Strength Reductions

All the patterns so far only had a variable or a constant as the target of the rewrite. We can also use the machinery to do strength reductions, where we generate a single-argument operation op1(x) for input operations op(x, c1) or op(c1, x). To achieve this, we try all combinations of binary and unary operations. (We won't consider strength reductions where a binary operation gets turned into a "cheaper" other binary operation here.)
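The shape of this search can again be previewed with an exhaustive 8-bit toy version in plain Python (Z3 does the real, full-width work):

```python
WIDTH = 8
N = 1 << WIDTH

def find_strength_reductions(binop, unop):
    # find constants c with binop(x, c) == unop(x) for every 8-bit x
    return [c for c in range(N)
            if all(binop(x, c) % N == unop(x) % N for x in range(N))]

# multiplying by the bit pattern of -1 is the same as negating:
print(find_strength_reductions(lambda x, c: x * c, lambda x: -x))  # [255]
# XOR with all ones is the same as bitwise inversion:
print(find_strength_reductions(lambda x, c: x ^ c, lambda x: ~x))  # [255]
```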

opnames1 = [
    "int_is_true", "int_is_zero",
    "int_neg", "int_invert",
]

for opname in opnames2:
    for opname1 in opnames1:
        result, valid_if = z3_expression(opname, xvar, constvar)
        # try to find a constant op(x, c) == g(x)
        result1, valid_if1 = z3_expression(opname1, xvar)
        consts = find_constant(z3.And(valid_if, valid_if1, result == result1))
        for const in consts:
            print(f"{opname}(x, {const}) -> {opname1}(x)")

        # try to find a constant op(c, x) == g(x)
        result, valid_if = z3_expression(opname, constvar, xvar)
        result1, valid_if1 = z3_expression(opname1, xvar)
        consts = find_constant(z3.And(valid_if, valid_if1, result == result1))
        for const in consts:
            print(f"{opname}({const}, x) -> {opname1}(x)")

Which yields the following new simplifications:

int_sub(0, x) -> int_neg(x)
int_sub(-1, x) -> int_invert(x)
int_mul(x, -1) -> int_neg(x)
int_mul(-1, x) -> int_neg(x)
int_xor(x, -1) -> int_invert(x)
int_xor(-1, x) -> int_invert(x)
int_eq(x, 0) -> int_is_zero(x)
int_eq(0, x) -> int_is_zero(x)
int_ne(x, 0) -> int_is_true(x)
int_ne(0, x) -> int_is_true(x)
uint_lt(0, x) -> int_is_true(x)
uint_lt(x, 1) -> int_is_zero(x)
uint_le(1, x) -> int_is_true(x)
uint_le(x, 0) -> int_is_zero(x)
uint_gt(x, 0) -> int_is_true(x)
uint_gt(1, x) -> int_is_zero(x)
uint_ge(x, 1) -> int_is_true(x)
uint_ge(0, x) -> int_is_zero(x)
int_pydiv(x, -1) -> int_neg(x)


With not very much code we managed to generate a whole lot of local simplifications for integer operations in the IR of PyPy's JIT. The rules discovered that way are "simple", in the sense that they only require looking at a single instruction, and not where the arguments of that instruction came from. They also don't require any knowledge about the properties of the arguments of the instructions (e.g. that they are positive).

The rewrites in this post have mostly been in PyPy's JIT already. But now we have mechanically confirmed that they are correct. I've also added the remaining useful-looking ones, in particular int_eq(x, 0) -> int_is_zero(x) etc.

If we wanted to scale this approach up, we would have to work much harder! A bunch of problems come with generalizing the approach to sequences of instructions.

In the next blog post I'll discuss an alternative approach to simply generating all possible sequences of instructions, one that tries to address these problems. It works by analyzing the real traces of benchmarks and mining those for inefficiencies, which only surfaces problems that occur in actual programs.


I've been re-reading a lot of blog posts from John's blog, but also papers. Another of my favorite blogs in the last year or two has been Philipp Zucker's blog, with lots of excellent posts about and using Z3.

July 12, 2024 07:14 PM UTC

Python Morsels

What are lists in Python?

Lists are used to store and manipulate an ordered collection of things.

Table of contents

  1. Lists are ordered collections
  2. Containment checking
  3. Length
  4. Modifying the contents of a list
  5. Indexing: looking up items by their position
  6. Lists are the first data structure to learn

Lists are ordered collections

This is a list:

>>> colors = ["purple", "green", "blue", "yellow"]

We can prove that to ourselves by passing that object to Python's built-in type function:

>>> type(colors)
<class 'list'>

Lists are ordered collections of things.

We can create a new list by using square brackets ([]), and inside those square brackets, we put each of the items that we'd like our list to contain, separated by commas:

>>> numbers = [2, 1, 3, 4, 7, 11]
>>> numbers
[2, 1, 3, 4, 7, 11]

Lists can contain any type of object. The items in a list don't all need to be of the same type, but in practice, they typically are.

So we might refer to this as a list of strings:

>>> colors = ["purple", "green", "blue", "yellow"]

While this is a list of numbers:

>>> numbers = [2, 1, 3, 4, 7, 11]

Containment checking

We can check whether a …

Read the full article:

July 12, 2024 03:09 PM UTC

Peter Bengtsson

Converting Celsius to Fahrenheit with Python

Starting at 4°C, add +12 to the Celsius and mirror the number to get the Fahrenheit number.
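Assuming "mirror the number" means reversing the two digits (with 40°C's mirror, 04, read as 104), the mnemonic can be checked against the exact conversion formula in a few lines of Python (my interpretation and code, not the author's):

```python
def celsius_to_fahrenheit(c):
    # the exact formula: F = C * 9/5 + 32
    return c * 9 / 5 + 32

# the mnemonic's chain: 4, 16, 28, 40 °C, each +12 from the last,
# with the digit-mirrored value as the approximate Fahrenheit reading
for c, mirrored in [(4, 40), (16, 61), (28, 82), (40, 104)]:
    assert abs(celsius_to_fahrenheit(c) - mirrored) < 1.0
```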

July 12, 2024 03:08 PM UTC

Python Software Foundation

Announcing Our New PyPI Support Specialist!

We are thrilled to announce that our first-ever search for a dedicated PyPI Support Specialist has concluded with the hire of Maria Ashna, the newest member of the Python Software Foundation (PSF) staff. Reporting to Ee Durbin, Director of Infrastructure, Maria joins us from a background in academic research, technical consulting, and theatre.

Maria will help the PSF to support one of our most critical services, the Python Package Index (PyPI). Over the past 23 years, PyPI has seen essentially exponential growth in traffic and users, relying for the most part on volunteers to support it. With the addition of requirements to keep all Python maintainers and users safe, our support load has outstripped our support resources for some time now. The Python Software Foundation committed to hiring to increase this capacity in April, and we're excited to have Maria on board to begin providing crucially needed support.

From Maria, “I am a firm believer in democratizing tech. The Open Source community is the lifeblood of such democratization, which is why I am excited to be part of PSF and to serve this community.”

As you see Maria around the PyPI support inbox and issue tracker in the future, we hope that you'll extend a warm welcome! We're eager to get her up and running to reduce the stress that users have been experiencing around PyPI support, and to further our work to improve and extend PyPI sustainably.

July 12, 2024 01:12 PM UTC

Real Python

The Real Python Podcast – Episode #212: Digging Into Graph Theory in Python With David Amos

Have you wondered about graph theory and how to start exploring it in Python? What resources and Python libraries can you use to experiment and learn more? This week on the show, former co-host David Amos returns to talk about what he's been up to and share his knowledge about graph theory in Python.

July 12, 2024 12:00 PM UTC

Talk Python to Me

#470: Python in Medicine and Patient Care

Python is special. It's used by the big tech companies but also by those you would rarely classify as developers. On this episode, we get a look inside how Python is being used at a Children's Hospital to speed and improve patient care. We have Dr. Somak Roy here to share how he's using Python in his day-to-day job to help kids get well a little bit faster.

Episode sponsors: Sentry Error Monitoring (code TALKPYTHON), Posit, Talk Python Courses

Links from the show: Somak Roy; Cincinnati Children's Hospital; CNVkit: Genome-wide copy number; cnaplotr; hgvs; openpyxl; Hera, an Argo Python SDK; insiM: in silico Mutator software for bioinformatics; Bamsurgeon; pysam, an interface for reading and writing SAM files; "Scientists rename human genes to stop Microsoft Excel from misreading them as dates"; BioPython; episode transcripts and a YouTube recording of the episode.

July 12, 2024 08:00 AM UTC

Matt Layman

Trial Banner Inclusion Tag - Building SaaS #195

In this episode, we worked on a trial banner that could persist across all pages on the site. Because the banner needed data that was only available on the index page, we had to refactor the banner into an inclusion template tag to make the tag work consistently.

July 12, 2024 12:00 AM UTC

Quansight Labs Blog

Free-threaded CPython is ready to experiment with!

An overview of the ongoing efforts to improve and roll out support for free-threaded CPython throughout the Python open source ecosystem

July 12, 2024 12:00 AM UTC

July 11, 2024

Python Software Foundation

Announcing Our New Infrastructure Engineer

We are excited to announce that Jacob Coffee has joined the Python Software Foundation staff as an Infrastructure Engineer bringing his experience as an Open Source maintainer, dedicated homelab maintainer, and professional systems administrator to the team. Jacob will be the second member of our Infrastructure staff, reporting to Director of Infrastructure, Ee Durbin.

Joining our team, Jacob will share the responsibility of maintaining the PSF systems and services that serve the Python community, CPython development, and our internal operations. This will add crucially needed redundancy to the team as well as capacity to undertake new initiatives with our infrastructure.

Jacob shares, “I’m living the dream by supporting the PSF mission AND working in open source! I’m thrilled to be a part of the PSF team and deepen my contributions to the Python community.”

In just the first few days, Jacob has already shown initiative on multiple projects and issues throughout the infrastructure and we’re excited to see the impact he’ll have on the PSF and broader Python community. We hope that you’ll wish him a warm welcome as you see him across the repos, issue trackers, mailing lists, and discussion forums!

July 11, 2024 02:34 PM UTC

Nicola Iarocci

Microsoft MVP

Last night, I was at an outdoor theatre with Serena, watching Anatomy of a Fall (an excellent film). Outdoor theatres are becoming rare, which is a pity, and Arena del Sole is lovely with its strong vintage, 80s vibe. There’s little as pleasant as watching a film under the stars with your loved one on a quiet summer evening.

Anyway, during the intermission, I glanced at my e-mails and discovered I had again been granted the Microsoft MVP Award. It is the ninth consecutive year, and I'm grateful and happy the journey continues. At this point, I should put in some extra effort to reach the 10-year milestone next year.

July 11, 2024 01:11 PM UTC

Real Python

Quiz: Build a Blog Using Django, GraphQL, and Vue

In this quiz, you’ll test your understanding of building a Django blog back end and a Vue front end, using GraphQL to communicate between them.

You’ll revisit how to run the Django server and a Vue application on your computer at the same time.

July 11, 2024 12:00 PM UTC

Robin Wilson

Searching an aerial photo with text queries – a demo and how it works

Summary: I’ve created a demo web app where you can search an aerial photo of Southampton, UK using text queries such as "roundabout", "tennis court" or "ship". It uses vector embeddings to do this – which I explain in this blog post.

In this post I’m going to try and explain a bit more about how this works.

Firstly, I should explain that the only data used for the searching is the aerial image data itself. Even though a number of these things will be shown on the OpenStreetMap map, none of that data is used, so you can also search for things that wouldn't be shown on a map (like a blue bus).

The main technique that lets us do this is vector embeddings. I strongly suggest you read Simon Willison’s great article/talk on embeddings but I’ll try and explain here too. An embedding model lets you turn a piece of data (for example, some text, or an image) into a constant-length vector – basically just a sequence of numbers. This vector would look something like [0.283, -0.825, -0.481, 0.153, ...] and would be the same length (often hundreds or even thousands of elements long) regardless of how long the data you fed into it was.

In this case, I’m using the SkyCLIP model which produces vectors that are 768 elements long. One of the key features of these vectors is that the model is trained to produce similar vectors for things that are similar in some way. For example, a text embedding model may produce a similar vector for the words "King" and "Queen", or "iPad" and "tablet". The ‘closer’ a vector is to another vector, the more similar the data that produced it.

The SkyCLIP model was trained on image-text pairs – so a load of images that had associated text describing what was in the image. SkyCLIP’s training data "contains 5.2 million remote sensing image-text pairs in total, covering more than 29K distinct semantic tags" – and these semantic tags and the text descriptions of them were generated from OpenStreetMap data.

Once we’ve got the vectors, how do we work out how close vectors are? Well, we can treat the vectors as encoding a point in 768-dimensional space. That’s a bit difficult to visualise – so imagine a point in 2- or 3-dimensional space as that’s easier, plotted on a graph. Vectors for similar things will be located physically closer on the graph – and one way of calculating similarity between two vectors is just to measure the multi-dimensional distance on a graph. In this situation we’re actually using cosine similarity, which gives a number between -1 and +1 representing the similarity of two vectors.
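As a small illustration (my sketch, not code from the original post), cosine similarity is just the dot product of two vectors divided by the product of their lengths:

```python
import numpy as np

def cosine_similarity(a, b):
    """Dot product divided by the product of the vector lengths, in [-1, 1]."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

a = np.array([1.0, 0.0])
b = np.array([1.0, 0.0])
c = np.array([-1.0, 0.0])

print(cosine_similarity(a, b))  # 1.0: identical direction
print(cosine_similarity(a, c))  # -1.0: opposite direction
```

Vectors pointing the same way score +1, opposite ways score -1, and unrelated (orthogonal) vectors score around 0.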

So, we now have a way to calculate an embedding vector for any piece of data. The next step we take is to split the aerial image into lots of little chunks – we call them ‘image chips’ – and calculate the embedding of each of those chunks, and then compare them to the embedding calculated from the text query.

I used the RasterVision library for this, and I’ll show you a bit of the code. First, we generate a sliding window dataset, which will allow us to then iterate over image chips. We define the size of the image chip to be 200×200 pixels, with a ‘stride’ of 100 pixels which means each image chip will overlap the ones on each side by 100 pixels. We then configure it to resize the output to 224×224 pixels, which is the size that the SkyCLIP model expects as input.

# Reconstructed sketch -- the exact arguments depend on your RasterVision
# version; image_uri is a placeholder for the path to the aerial image.
ds = SemanticSegmentationSlidingWindowGeoDataset.from_uris(
    image_uri=image_uri,
    image_raster_source_kw=dict(channel_order=[0, 1, 2]),
    size=200,      # 200x200 pixel chips...
    stride=100,    # ...overlapping their neighbours by 100 pixels
    out_size=224,  # resized to the 224x224 input SkyCLIP expects
)

We then iterate over all of the image chips, run the model to calculate the embedding and stick it into a big array:

dl = DataLoader(ds, batch_size=24)

embs = torch.zeros(len(ds), EMBEDDING_DIM_SIZE)

with torch.inference_mode(), tqdm(dl, desc='Creating chip embeddings') as bar:
    i = 0
    for x, _ in bar:
        x = x.to(device)  # reconstructed line: move the batch to the model's device (name assumed)
        emb = model.encode_image(x)
        embs[i:i + len(x)] = emb.cpu()
        i += len(x)

# normalize the embeddings
embs /= embs.norm(dim=-1, keepdim=True)


We also do a fair amount of fiddling around to get the locations of each chip and store those too.

Once we’ve stored all of those (I’ll get on to storage in a moment), we need to calculate the embedding of the text query too – which can be done with code like this:

text = tokenizer(text_queries)
with torch.inference_mode():
    text_features = model.encode_text(text)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    text_features = text_features.cpu()

It’s then ‘just’ a matter of comparing the text query embedding to the embeddings of all of the image chips, and finding the ones that are closest to each other.
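Before reaching for a database, the comparison itself can be sketched as a brute-force NumPy search. The random vectors below are hypothetical stand-ins for the real SkyCLIP embeddings:

```python
import numpy as np

# Hypothetical stand-ins for the real data: 1,000 chip embeddings,
# all 768-dimensional and L2-normalised. (The real vectors would
# come from SkyCLIP, not a random generator.)
rng = np.random.default_rng(0)
chip_embs = rng.normal(size=(1000, 768))
chip_embs /= np.linalg.norm(chip_embs, axis=1, keepdims=True)
query = chip_embs[42]  # pretend the text query matches chip 42 exactly

# For normalised vectors, cosine similarity reduces to a dot product,
# so one matrix-vector product scores every chip at once.
scores = chip_embs @ query
top5 = np.argsort(scores)[::-1][:5]  # indices of the 5 closest chips
print(top5[0])  # -> 42
```

A vector database does essentially this, plus indexing tricks so it doesn’t have to scan every vector on every query.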

To do this, we can use a vector database. There are loads of different vector databases to choose from, but I’d recently been to a tutorial at PyData Southampton (I’m one of the co-organisers, and I strongly recommend attending if you’re in the area) which used the Pinecone serverless vector database, and they have a fairly generous free tier, so I thought I’d try that.

Pinecone, like all other vector databases, allows you to insert a load of vectors and their metadata (in this case, their location in the image) into the database, and then search the database to find the vectors closest to a ‘search vector’ you provide.

I won’t bother showing you all the code for this side of things: it’s fairly standard code for calling Pinecone APIs, mostly copied from their tutorials.

I then wrapped this all up in a FastAPI API, and put a simple Javascript front-end on it to display the results on a Leaflet web map. I also added some basic caching to stop us hitting the Pinecone API too frequently (as there is a limit to the number of API calls you can make on the free plan). And that’s pretty much it.

I hope the explanation made sense: have a play with the app here and post a comment with any questions.

July 11, 2024 09:35 AM UTC

July 10, 2024

Real Python

How Do You Choose Python Function Names?

One of the hardest decisions in programming is choosing names. Programmers often use this phrase to highlight the challenges of selecting Python function names. It may be an exaggeration, but there’s still a lot of truth in it.

There are some hard rules you can’t break when naming Python functions and other objects. There are also other conventions and best practices that don’t raise errors when you break them, but they’re still important when writing Pythonic code.

Choosing the ideal Python function names makes your code more readable and easier to maintain. Code with well-chosen names can also be less prone to bugs.

In this tutorial, you’ll learn about the rules and conventions for naming Python functions and why they’re important. So, how do you choose Python function names?

Get Your Code: Click here to download the free sample code that you’ll use as you learn how to choose Python function names.

In Short: Use Descriptive Python Function Names Using snake_case

In Python, the labels you use to refer to objects are called identifiers or names. You set a name for a Python function when you use the def keyword.

When creating Python names, you can use uppercase and lowercase letters, the digits 0 to 9, and the underscore (_). However, you can’t use digits as the first character. You can use some other Unicode characters in Python identifiers, but not all Unicode characters are valid. Not even 🐍 is valid!

Still, it’s preferable to use only the Latin characters present in ASCII. The Latin characters are easier to type and more universally found on most keyboards. Using other characters rarely improves readability and can be a source of bugs.

Here are some syntactically valid and invalid names for Python functions and other objects:

Name Validity Notes
number Valid
first_name Valid
first name Invalid No whitespace allowed
first_10_numbers Valid
10_numbers Invalid No digits allowed at the start of names
_name Valid
greeting! Invalid No ASCII punctuation allowed except for the underscore (_)
café Valid Not recommended
你好 Valid Not recommended
hello⁀world Valid Not recommended—connector punctuation characters and other marks are valid characters
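As an illustrative aside (not part of the original table), Python itself can check most of these rules for you, using str.isidentifier() and the keyword module:

```python
import keyword

def is_valid_name(name):
    """A usable Python name is a valid identifier that isn't a keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)

print(is_valid_name("first_name"))  # True
print(is_valid_name("first name"))  # False: whitespace
print(is_valid_name("10_numbers"))  # False: starts with a digit
print(is_valid_name("import"))      # False: keyword
print(is_valid_name("café"))        # True, though not recommended
```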

However, Python has conventions about naming functions that go beyond these rules. One of the core Python Enhancement Proposals, PEP 8, defines Python’s style guide, which includes naming conventions.

According to PEP 8 style guidelines, Python functions should be named using lowercase letters and with an underscore separating words. This style is often referred to as snake case. For example, get_text() is a better function name than getText() in Python.

Function names should also describe the actions being performed by the function clearly and concisely whenever possible. For example, for a function that calculates the total value of an online order, calculate_total() is a better name than total().

You’ll explore these conventions and best practices in more detail in the following sections of this tutorial.

What Case Should You Use for Python Function Names?

Several character cases, like snake case and camel case, are used in programming for identifiers to name the various entities. Programming languages have their own preferences, so the right style for one language may not be suitable for another.

Python functions are generally written in snake case. When you use this format, all the letters are lowercase, including the first letter, and you use an underscore to separate words. You don’t need to use an underscore if the function name includes only one word. The following function names are examples of snake case:

  • find_winner()
  • save()

Both function names include lowercase letters, and one of them has two English words separated by an underscore. You can also use the underscore at the beginning or end of a function name. However, there are conventions outlining when you should use the underscore in this way.

You can use a single leading underscore, such as with _find_winner(), to indicate that a function is meant only for internal use. An object with a leading single underscore in its name can be used internally within a module or a class. While Python doesn’t enforce private variables or functions, a leading underscore is an accepted convention to show the programmer’s intent.

A single trailing underscore is used by convention when you want to avoid a conflict with existing Python names or keywords. For example, you can’t use the name import for a function since import is a keyword. You can’t use keywords as names for functions or other objects. You can choose a different name, but you can also add a trailing underscore to create import_(), which is a valid name.

You can also use a single trailing underscore if you wish to reuse the name of a built-in function or other object. For example, if you want to define a function that you’d like to call max, you can name your function max_() to avoid conflict with the built-in function max().

Unlike the case with the keyword import, max() is not a keyword but a built-in function. Therefore, you could define your function using the same name, max(), but it’s generally preferable to avoid this approach to prevent confusion and ensure you can still use the built-in function.
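Here’s a minimal sketch of the trailing-underscore convention in action (the delegation to the built-in is just for the demo):

```python
def max_(iterable):
    """Trailing underscore avoids shadowing the built-in max()."""
    # A real implementation would do something custom; for the demo,
    # just delegate to the built-in, which remains available.
    return max(iterable)

print(max_([3, 1, 2]))  # 3
print(max([3, 1, 2]))   # 3: the built-in is untouched
```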

Double leading underscores are also used for attributes in classes. This notation invokes name mangling, which makes it harder for a user to access an attribute and prevents subclasses from accessing it. You’ll read more about name mangling and attributes with double leading underscores later.

Read the full article at »

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

July 10, 2024 02:00 PM UTC

Quiz: Choosing the Best Font for Programming

In this quiz, you’ll test your understanding of how to choose the best font for your daily programming. You’ll get questions about the technicalities and features to consider when choosing a programming font and refresh your knowledge about how to spot a high-quality coding font.

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

July 10, 2024 12:00 PM UTC

July 09, 2024

PyCoder’s Weekly

Issue #637 (July 9, 2024)

#637 – JULY 9, 2024
View in Browser »

The PyCoder’s Weekly Logo

Python Grapples With Apple App Store Rejections

A string that is part of the urllib parser module in Python references a scheme for apps that use the iTunes feature to install other apps, which is disallowed. Auto scanning by Apple is rejecting any app that uses Python 3.12 underneath. A solution has been proposed for Python 3.13.

Python’s Built-in Functions: A Complete Exploration

In this tutorial, you’ll learn the basics of working with Python’s numerous built-in functions. You’ll explore how you can use these predefined functions to perform common tasks and operations, such as mathematical calculations, data type conversions, and string manipulations.

Python Error and Performance Monitoring That Doesn’t Suck


With Sentry, you can trace issues from the frontend to the backend—detecting slow and broken code, to fix what’s broken faster. Installing the Python SDK is super easy and PyCoder’s Weekly subscribers get three full months of the team plan. Just use code “pycoder” on signup →
SENTRY sponsor

Constraint Programming Using CP-SAT and Python

Constraint programming is the process of looking for solutions based on a series of restrictions, like employees over 18 who have worked the cash register before. This article introduces the concept and shows you how to use open source libraries to write constraint solving code.

Register for PyOhio, July 27-28

PYOHIO.ORG • Shared by Anurag Saxena

Psycopg 3.2 Released


Polars 1.0 Released


Quiz: Python’s Magic Methods



Any Web Devs Successfully Pivoted to AI/ML Development?


Articles & Tutorials

A Search Engine for Python Packages

Finding the right Python package on PyPI can be a bit difficult, since PyPI isn’t really designed for discovering packages easily. pypiscout.com solves this problem by allowing you to search using descriptions like “A package that makes nice plots and visualizations.”
PYPISCOUT.COM • Shared by Florian Maas

Programming Advice I’d Give Myself 15 Years Ago

Marcus writes in depth about things he has learned in his coding career and wishes he knew earlier in his journey. Thoughts include fixing foot guns, understanding the pace-quality trade-off, sharpening your axe, and more. Associated HN Discussion.

Keeping Things in Sync: Derive vs Test

Don’t Repeat Yourself (DRY) is generally a good coding philosophy, but it shouldn’t be adhered to blindly. There are other alternatives, like using tests to make sure that duplication stays in sync. This article outlines the why and how of just that.

8 Versions of UUID and When to Use Them

RFC 9562 outlines the structure of Universally Unique IDentifiers (UUIDs) and includes eight different versions. In this post, Nicole gives a quick intro to each kind so you don’t have to read the docs, and explains why you might choose each.

Defining Python Constants for Code Maintainability

In this video course, you’ll learn how to properly define constants in Python. By coding a bunch of practical example, you’ll also learn how Python constants can improve your code’s readability, reusability, and maintainability.

Django: Test for Pending Migrations

The makemigrations --check command tells you if there are missing migrations in your Django project, but you have to remember to run it. Adam suggests calling it from a test so it gets triggered as part of your CI/CD process.

How to Maximize Your Experience at EuroPython 2024

Conferences can be overwhelming, with lots going on and lots of choices. This post talks about how to get the best experience at EuroPython, or any conference.

Polars vs. pandas: What’s the Difference?

Explore the key distinctions between Polars and Pandas, two data manipulation tools. Discover which framework suits your data processing needs best.

An Overview of the Sparse Array Ecosystem for Python

An overview of the different options available for working with sparse arrays in Python.

Projects & Code

Get Space Weather Data

GITHUB.COM/BEN-N93 • Shared by Ben Nour

AI to Drop Hats


amphi-etl: Low-Code ETL for Data


aurora: Static Site Generator Implemented in Python


pytest-edit: pytest --edit to Open Failing Test Code



Weekly Real Python Office Hours Q&A (Virtual)

July 10, 2024

PyCon Nigeria 2024

July 10 to July 14, 2024

PyData Eindhoven 2024

July 11 to July 12, 2024

Python Atlanta

July 11 to July 12, 2024

PyDelhi User Group Meetup

July 13, 2024

DFW Pythoneers 2nd Saturday Teaching Meeting

July 13, 2024

Happy Pythoning!
This was PyCoder’s Weekly Issue #637.
View in Browser »


[ Subscribe to 🐍 PyCoder’s Weekly 💌 – Get the best Python news, articles, and tutorials delivered to your inbox once a week >> Click here to learn more ]

July 09, 2024 07:30 PM UTC

Python Engineering at Microsoft

Python in Visual Studio Code – July 2024 Release

We’re excited to announce the July 2024 release of the Python and Jupyter extensions for Visual Studio Code!

This release includes the following announcements:

If you’re interested, you can check the full list of improvements in our changelogs for the Python, Jupyter and Pylance extensions.

Enhanced environment discovery with python-environment-tools

We are excited to introduce a new tool, python-environment-tools, designed to significantly enhance the speed of detecting global Python installations and Python virtual environments.

This tool leverages Rust to ensure a rapid and accurate discovery process. It also minimizes the number of Input/Output operations by collecting all necessary environment information at once, significantly enhancing the overall performance.

We are currently testing this new feature in the Python extension, running it in parallel with the existing support, to evaluate the new discovery performance. Consequently, you will see a new logging channel called Python Locator that shows the discovery times with this new tool.

Python Locator output channel.

This enhancement is part of our ongoing efforts to optimize the performance and efficiency of Python support in VS Code. Visit the python-environment-tools repo to learn more about this feature, ongoing work, and provide feedback!

Improved support for reStructuredText docstrings with Pylance

Pylance has improved support for rendering reStructuredText documentation strings (docstrings) on hover! reStructuredText (RST) is a popular format for documentation, and its syntax is sometimes used for the docstrings of Python packages.

This feature is in its early stages and is currently behind an experimental flag as we work to ensure it handles various Sphinx, Google Doc, and Epytext scenarios effectively. To try it out, you can enable the experimental setting python.analysis.supportRestructuredText.

Docstring displayed when hovering over the panel module.

Common packages where you might observe this change in their docstrings include pandas and scipy. Try this change out, and report any issues or feedback at the Pylance GitHub repository.

Note: This setting is currently experimental, but will likely be enabled by default in the future as it becomes more stabilized.

Community contributed Pixi support

Thanks to @baszalmstra, there is now support for Pixi environment detection in the Python extension! This work added a locator to detect Pixi environments in your workspace similar to other common environments such as Conda. Furthermore, if a Pixi environment is detected in your workspace, the environment will automatically be selected as your default environment.

We appreciate and look forward to continued collaboration with community members on bug fixes and enhancements to improve the Python experience!

Other Changes and Enhancements

We have also added small enhancements and fixed issues requested by users that should improve your experience working with Python and Jupyter Notebooks in Visual Studio Code. Some notable changes include:

We would also like to extend special thanks to this month’s contributors:

Call for Community Feedback

As we are planning and prioritizing future work, we value your feedback! Below are a few issues we would love feedback on:

Try out these new improvements by downloading the Python extension and the Jupyter extension from the Marketplace, or install them directly from the extensions view in Visual Studio Code (Ctrl + Shift + X or ⌘ + ⇧ + X). You can learn more about Python support in Visual Studio Code in the documentation. If you run into any problems or have suggestions, please file an issue on the Python VS Code GitHub page.

The post Python in Visual Studio Code – July 2024 Release appeared first on Python.

July 09, 2024 03:45 PM UTC

Real Python

Customize VS Code Settings

Visual Studio Code is an open-source code editor available on all platforms. It’s also a great platform for Python development. The default settings in VS Code present a somewhat cluttered environment.

This Code Conversation with instructor Philipp Acsany is about learning how to customize the settings within the interface of VS Code. Having a clean digital workspace is an important part of your work life. Removing distractions and making code more readable can increase productivity and even help you spot bugs.

In this Code Conversation, you’ll learn how to:

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

July 09, 2024 02:00 PM UTC

Django Weblog

Django security releases issued: 5.0.7 and 4.2.14

In accordance with our security release policy, the Django team is issuing releases for Django 5.0.7 and Django 4.2.14. These releases address the security issues detailed below. We encourage all users of Django to upgrade as soon as possible.

CVE-2024-38875: Potential denial-of-service in django.utils.html.urlize()

urlize() and urlizetrunc() were subject to a potential denial-of-service attack via certain inputs with a very large number of brackets.

Thanks to Elias Myllymäki for the report.

This issue has severity "moderate" according to the Django security policy.

CVE-2024-39329: Username enumeration through timing difference for users with unusable passwords

The django.contrib.auth.backends.ModelBackend.authenticate() method allowed remote attackers to enumerate users via a timing attack involving login requests for users with unusable passwords.

This issue has severity "low" according to the Django security policy.

CVE-2024-39330: Potential directory-traversal in django.core.files.storage.Storage.save()

Derived classes of the django.core.files.storage.Storage base class that override generate_filename() without replicating the file path validations existing in the parent class allowed for potential directory traversal via certain inputs when calling save().

Built-in Storage sub-classes were not affected by this vulnerability.

Thanks to Josh Schneier for the report.

This issue has severity "low" according to the Django security policy.

CVE-2024-39614: Potential denial-of-service in django.utils.translation.get_supported_language_variant()

get_supported_language_variant() was subject to a potential denial-of-service attack when used with very long strings containing specific characters.

To mitigate this vulnerability, the language code provided to get_supported_language_variant() is now parsed up to a maximum length of 500 characters.

Thanks to MProgrammer for the report.

This issue has severity "moderate" according to the Django security policy.

Affected supported versions

  • Django main branch
  • Django 5.1 (currently at beta status)
  • Django 5.0
  • Django 4.2


Patches to resolve the issue have been applied to Django's main, 5.1, 5.0, and 4.2 branches. The patches may be obtained from the following changesets.

CVE-2024-38875: Potential denial-of-service in django.utils.html.urlize()

CVE-2024-39329: Username enumeration through timing difference for users with unusable passwords

CVE-2024-39330: Potential directory-traversal in django.core.files.storage.Storage.save()

CVE-2024-39614: Potential denial-of-service in django.utils.translation.get_supported_language_variant()

The following releases have been issued

The PGP key ID used for this release is Natalia Bidart: 2EE82A8D9470983E

General notes regarding security reporting

As always, we ask that potential security issues be reported via private email to security@djangoproject.com, and not via Django's Trac instance, nor via the Django Forum, nor via the django-developers list. Please see our security policies for further information.

July 09, 2024 02:00 PM UTC

Real Python

Quiz: Split Your Dataset With scikit-learn's train_test_split()

In this quiz, you’ll test your understanding of how to use train_test_split() from the sklearn library.

By working through this quiz, you’ll revisit why you need to split your dataset in supervised machine learning, which subsets of the dataset you need for an unbiased evaluation of your model, how to use train_test_split() to split your data, and how to combine train_test_split() with prediction methods.

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

July 09, 2024 12:00 PM UTC

Python Bytes

#391 A weak episode

Topics covered in this episode:

  • Vendorize packages from PyPI
  • A Guide to Python's Weak References Using weakref Module
  • Making Time Speak
  • How Should You Test Your Machine Learning Project? A Beginner’s Guide
  • Extras
  • Joke

Watch on YouTube

About the show

Sponsored by Code Comments, an original podcast from RedHat.

Connect with the hosts: Michael, Brian, and the show.

Join us on YouTube to be part of the audience. Usually Tuesdays at 10am PT. Older video versions available there too.

Finally, if you want an artisanal, hand-crafted digest of every week of the show notes in email form, add your name and email to our friends of the show list; we'll never share it.

Michael #1: Vendorize packages from PyPI

  • Allows pure-Python dependencies to be vendorized: that is, the Python source of the dependency is copied into your own package.
  • Best used for small, pure-Python dependencies.

Brian #2: A Guide to Python's Weak References Using weakref Module

  • Martin Heinz
  • Very cool discussion of weakref
  • Quick garbage collection intro, and how references and weak references are used.
  • Using weak references to build data structures, with an example of two kinds of trees.
  • Implementing the Observer pattern
  • How logging and OrderedDict use weak references

Michael #3: Making Time Speak

  • by Prayson, a former guest and friend of the show
  • Translating time into human-friendly spoken expressions, for example: clock("11:15") # 'quarter past eleven'
  • Features:
      • Convert time into spoken expressions in various languages.
      • Easy-to-use API with a simple and intuitive design.
      • Pure Python implementation with no external dependencies.
      • Extensible architecture for adding support for additional languages using the plugin design pattern.

Brian #4: How Should You Test Your Machine Learning Project? A Beginner’s Guide

  • François Porcher
  • Using pytest and pytest-cov for testing machine learning projects
  • Lots of pieces can and should be tested just as normal functions, for example a clean_text(text: str) -> str function.
  • Test larger chunks with canned input and expected output, for example test_tokenize_text().
  • Using fixtures for larger reusable components in testing, for example a bert_tokenizer() fixture with pretrained data.
  • Checking coverage

Extras

Michael:

  • Twilio Authy Hack
      • Google Authenticator is the only option? Really?
      • Bitwarden to the rescue
      • Requires (?) an update to their app, whose release notes (v26.1.0) only say “Bug fixes”
  • Introducing Docs in Proton Drive
      • This is what I called on Mozilla to do in “Unsolicited Advice for Mozilla and Firefox”. But Proton got there first.
  • Early bird ending for Code in a Castle course

Joke: I Lied
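As a quick illustrative aside (my sketch, not code from the show notes), the core weakref behaviour from Brian's second topic looks like this:

```python
import weakref

class Config:
    """A stand-in object; most user-defined classes support weak refs."""

cfg = Config()
ref = weakref.ref(cfg)  # a weak reference does not keep cfg alive
print(ref() is cfg)     # True: the object is still around

del cfg                 # drop the only strong reference
print(ref())            # None on CPython: the object was collected
```

On CPython, reference counting reclaims the object as soon as the last strong reference goes; on other implementations the weak reference may only die at the next garbage collection.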

July 09, 2024 08:00 AM UTC

July 08, 2024

Real Python

Python News Roundup: July 2024

Summer isn’t all holidays and lazy days at the beach. Over the last month, two important players in the data science ecosystem released new major versions. NumPy published version 2.0, which comes with several improvements but also some breaking changes. At the same time, Polars reached its version 1.0 milestone and is now considered production-ready.

PyCon US was hosted in Pittsburgh, Pennsylvania in May. The conference is an important meeting spot for the community and sparked some new ideas and discussions. You can read about some of these in PSF’s coverage of the Python Language Summit, and watch some of the videos posted from the conference.

Dive in to learn more about the most important Python news from the last month.

NumPy Version 2.0

NumPy is a foundational package in the data science space. The library provides in-memory N-dimensional arrays and many functions for fast operations on those arrays.

Many libraries in the ecosystem use NumPy under the hood, including pandas, SciPy, and scikit-learn. The NumPy package has been around for close to twenty years and has played an important role in the rising popularity of Python among data scientists.

The new version 2.0 of NumPy is an important milestone, which adds an improved string type, cleans up the library, and improves performance. However, it comes with some changes that may affect your code.

The biggest breaking changes happen in the C-API of NumPy. Typically, this won’t affect you directly, but it can affect other libraries that you rely on. The community has rallied strongly and most of the bigger packages already support NumPy 2.0. You can check NumPy’s table of ecosystem support for details.

One of the main reasons for using NumPy is that the library can do fast and convenient array operations. For a simple example, the following code calculates square numbers:

>>> numbers = range(10)
>>> [number**2 for number in numbers]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

>>> import numpy as np
>>> numbers = np.arange(10)
>>> numbers**2
array([ 0,  1,  4,  9, 16, 25, 36, 49, 64, 81])

First, you use range() and a list comprehension to calculate the first ten square numbers in pure Python. Then, you repeat the calculation with NumPy. Note that you don’t need to explicitly spell out the loop. NumPy handles that for you under the hood.

Furthermore, the NumPy version will be considerably faster, especially for bigger arrays of numbers. One of the secrets to this speed is that NumPy arrays are limited to having one data type, while a Python list can be heterogeneous. One list can contain elements as different as integers, floats, strings, and even nested lists. That’s not possible in a NumPy array.
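A quick illustration of that one-data-type rule (my sketch, not from the original article):

```python
import numpy as np

mixed = [1, 2.5, "three"]     # a Python list can be heterogeneous
print(type(mixed[0]).__name__, type(mixed[2]).__name__)  # int str

arr = np.array([1, 2.5])      # NumPy settles on one common dtype...
print(arr.dtype)              # float64: the integer was upcast

arr = np.array([1, "three"])  # ...even if that means strings
print(arr.dtype.kind)         # 'U': everything became Unicode text
```

Because every element shares one fixed-size dtype, NumPy can compute element offsets directly and loop over the data in compiled code.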

Improved String Handling

By enforcing all elements to be of the same type that take up the same number of bytes in memory, NumPy can quickly find and work with individual elements. One downside to this has been that strings can be awkward to work with:

>>> words = np.array(["numpy", "python"])
>>> words
array(['numpy', 'python'], dtype='<U6')

>>> words[1] = "monty python"
>>> words
array(['numpy', 'monty '], dtype='<U6')

You first create an array consisting of two strings. Note that NumPy automatically detects that the longest string is six characters long, so it sets aside space for each string to be six characters long. The 6 in the data type string, <U6, indicates this.

Next, you try to replace the second string with a longer string. Unfortunately, only the first six characters are stored since that’s how much space NumPy has set aside for each string in this array. There are ways to work around these limitations, but in NumPy 2.0, you can take advantage of variable length strings instead:

Read the full article at »


July 08, 2024 02:00 PM UTC

Anwesha Das

Euro Python 2024

It is July, and it is time for Euro Python. 2024 is my first Euro Python, and some busy days are on the way. Like every other conference, I have my diary, and the conference days are full of various activities.


Day 0 of the main conference

After a long time, I will give a legal talk. We are going to dig into some basics of Intellectual Property. What is it? Why do we need it? What are the different kinds of intellectual property? It is a legal talk designed for developers, so anyone and everyone from the community, even without prior legal knowledge, can understand the content and use it to understand their fundamental rights and duties as developers. The talk, Intellectual Property 101, is scheduled at 11:35 hrs.

Day 1 of the main conference

Day 1 is PyLadies Day, a day dedicated to PyLadies. We have crafted the day with several different kinds of events. The day opens with a self-defense workshop at 10:30 hrs. PyLadies, throughout the world, aims to provide and foster a safe space for women and friends in the Python community. This workshop is an extension of that goal. We will learn how to deal with challenging, inappropriate behavior in the community, at work, or in any social space. A trained psychologist will guide the session. This workshop is as important today as it was yesterday and may be in the future (at least until the enforcement of the CoC is clear). I am so looking forward to it. Thank you, Mia, Lias, and all the PyLadies for organizing this and giving shape to my long-cherished dream.

Then we have my favorite part of the conference, the PyLadies Lunch. I crafted the afternoon with a short introduction session, a shout-out session, food, fun, laughter, and friends.

After the PyLadies Lunch, I have my only non-PyLadies session, which is a panel discussion on Open Source Sustainability. We will discuss the different aspects of sustainability in the open source space and community.

Again, it is PyLadies' time. Here, we have two sessions.

The first is IAmRemarkable, a workshop to empower you by celebrating your achievements and fighting impostor syndrome. It will help you celebrate your accomplishments and improve your self-promotion skills.

The second session is a 1:1 mentoring event, Meet & Greet with PyLadies. Here, willing PyLadies can mentor and be mentored, receiving coaching on different subjects: programming, learning, jobs and careers, and more.

Birds of feather session on Release Management of Open Source projects

It is an open discussion about release management across the Open Source ecosystem. The discussion includes everything from community-led projects to projects maintained or initiated by a big enterprise, and from projects maintained by a single contributor to projects with several hundred contributors. What are the different methods we follow regarding versioning, release cadence, and the process itself? Do most of us follow manual processes or depend on automated ones? What works and what does not, and how can we improve our lives? What are the significant points that make the difference? We will cover release management of open source projects, security, automation, CI usage, and documentation. In the discussion, I will share my release automation journey with Ansible. We will follow the Chatham House Rule to provide space for an open, frank, and collaborative conversation.

So, here comes the days of code, collaboration, and community. See you all there.

PS: I miss my little Py-Lady volunteering at the booth.

July 08, 2024 09:56 AM UTC