Planet Python
Last update: March 21, 2023 04:42 AM UTC
March 21, 2023
Codementor
Python Positional and Keyword-Only Arguments
What do / and * mean in a Python function definition?
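As a quick illustration (this example is ours, not from the linked article): parameters before / are positional-only, and parameters after * are keyword-only:
def divide(a, b, /, *, rounded=False):
    # a and b are positional-only; rounded is keyword-only
    result = a / b
    return round(result) if rounded else result
divide(10, 4)                # OK: returns 2.5
divide(10, 4, rounded=True)  # OK: returns 2
# divide(a=10, b=4)          # TypeError: a and b are positional-only
# divide(10, 4, True)        # TypeError: rounded is keyword-only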
March 20, 2023
TestDriven.io
Django Performance Optimization Tips
This article looks at where potential performance issues can occur in a Django application and how to address them in order to speed up your app.
Django Weblog
Want to host DjangoCon Europe 2024?
DjangoCon Europe 2023 will be held May 29th-June 2nd in Edinburgh, Scotland, but we're already looking ahead to next year's conference. Could your town - or your football stadium, circus tent, private island or city hall - host this wonderful community event?
Hosting a DjangoCon is an ambitious undertaking. It's hard work, but each year it has been successfully run by a team of community volunteers, not all of whom have had previous experience - more important is enthusiasm, organisational skills, the ability to plan and manage budgets, time and people - and plenty of time to invest in the project.
You'll find plenty of support on offer from previous DjangoCon organisers, so you won't be on your own.
How to apply
If you're interested, we'd love to hear from you. Following the established tradition, the selected hosts will be announced at this year's DjangoCon by last year's organisers. The conference dates must fall more than one month away from DjangoCon US, PyCon US, and EuroPython in the same calendar year. In order to make the announcement at DjangoCon Europe, we will need to receive your proposal by May 10.
The more detailed and complete your proposal, the better. Things you should consider, and that we'd like to know about, are:
- dates (ideally between mid-May and mid-June 2024)
- numbers of attendees
- venue(s)
- accommodation
- transport links
- budgets and ticket prices
- committee members
We'd like to see:
- timelines
- pictures
- prices
- draft agreements with providers
- alternatives you have considered
Email your proposals to djangocon-europe-2024-proposals at djangoproject dot com. All of these details will help show that your plans are serious and thorough and that you have the organisational capacity to make the event a success.
We will be hosting a virtual informational session for those who are interested, or may be interested, in organising a DjangoCon. Please indicate your interest here.
If you have any questions or concerns about organising a DjangoCon, just drop us a line.
Python Morsels
What is a context manager?
Context managers power Python's with blocks. They sandwich a code block between enter code and exit code. They're most often used for reusing common cleanup/teardown functionality.

Table of contents
Files opened with with close automatically
Context managers are objects that can be used in Python's with statements.
You'll often see with statements used when working with files in Python.
This code opens a file, uses the f variable to point to the file object, reads from the file, and then closes the file:
>>> with open("my_file.txt") as f:
... contents = f.read()
...
Notice that we didn't explicitly tell Python to close our file.
But the file did close:
>>> f.closed
True
The file closed automatically when the with block was exited.
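To give a sense of the machinery behind this, here's a minimal sketch (ours, not from the article) of a class-based context manager implementing the enter and exit hooks that with relies on:
class OpenFile:
    # A simplified stand-in for open(), to illustrate the protocol
    def __init__(self, filename):
        self.filename = filename
    def __enter__(self):
        # Runs when the with block starts; its return value is bound by "as"
        self.file = open(self.filename)
        return self.file
    def __exit__(self, exc_type, exc_value, traceback):
        # Runs when the with block exits, even if an exception was raised
        self.file.close()

with OpenFile("my_file.txt") as f:
    contents = f.read()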
Context managers work in with statements
Any object that can be …
Read the full article: https://www.pythonmorsels.com/what-is-a-context-manager/
Real Python
Executing Python Scripts With a Shebang
When you read someone else’s Python code, you frequently see a mysterious line, which always appears at the top of the file, starting with the distinctive shebang (#!) sequence. It looks like a not-so-useful comment, but other than that, it doesn’t resemble anything else you’ve learned about Python, making you wonder what that is and why it’s there. As if that wasn’t enough to confuse you, the shebang line only appears in some Python modules.
In this tutorial, you’ll:
- Learn what a shebang is
- Decide when to include the shebang in Python scripts
- Define the shebang in a portable way across systems
- Pass arguments to the command defined in a shebang
- Know the shebang’s limitations and some of its alternatives
- Execute scripts through a custom interpreter written in Python
To proceed, you should have basic familiarity with the command line and know how to run Python scripts from it. You can also download the supporting materials for this tutorial to follow along with the code examples:
Free Sample Code: Click here to download the free sample code that you’ll use to execute Python scripts with a shebang.
What’s a Shebang, and When Should You Use It?
In short, a shebang is a special kind of comment that you may include in your source code to tell the operating system’s shell where to find the interpreter for the rest of the file:
#!/usr/bin/python3
print("Hello, World!")
If you’re using a shebang, it must appear on the first line in your script, and it has to start with a hash sign (#
) followed by an exclamation mark (!
), colloquially known as the bang, hence the name shebang. The choice of the hash sign to begin this special sequence of characters wasn’t accidental, as many scripting languages use it for inline comments.
You should make sure you don’t put any other comments before the shebang line if you want it to work correctly, or else it won’t be recognized! After the exclamation mark, specify an absolute path to the relevant code interpreter, such as Python. Providing a relative path will have no effect, unfortunately.
Note: The shebang is only recognized by shells, such as Z shell or Bash, running on Unix-like operating systems, including macOS and Linux distributions. It bears no particular meaning in the Windows terminal, which treats the shebang as an ordinary comment by ignoring it.
You can get the shebang to work on Windows by installing the Windows Subsystem for Linux (WSL) that comes with a Unix shell. Alternatively, Windows lets you make a global file association between a file extension like .py and a program, such as the Python interpreter, to achieve a similar effect.
It’s not uncommon to combine a shebang with the name-main idiom, which prevents the main block of code from running when someone imports the file from another module:
#!/usr/bin/python3
if __name__ == "__main__":
    print("Hello, World!")
With this conditional statement, Python will call the print() function only when you run this module directly as a script—for example, by providing its path to the Python interpreter:
$ python3 /path/to/your/script.py
Hello, World!
As long as the script’s content starts with a correctly defined shebang line and your system user has permission to execute the corresponding file, you can omit the python3 command to run that script:
$ /path/to/your/script.py
Hello, World!
A shebang is only relevant to runnable scripts that you wish to execute without explicitly specifying the program to run them through. You wouldn’t typically put a shebang in a Python module that only contains function and class definitions meant for importing from other modules. Therefore, use the shebang when you don’t want to prefix the command that runs your Python script with python or python3.
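As a hedged illustration (reusing the placeholder path from above), a common portable variant of the shebang locates the interpreter through env, which searches your PATH:
#!/usr/bin/env python3
print("Hello, World!")
After marking the file as executable, you can launch it directly:
$ chmod +x /path/to/your/script.py
$ /path/to/your/script.py
Hello, World!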
Note: In the old days of Python, the shebang line would sometimes appear alongside another specially formatted comment described in PEP 263:
#!/usr/bin/python3
# -*- coding: utf-8 -*-
if __name__ == "__main__":
    print("Grüß Gott")
The highlighted line used to be necessary to tell the interpreter which character encoding it should use to read your source code correctly, as Python defaulted to ASCII. However, this was only important when you directly embedded non-Latin characters, such as ü or ß, in your code.
This special comment is irrelevant today because modern Python versions use the universal UTF-8 encoding, which can handle such characters with ease. Nevertheless, it’s always preferable to replace tricky characters with their encoded representations using Unicode literals:
>>> "Grüß Gott".encode("unicode_escape")
b'Gr\\xfc\\xdf Gott'
Your foreign colleagues who have different keyboard layouts will thank you for that!
Now that you have a high-level understanding of what a shebang is and when to use it, you’re ready to explore it in more detail. In the next section, you’ll take a closer look at how it works.
How Does a Shebang Work?
Normally, to run a program in the terminal, you must provide the full path to a particular binary executable or the name of a command present in one of the directories listed on the PATH environment variable. One or more command-line arguments may follow this path or command:
$ /usr/bin/python3 -c 'print("Hello, World!")'
Hello, World!
$ python3 -c 'print("Hello, World!")'
Hello, World!
Here, you run the Python interpreter in a non-interactive mode against a one-liner program passed through the -c option. In the first case, you provide an absolute path to python3, while in the second case, you rely on the fact that the parent folder, /usr/bin/, is included on the search path by default. Your shell can find the Python executable, even if you don’t provide the full path, by looking through the directories on the PATH variable.
Note: If multiple commands with the same name exist in more than one directory listed on the PATH variable, then your shell will execute the first one it can find. As a result, the outcome of running a command without explicitly specifying the corresponding path may sometimes be surprising. It’ll depend on the order of directories in your PATH variable. However, this can be useful, as you’ll find out later.
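To see this ordering on your own machine, you can list every matching executable on your PATH. A hedged illustration (the paths will differ per system):
$ which -a python3
/usr/local/bin/python3
/usr/bin/python3
Here, the copy in /usr/local/bin/ would win, because its directory comes first on the PATH.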
Read the full article at https://realpython.com/python-shebang/ »
Mike Driscoll
PyDev of the Week: Pierre Raybaut
Today we welcome Pierre Raybaut (@pierreraybaut) as our PyDev of the Week! Pierre is the creator of Spyder, the Scientific Python IDE, as well as pythonxy and WinPython.
You can see what other projects Pierre is part of over on Pierre’s GitHub Profile.
Now let’s spend some time getting to know Pierre better!

Can you tell us a little about yourself (hobbies, education, etc):
The first code I wrote was an Applesoft BASIC program, on an Apple //e computer… I was 10 years old. Since then I have always managed to bring computers into everything I did, at home or at work. As I was an amateur astronomer and was also very fond of physics in general, I chose to follow scientific studies. A few years later, I specialized in optics and photonics and graduated from Institut d’Optique Graduate School, which is now part of Université Paris-Saclay. I then pursued a PhD in the field of femtosecond lasers. Although it was mainly experimental physics, I had the opportunity to develop a code for simulating regenerative amplification in ultra-short pulse lasers; I learned recently that this code is still used today! After my PhD, I worked as a research engineer at THALES Avionics, developing innovative head-up displays for aircraft. Then, in 2007, I joined the French Alternative Energies and Atomic Energy Commission (CEA), where I was hired as lead software developer for applications involving image and signal processing as well as scientific instrument control. In 2012, I was given a project management position for the Laser Mégajoule timing and fiducial system development. Four years later, I was appointed head of a research laboratory. Lastly, in 2018 I had the opportunity to join Codra, an industrial software company, as a Project Director. In addition to this position, I am currently the pre-sales manager for the department of engineering at Codra. And of course, I have also been involved in open-source software development since 2008.
Why did you start using Python?
I started using Python in 2008, after a long and meticulous evaluation of various solutions that might fit my needs. I had been part of a research team at CEA since early 2007. When I joined the team, every processing and acquisition application was written using commercial software. Some applications were getting huge and complex, with a lot of GUIs for editing tons of parameters or visualizing results. Robustness was the main concern, so I chose Python because it provided all the necessary tools for our research work (interactive computing and scientific data plotting) as well as the general-purpose libraries for building stable and robust applications. In 2008, when I started using and promoting Python amongst my colleagues, a piece of the puzzle was still missing: Python had no scientific-oriented IDE! That’s why, during my vacations, I began coding some tools to fill gaps in the Python ecosystem, using Qt GUIs. After writing a variable explorer GUI that could be used directly from a Python interpreter to interact with the current namespace, I wrote a Qt-based Python code editor, then a Qt-based Python console… and so on. After only a few weeks, this was done! This ultimately resulted in Spyder (Scientific PYthon Development EnviRonment), a scientific Python IDE that I first released to the public in September 2009: Python was finally a viable alternative to scientific commercial software. Today, thanks to a development team led by Carlos Cordoba since 2012, Spyder is widely used for data processing and visualization with Python (est. 500,000 downloads/day).
What other programming languages do you know and which is your favorite?
As you know, Python is quite open to other languages. Moreover, when using Python for signal or image processing, it is sometimes necessary to write extensions in C/C++ (or even Fortran) for performance reasons. For example, writing Fortran code for image processing is quite fun, because there is absolutely no interface code to take care of. Cython is also an elegant solution, as it allows a progressive optimization of a pure Python algorithm. Finally, on some projects implemented at Codra, I had to make adjustments to code written in C#. I also investigated projects using other languages (JavaScript, TypeScript, …). So I’ve been playing with a few languages, but Python is the one that has given me the most satisfaction, especially when trying to write clean code thanks to quality-related tools like Black, isort, or Pylint.
What projects are you working on now?
At Codra, I’m involved in a lot of projects as a Project Director (or technical expert), in various fields like supervisory systems, data acquisition, multi-protocol gateways, data processing, data visualization, etc. From time to time, I even play the role of Project Manager. This is how I’ve been involved lately in CodraFT development, which was supported by CEA. It is available freely on GitHub: this is a Python-Qt based software aiming at processing signals and images and visualizing them. Its main upside is testability: the objective was to create a data processing software with a high level of robustness. Data processing features are mainly based on NumPy, SciPy and scikit-image.
Which Python libraries are your favorite (core or 3rd party)?
At the moment, I’m quite fond of scikit-image for image processing; it has a nice, clean API and great documentation. OpenCV is also a great tool available to Python users and provides very efficient pattern detection algorithms, for example.
What are some of the big lessons you learned while working on Spyder or WinPython?
I think that the most important lesson I’ve learned during those years is that we need to collaborate with other people. Otherwise, in the end, projects will at best remain as good ideas, or will be discontinued. With Spyder and WinPython, the thing that I’m the most proud of is that I managed to trust someone else to take over the projects and maintain them: in both cases, it was a good decision and projects are still active and popular.
Is there anything else you’d like to say?
I recently had the opportunity to attend a conference around Jupyter (PyData Paris). I really admire the work that has been done around the Jupyter ecosystem. From the IPython version I played with in 2008 to today’s JupyterLab, what an achievement, from a technical point of view as well as in terms of community and project management!
Thanks for doing the interview, Pierre!
The post PyDev of the Week: Pierre Raybaut appeared first on Mouse Vs Python.
Django Weblog
Django 4.2 release candidate 1 released
Django 4.2 release candidate 1 is the final opportunity for you to try out the farrago of new features before Django 4.2 is released.
The release candidate stage marks the string freeze and the call for translators to submit translations. Provided no major bugs are discovered that can't be solved in the next two weeks, Django 4.2 will be released on or around April 3. Any delays will be communicated on the Django forum.
Please use this opportunity to help find and fix bugs (which should be reported to the issue tracker). You can grab a copy of the package from our downloads page or on PyPI.
The PGP key ID used for this release is Mariusz Felisiak: 2EF56372BA48CD1B.
Python GUIs
Working With Git and Github in Your Python Projects
Using a version control system (VCS) is crucial for any software development project. These systems allow developers to track changes to the project's codebase over time, removing the need to keep multiple copies of the project folder.
VCSs also facilitate experimenting with new features and ideas without breaking existing functionality in a given project. They also enable collaboration with other developers who can contribute code, documentation, and more.
In this article, we'll learn about Git, the most popular VCS out there. We'll learn everything we need to get started with this VCS and start creating our own repositories. We'll also learn how to publish those repositories to GitHub, another popular tool among developers nowadays.
Installing and Setting Up Git
To use Git in our coding projects, we first need to install it on our computer. To do this, we need to navigate to Git's download page and choose the appropriate installer for our operating system. Once we've downloaded the installer, we need to run it and follow the on-screen instructions.
We can check if everything is working correctly by opening a terminal or command-line window and running git --version.
Once we've confirmed the successful installation, we should provide Git with some personal information. You'll only need to do this once for every computer. Now go ahead and run the following commands with your own information:
$ git config --global user.name <"YOUR NAME">
$ git config --global user.email <name@email.com>
The first command adds your full name to Git's config file. The second command adds your email. Git will use this information in all your repositories.
If you publish your projects to a remote server like GitHub, then your email address will be visible to anyone with access to that repository. If you don't want to expose your email address this way, then you should create a separate email address to use with Git.
As you'll learn in a moment, Git uses the concept of branches to manage its repositories. A branch is a copy of your project's folder at a given time in the development cycle. The default branch of new repositories is named either master or main, depending on your current version of Git.
You can change the name of the default branch by running the following command:
$ git config --global init.defaultBranch <branch_name>
This command will set the name of Git's default branch to branch_name. Remember that this is just a placeholder name. You need to provide a suitable name for your installation.
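For example (main here is just our illustrative choice), to have every new repository start on a branch called main:
$ git config --global init.defaultBranch main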
Another useful setting is the default text editor Git will use to type in commit messages and other messages in your repo. For example, if you use an editor like Visual Studio Code, then you can configure Git to use it:
# Visual Studio Code
$ git config --global core.editor "code --wait"
With this command, we tell Git to use VS Code to process commit messages and any other message we need to enter through Git.
Finally, to inspect the changes we've made to Git's configuration files, we can run the following command:
$ git config --global -e
This command will open the global .gitconfig file in our default editor. There, we can fix any errors we've made or add new settings. Then we just need to save the file and close it.
Understanding How Git Works
Git works by allowing us to take a snapshot of the current state of all the files in our project's folder. Each time we save one of those snapshots, we make a Git commit. Then the cycle starts again, and Git creates new snapshots, showing how our project looked at any given moment.
Git was created in 2005 by Linus Torvalds, the creator of the Linux kernel. Git is an open-source project that is licensed under the GNU General Public License (GPL) v2. It was initially made to facilitate kernel development due to the lack of a suitable alternative.
The general workflow for making a Git commit to save a new snapshot goes through the following steps:
- Change the content of our project's folder.
- Stage or mark the changes we want to save in our next commit.
- Commit or save the changes permanently in our project's Git database.
As the third step mentions, Git uses a special database called a repository. This database is kept inside your project's directory under a folder called .git.
Version-Controlling a Project With Git: The Basics
In this section, we'll create a local repository and learn how to manage it using the Git command-line interface (CLI). On macOS and Linux, we can use the default terminal application to follow along with this tutorial.
On Windows, we recommend using Git Bash, which is part of the Git for Windows package. Go to the Git Bash download page, get the installer, run it, and follow the on-screen instructions. Make sure to check Additional Icons -> On the Desktop to get direct access to Git Bash on your desktop so that you can quickly find and launch the app.
Alternatively, you can also use either Windows' Command Prompt or PowerShell. However, some commands may differ from the commands used in this tutorial.
Initializing a Git Repository
To start version-controlling a project, we need to initialize a new Git repository in the project's root folder or directory. In this tutorial, we'll use a sample project to facilitate the explanation. Go ahead and create a new folder in your file system. Then navigate to that folder in your terminal by running these commands:
$ mkdir sample_project
$ cd sample_project
The first command creates the project's root folder or directory, while the second command allows you to navigate into that folder. Don't close your terminal window. You'll be using it throughout the next sections.
To initialize a Git repository in this folder, we need to use the git init command like in the example below:
$ git init
Initialized empty Git repository in /.../sample_project/.git/
This command creates a subfolder called .git inside the project's folder. The leading dot in the folder's name means that this is a hidden directory, so you may not see anything in your file manager. You can check the existence of .git with the ls -a command, which lists all files in a given folder, including the hidden ones.
Checking the Status of Our Project
Git provides the git status command to allow us to identify the current state of a Git repository. Because our sample_project folder is still empty, running git status will display something like this:
$ git status
On branch main
No commits yet
nothing to commit (create/copy files and use "git add" to track)
When we run git status, we get detailed information about the current state of our Git repository. This command is pretty useful, and we'll return to it at several points.
As an example of how useful the git status command is, go ahead and create a file called main.py inside the project's folder using the following commands:
$ touch main.py
$ git status
On branch main
No commits yet
Untracked files:
(use "git add <file>..." to include in what will be committed)
main.py
nothing added to commit but untracked files present (use "git add" to track)
With the touch command, we create a new main.py file under our project's folder. Then we run git status again. This time, we get information about the presence of an untracked file called main.py. We also get some basic instructions on how to add this file to our Git repo. Providing these guidelines is one of the neatest features of git status.
Now, what is all that about untracked files? In the following section, we'll learn more about this topic.
Tracking and Committing Changes
A file in a Git repository can be either tracked or untracked. Any file that wasn't present in the last commit is considered an untracked file. Git doesn't keep a history of changes for untracked files in your project's folder.
In our example, we haven't made any commits to our Git repo, so main.py is naturally untracked. To start tracking it, run the git add command as follows:
$ git add main.py
$ git status
On branch main
No commits yet
Changes to be committed:
(use "git rm --cached <file>..." to unstage)
new file: main.py
This git add command has added main.py to the list of tracked files. Now it's time to save the file permanently using the git commit command with an appropriate commit message provided with the -m option:
$ git commit -m "Add main.py"
[main (root-commit) 5ac6586] Add main.py
1 file changed, 0 insertions(+), 0 deletions(-)
create mode 100644 main.py
$ git status
On branch main
nothing to commit, working tree clean
We have successfully made our first commit, saving main.py to our Git repository. The git commit command requires a commit message, which we can provide through the -m option. Commit messages should clearly describe what we have changed in our project.
After the commit, our main branch is completely clean, as you can conclude from the git status output.
Now let's start the cycle again by modifying main.py, staging the changes, and creating a new commit. Go ahead and run the following commands:
$ echo "print('Hello, World!')" > main.py
$ cat main.py
print('Hello, World!')
$ git add main.py
$ git commit -m "Create a 'Hello, World!' script on main.py"
[main 2f33f7e] Create a 'Hello, World!' script on main.py
1 file changed, 1 insertion(+)
The echo command adds the statement "print('Hello, World!')" to our main.py file. You can confirm this addition with the cat command, which lists the content of one or more target files. You can also open main.py in your favorite editor and update the file there if you prefer.
We can also use the git stage command to stage or add files to a Git repository and include them in our next commit.
We've made two commits to our Git repo. We can list our commit history using the git log command as follows:
$ git log --oneline
2f33f7e (HEAD -> main) Create a 'Hello, World!' script on main.py
5ac6586 Add main.py
The git log command allows us to list all our previous commits. In this example, we've used the --oneline option to list each commit on a single line. This command shows its output in a dedicated space (a pager). To leave that space, we can press the letter Q on our keyboard.
Using a .gitignore File to Skip Unneeded Files
While working with Git, we will often have files and folders that we must not save to our Git repo. For example, most Python projects include a venv/ folder with a virtual environment for that project. Go ahead and create one with the following command:
$ python -m venv venv
Once we've added a Python virtual environment to our project's folder, we can run git status again to check the repo's state:
$ git status
On branch main
Untracked files:
(use "git add <file>..." to include in what will be committed)
venv/
nothing added to commit but untracked files present (use "git add" to track)
Now the venv/ folder appears as an untracked file in our Git repository. We don't need to keep track of this folder because it's not part of our project's codebase; it's only a tool for working on the project. So, we need to ignore this folder. To do that, we can add it to a .gitignore file.
Go ahead and create a .gitignore file in the project's folder. Add the venv/ folder to it and run git status:
$ touch .gitignore
$ echo "venv/" > .gitignore
$ git status
On branch main
Untracked files:
(use "git add <file>..." to include in what will be committed)
.gitignore
nothing added to commit but untracked files present (use "git add" to track)
Now git status doesn't list venv/ as an untracked file. This means that Git is ignoring that folder. If we take a look at the output, then we'll see that .gitignore is now listed as an untracked file. We must commit our .gitignore file to the Git repository. This will prevent other developers working with us from having to create their own local .gitignore files.
We can also list multiple files and folders in our .gitignore file, one per line. The file even accepts glob patterns to match specific types of files, such as *.txt. If you want to save yourself some work, then you can take advantage of GitHub's gitignore repository, which provides a rich list of predefined .gitignore files for different programming languages and development environments.
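As a hedged illustration, a small Python project's .gitignore might look something like this (the exact entries depend on your tooling):
# Virtual environments
venv/
# Python bytecode caches
__pycache__/
*.pyc
# Local configuration and build artifacts
.env
dist/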
We can also set up a global .gitignore file on our computer. This global file will apply to all our Git repositories. If you decide to use this option, then go ahead and create a .gitignore_global file in your home folder.
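Creating the file alone isn't enough, though; Git also needs to know where to find it. A minimal sketch, assuming the file lives in your home directory:
$ git config --global core.excludesFile ~/.gitignore_global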
Working With Branches in Git
One of the most powerful features of Git is that it allows us to create multiple branches. A branch is a copy of our project's current status and commit history. Having the option to create and handle branches allows us to make changes to our project without messing up the main line of development.
We'll often find that software projects maintain several independent branches to facilitate the development process. A common branch model distinguishes between four different types of branches:
- A main or master branch that holds the main line of development
- A develop branch that holds the latest developments
- One or more feature branches that hold changes intended to add new features
- One or more bugfix branches that hold changes intended to fix critical bugs
However, the branching model to use is up to you. In the following sections, we'll learn how to manage branches using Git.
Creating New Branches
Working all the time on the main or master branch isn't a good idea. We can end up creating a mess and breaking the code. So, whenever we want to experiment with a new idea, implement a new feature, fix a bug, or just refactor a piece of code, we should create a new branch.
To kick things off, let's create a new branch called hello in our Git repo under the sample_project folder. To do that, we can use the git branch command followed by the branch's name:
$ git branch hello
$ git branch --list
* main
hello
The first command creates a new branch in our Git repo. The second command allows us to list all the branches that currently exist in our repository. Again, we can press the letter Q on our keyboard to get back to the terminal prompt.
The star symbol denotes the currently active branch, which is main in the example. We want to work on hello, so we need to activate that branch. In Git's terminology, we need to check out to hello.
Checking Out to a New Branch
Although we have just created a new branch, in order to start working on it, we need to switch to it, or check it out, using the git checkout command as follows:
$ git checkout hello
Switched to branch 'hello'
$ git branch --list
main
* hello
$ git log --oneline
2f33f7e (HEAD -> hello, main) Create a 'Hello, World!' script on main.py
5ac6586 Add main.py
The git checkout command takes the name of an existing branch as an argument. Once we run the command, Git takes us to the target branch.
We can derive a new branch from whatever branch we need.
As you can see, git branch --list indicates which branch we are currently on by placing a * symbol in front of the relevant branch name. If we check the commit history with git log --oneline, then we'll get the same output as we got on main because hello is a copy of it.
The git checkout command can take a -b flag that we can use to create a new branch and immediately check it out in a single step. That's what most developers use while working with Git repositories. In our example, we could have run git checkout -b hello to create the hello branch and check it out with one command.
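For example, creating and switching to a hypothetical feature branch in a single step would look like this:
$ git checkout -b feature/greeting
Switched to a new branch 'feature/greeting'
On Git 2.23 and later, git switch -c feature/greeting achieves the same result with a more descriptive command name.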
Let's make some changes to our project and create another commit. Go ahead and run the following commands:
$ echo "print('Welcome to PythonGUIs!')" >> main.py
$ cat main.py
print('Hello, World!')
print('Welcome to PythonGUIs!')
$ git add main.py
$ git commit -m "Extend our 'Hello, World' program with a welcome message."
[hello be62476] Extend our 'Hello, World' program with a welcome message.
1 file changed, 1 insertion(+)
The final command committed our changes to the hello branch. If we compare the commit history of both branches, then we'll see the difference:
$ git log --oneline -1
be62476 (HEAD -> hello) Extend our 'Hello, World' program with a welcome message.
$ git checkout main
Switched to branch 'main'
$ git log --oneline -1
2f33f7e (HEAD -> main) Create a 'Hello, World!' script on main.py
In this example, we first run git log --oneline with -1 as an argument. This argument tells Git to give us only the last commit in the active branch's commit history. To inspect the commit history of main, we first need to check out that branch. Then we can run the same git log command.
Now say that we're happy with the changes we've made to our project in the hello branch, and we want to update main with those changes. How can we do this? We need to merge hello into main.
Merging Two Branches Together
To add the commits we've made in a separate branch back to another branch, we can run what is known as a merge. For example, say we want to merge the new commits in hello into main. In this case, we first need to switch back to main and then run the git merge command using hello as an argument:
$ git checkout main
Already on 'main'
$ git merge hello
Updating 2f33f7e..be62476
Fast-forward
main.py | 1 +
1 file changed, 1 insertion(+)
To merge a branch into another branch, we first need to check out the branch we want to update. Then we can run git merge. In the example above, we first check out main. Once there, we can merge hello.
Deleting Unused Branches
Once we've finished working in a given branch, we can delete the entire branch to keep our repo as clean as possible. Following our example, now that we've merged hello into main, we can remove hello.
To remove a branch from a Git repo, we use the git branch command with the --delete option. To successfully run this command, make sure to switch to another branch first:
$ git checkout main
Already on 'main'
$ git branch --delete hello
Deleted branch hello (was be62476).
$ git branch --list
* main
Deleting unused branches is a good way to keep our Git repositories clean, organized, and up to date. Also, deleting them as soon as we finish the work is even better, because having old branches around may be confusing for other developers collaborating on our project. They might end up wondering why these branches are still alive.
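Note that --delete only removes branches whose changes have already been merged. As a hedged aside, if you're certain you want to discard an unmerged branch (the branch name here is made up), Git provides a force variant:
$ git branch -D experiment
The -D flag is shorthand for --delete --force and throws away any unmerged commits, so use it with care.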
Using a GUI Client for Git
In the previous sections, we've learned to use the git command-line tool to manage Git repositories. If you prefer to use GUI tools, then you'll find a bunch of third-party GUI frontends for Git. While they won't completely replace the need for the command-line tool, they can simplify your day-to-day workflow.
You can get a complete list of standalone GUI clients available on the Git official documentation.
Most popular IDEs and code editors, including PyCharm and Visual Studio Code, come with basic Git integration out-of-the-box. Some developers will prefer this approach as it is directly integrated with their development environment of choice.
If you need something more advanced, then GitKraken is probably a good choice. This tool provides a standalone, cross-platform GUI client for Git that comes with many additional features that can boost your productivity.
Managing a Project With GitHub
If we publish a project on a remote server with support for Git repositories, then anyone with appropriate permissions can clone our project, creating a local copy on their computer. Then, they can make changes to our project, commit them to their local copy, and finally push the changes back to the remote server. This workflow provides a straightforward way to allow other developers to contribute code to your projects.
In the following sections, we'll learn how to create a remote repository on GitHub and then push our existing local repository to it. Before we do that, though, head over to GitHub.com and create an account there if you don't have one yet. Once you have a GitHub account, you can set up the connection to that account so that you can use it with Git.
Setting Up a Secure Connection to GitHub
In order to work with GitHub via the git command, we need to be able to authenticate ourselves. There are a few ways of doing that. However, using SSH is the recommended way. The first step in the process is to generate an SSH key, which you can do with the following command:
$ ssh-keygen -t ed25519 -C "GitHub - name@email.com"
Replace the placeholder email address with the address you've associated with your GitHub account. Once you run this command, you'll get three different prompts in a row. You can respond to them by pressing Enter to accept the default option. Alternatively, you can provide custom responses.
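Depending on your setup, you may also need to load the new key into your SSH agent before Git can use it. A hedged extra step (the agent output will differ on your machine):
$ eval "$(ssh-agent -s)"
Agent pid 59566
$ ssh-add ~/.ssh/id_ed25519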
Next, we need to copy the contents of our id_ed25519.pub file. To do this, you can run the following command:
$ cat ~/.ssh/id_ed25519.pub
Select the command's output and copy it. Then go to your GitHub Settings page and click the SSH and GPG keys option. There, select New SSH key, set a descriptive title for the key, make sure that the Key Type is set to Authentication Key, and paste the contents of id_ed25519.pub into the Key field. Finally, click the Add SSH key button.
At this point, you may be asked to provide some kind of Two-Factor Authentication (2FA) code. So, be ready for that extra security step.
Now we can test our connection by running the following command:
$ ssh -T git@github.com
The authenticity of host 'github.com (IP ADDRESS)' can not be established.
ECDSA key fingerprint is SHA256:p2QAMXNIC1TJYWeIOttrVc98/R1BUFWu3/LiyKgUfQM.
Are you sure you want to continue connecting (yes/no/[fingerprint])?
Make sure to check whether the key fingerprint shown on your output matches GitHub's public key fingerprint. If it matches, then enter yes and press Enter to connect. Otherwise, don't connect.
If the connection is successful, we will get a message like this:
Hi USERNAME! You have successfully authenticated, but GitHub does not provide shell access.
Congrats! You've successfully connected to GitHub via SSH using a secure SSH key. Now it's time to start working with GitHub.
Creating and Setting Up a GitHub Repository
Now that you have a GitHub account with a proper SSH connection, let's create a remote repository on GitHub using its web interface. Head over to the GitHub page and click the + icon next to your avatar in the top-right corner. Then select New repository.
Give your new repo a unique name and choose who can see this repository. To continue with our example, we can give this repository the same name as our local project, sample_project.
To avoid conflicts with your existing local repository, don't add .gitignore, README, or LICENSE files to your remote repository.
Next, set the repo's visibility to Private so that no one else can access the code. Finally, click the Create repository button at the end of the page.
If you create a Public repository, make sure also to choose an open-source license for your project to tell people what they can and can't do with your code.
You'll get a Quick setup page, as your remote repository has no content yet. Right at the top, you'll have the choice to connect to this repository via HTTPS or SSH. Copy the SSH link and run the following command to tell Git where the remote repository is hosted:
$ git remote add origin git@github.com:USERNAME/sample_project.git
This command adds a new remote repository called origin to our local Git repo.
The name origin is commonly used to denote the main remote repository associated with a given project. This is the default name Git uses to identify the main remote repo.
Git allows us to add several remote repositories to a single local one using the git remote add command. This allows us to have several remote copies of our local Git repo.
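For instance (a hedged sketch where the upstream URL is made up), you might track both your own fork and the original project, and then list everything that's configured:
$ git remote add upstream git@github.com:SOMEUSER/sample_project.git
$ git remote -v
origin    git@github.com:USERNAME/sample_project.git (fetch)
origin    git@github.com:USERNAME/sample_project.git (push)
upstream  git@github.com:SOMEUSER/sample_project.git (fetch)
upstream  git@github.com:SOMEUSER/sample_project.git (push)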
Pushing a Local Git Repository to GitHub
With a new and empty GitHub repository in place, we can go ahead and push the content of our local repo to its remote copy. To do this, we use the git push command, providing the target remote repo and the local branch as arguments:
$ git push -u origin main
Enumerating objects: 9, done.
Counting objects: 100% (9/9), done.
Delta compression using up to 8 threads
Compressing objects: 100% (4/4), done.
Writing objects: 100% (9/9), 790 bytes | 790.00 KiB/s, done.
Total 9 (delta 0), reused 0 (delta 0), pack-reused 0
To github.com:USERNAME/sample_project.git
* [new branch] main -> main
branch 'main' set up to track 'origin/main'.
This is the first time we push something to the remote repo sample_project, so we use the -u option to tell Git that we want to set the local main branch to track the remote main branch. The command's output provides a pretty detailed summary of the process.
Note that if you don't add the -u option, then Git will ask you what you want to do. A safe workaround is to copy and paste the commands GitHub suggests, so that you don't forget -u.
Using the same command, we can push any local branch to any remote copy of our project's repo. Just change the repo and branch name: git push -u remote_name branch_name.
Now let's head over to our browser and refresh the GitHub page. We will see all of our project files and commit history there.
Now we can continue developing our project and making new commits locally. To push our commits to the remote main branch, we just need to run git push. This time, we don't have to use the remote or branch name because we've already set main to track origin/main.
Pulling Content From a GitHub Repository
We can do basic file editing and make commits within GitHub itself. For example, if we click the main.py file and then click the pencil icon at the top of the file, we can add another line of code and commit those changes to the remote main branch directly on GitHub.
Go ahead and add the statement print("Your Git Tutorial is Here...") to the end of main.py. Then go to the end of the page and click the Commit changes button. This makes a new commit on your remote repository.
This remote commit won't appear in your local commit history. To download it and update your local main branch, use the git pull command:
$ git pull
remote: Enumerating objects: 5, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 3 (delta 0), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (3/3), 696 bytes | 174.00 KiB/s, done.
From github.com:USERNAME/sample_project
be62476..605b6a7 main -> origin/main
Updating be62476..605b6a7
Fast-forward
main.py | 1 +
1 file changed, 1 insertion(+)
Again, the command's output provides all the details about the operation. Note that git pull will download the remote branch and update the local branch in a single step.
If we want to download the remote branch without updating the local one, then we can use the git fetch command (https://git-scm.com/docs/git-fetch). This practice gives us the chance to review the changes and apply them to our local repo only if they look right.
For example, go ahead and update the remote copy of main.py by adding another statement like print("Let's go!!"). Commit the changes. Then get back to your local repo and run the following command:
$ git fetch
remote: Enumerating objects: 5, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 3 (delta 0), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (3/3), 731 bytes | 243.00 KiB/s, done.
From github.com:USERNAME/sample_project
605b6a7..ba489df main -> origin/main
This command downloaded the latest changes from origin/main to our local repo. Now we can compare the remote copy of main.py to the local copy. To do this, we can use the git diff command as follows:
$ git diff main origin/main
diff --git a/main.py b/main.py
index be2aa66..4f0e7cf 100644
--- a/main.py
+++ b/main.py
@@ -1,3 +1,4 @@
print('Hello, World!')
print('Welcome to PythonGUIs!')
print("Your Git Tutorial is Here...")
+print("Let's go!!")
In the command's output, you can see that the remote branch adds a line containing print("Let's go!!") to the end of main.py. This change looks good, so we can use git pull to bring the change into our local branch automatically.
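Since git fetch already downloaded the commits, a hedged alternative is to merge the remote-tracking branch directly instead of pulling again; the output should look roughly like this:
$ git merge origin/main
Updating 605b6a7..ba489df
Fast-forward
 main.py | 1 +
 1 file changed, 1 insertion(+)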
Exploring Alternatives to GitHub
While GitHub is the most popular public Git server and collaboration platform in use, it is far from being the only one. GitLab.com and Bitbucket are popular commercial alternatives similar to GitHub. While they have paid plans, both offer free plans, with some restrictions, for individual users.
However, if you would like to use a completely open-source platform instead, Codeberg might be a good option. It's a community-driven alternative with a focus on supporting Free Software. Therefore, in order to use Codeberg, your project needs to use a compatible open-source license.
Optionally, you can also run your own Git server. While you could just use barebones git for this, software such as GitLab Community Edition (CE) and Forgejo provides you with both the benefits of running your own server and the experience of using a service like GitHub.
Conclusion
By now, you're able to use Git for version-controlling your projects. Git is a powerful tool that will make you much more efficient and productive, especially as the scale of your project grows over time.
While this guide introduced you to most of its basic concepts and common commands, Git has many more commands and options that you can use to be even more productive. Now, you know enough to get up to speed with Git.
Armin Ronacher
Lessons from a Pessimist: Make Your Pessimism Productive
This year I decided that I want to share my most important learnings about engineering, teams and quite frankly personal mental health. My hope is that those who want to learn from me find it useful.
I consider myself a functional and pragmatic pessimist. I tend to err on the side of anticipating the worst outcome most of the time. This mindset often leads me to assume that things are more difficult than they actually are, but it also highlights potential pitfalls along the way. In some ways, this is a coping mechanism, but it also aids in problem-solving and sets my expectations low, frequently resulting in pleasant surprises.
However, in recent years, I've increasingly encountered a different kind of pessimism in others, one that I deem destructive. This type of pessimism sees no good in the world and leaves people feeling powerless. I thought it might be worthwhile to share why I am not entirely consumed by gloom.
Destructive pessimism involves either wanting or expecting things to fail. At first glance, the aspect of not expecting success may appear similar to how I operate, but there's a subtle distinction. I generally anticipate that things will be challenging but still achievable, and when it matters, I want them to succeed. An extreme example of destructive pessimism on the other hand is expecting climate change to end the world and assuming society will do nothing to prevent it.
Whatever I personally do, I want it to be successful. I don't search for reasons why something won't work; instead, I focus on how to make it work while addressing or avoiding the issues I see along the way. That does not make me an optimist; it just makes me someone who wants to get stuff done and who strives for positive outcomes. On the other hand, optimism to me is expecting to succeed against all odds, something I do not do. I fully expect that there will be failure along the way. (I also love venting about stuff I don't like even if it's not at all productive.)
Many individuals in today's economy worry about their retirement and harbor a general negative sentiment about nearly everything, from the unfairness of the labor market and increasing poverty to climate change and more. Believe it or not, I share much of this negative sentiment, but I've learned never to let such thoughts govern my life. Dwelling on negativity regarding your employer, job prospects, government, economy, or environment — especially when it's difficult to influence these aspects — leads to nothing but unhappiness and depression.
Our times are marked by a number of transformative events. A recent conversation I had with some folks about AI is, I think, quite illustrative of how you can be a pessimist yet still be excited and forward-looking. What's happening with AI at the moment makes a lot of people deeply uncomfortable. Some think that their job is at risk; others are trying to fight that future out of fear by attacking its foundations from all kinds of different angles, from copyright law to various moral arguments to downplaying the status-quo capabilities of AI. All of these things are absolutely worth considering! You might remember that I myself recently posted something here that outlines some of the potential issues with AI. Nevertheless, AI will continue to advance, and being afraid of it is simply unproductive. Rather than becoming despondent about AI, my pessimistic side assumes that things can go wrong and acts accordingly, all while giving the technology a fair chance.
I am absolutely convinced that it's important to recognize the difference between a pragmatic form of pessimism and destructive pessimism. And as cheesy as it sounds, try to surround yourself with supportive individuals who can help you maintain a positive outlook and try to be that person for others. You don't have to be an optimist for wanting to succeed!
March 19, 2023
ListenData
ChatGPT-4 Is a Smart Analyst, Unlike GPT-3.5
ChatGPT has been trending on social media platforms. It crossed one million users in just a week's time. For those who haven't heard of ChatGPT, it's a large language model trained by OpenAI. In simple words, it's a chat bot that answers your questions, and the responses it provides can sound human-like. It's an impressive machine learning solution. With the release of GPT-4, we can rely on it over Google search for learning about any topic.
Update: I updated this article with reviews on GPT-4.
You can't trust GPT-3.5 for preparing for any certification or exam. It's a big no if you think you can rely on GPT-3.5 to answer questions in a telephone interview round. Yes, I know it's cheating to even use Google for that, but I wanted to give a warning, as many people do this, and many social media influencers have posted about how to leverage GPT-3.5 for cracking interviews. After the release of GPT-4, I can confidently say that it will be a useful resource for exam preparation. GPT-4 is a huge improvement over GPT-3.5. It can stun you with its ability to create human-like responses with very high precision on facts and creativity.
Brian Okken
Sharing is Caring - Sharing pytest Fixtures - PyCascades 2023
Slides and code and such for a talk for PyCascades 2023. Talk page: Sharing is Caring - Sharing pytest Fixtures Scheduled Time: Sunday, March 19, 11:30 am Summary: pytest rocks, obviously. When people start using pytest as a team, they often come up with cool fixtures that would be great to share across projects. In fact, many great Python packages come pre-loaded with pytest fixtures.
March 18, 2023
Glyph Lefkowitz
Building And Distributing A macOS Application Written in Python
Why Bother With All This?
In other words: if you want to run on an Apple platform, why not just write everything in an Apple programming language, like Swift? If you need to ship to multiple platforms, you might have to rewrite it all anyway, so why not give up?
Despite the significant investment that platform vendors make in their tools, I fundamentally believe that the core logic in any software application ought to be where its most important value lies. For small, independent developers, having portable logic that can be faithfully replicated on every platform without massive rework might be tricky to get started with, but if you can’t do it, it may not be cost effective to support multiple platforms at all.
So, it makes sense for me to write my applications in Python to achieve this sort of portability, even though on each platform it’s going to be a little bit more of a hassle to get it all built and shipped since the default tools don’t account for the use of Python.
But how much more is “a little bit” more of a hassle? I’ve been slowly learning about the pipeline to ship independently-distributed1 macOS applications for the last few years, and I’ve encountered a ton of annoying roadblocks.
Didn’t You Do This Already? What’s New?
So nice of you to remember. Thanks for asking. While I’ve gotten this to mostly work in the past, some things have changed since then:
- the notarization toolchain has been updated (altool is now notarytool),
- I’ve had to ship libraries other than just PyGame,
- Apple Silicon launched, necessitating another dimension of build complexity to account for multiple architectures,
- Perhaps most significantly, I have written a tool that attempts to encode as much of this knowledge as possible, Encrust, which I have put on PyPI and GitHub. If this is of interest to you, I would encourage you to file bugs on it, and hopefully add in more corner cases which I have missed.
I’ve also recently shipped my first build of an end-user application that successfully launches on both Apple Silicon and Intel macs, so here is a brief summary of the hoops I needed to jump through, from the beginning, in order to make everything work.
Wait did you say you wrote a tool? Is this fixed, then?
Encrust is, I hope, a temporary stopgap on the way to a much better comprehensive solution.
Specifically, I believe that Briefcase is a much more holistic solution to the general problem being described here, but it doesn’t suit my very specific needs right now4, and it doesn’t address a couple of minor points that I was running into here.
It is mostly glue that is shelling out to other tools that already solve portions of the problem, even when better APIs exist. It addresses three very specific layers of complexity:
- It enforces architecture independence, so that your app built on an M1 machine will still actually run on about half of the macs remaining out there2.
- It remembers tricky nuances of the notarization submission process, such as the highly specific way I need to generate my zip files to avoid mysterious notarization rejections3.
- Providing a common and central way to store the configuration for these things across repositories so I don’t need to repeat this process and copy/paste a shell script every time I make a tiny new application.
It only works on Apple Silicon macs, because I didn’t bother to figure out how pip actually determines which architecture to download wheels for.
As such, unfortunately, Encrust is mostly a place for other people who have already solved this problem to collaborate to centralize this sort of knowledge and share ideas about where this code should ultimately go, rather than a tool for users trying to get started with shipping an app.
Open Offer
That said:
- I want to help Python developers ship their Python apps to users who are not also Python developers.
- macOS is an even trickier platform to do that on than most.
- It’s now easy for me to sign, notarize, and release new applications reliably
Therefore:
If you have an open source Python application that runs on macOS5 but can't ship to macOS — because:
- you've gotten stuck on one of the roadblocks that this post describes,
- you don't have $100 to give to Apple, or
- the app is using a cross-platform toolkit that should work just fine and you don't have access to a mac at all —
then send me an email and I'll sign and post your releases.
What’s this post about, then?
People still frequently complain that "Python packaging" is really bad. And I'm on record that packaging Python (in the sense of "code") for Python (in the sense of "deployment platform") is actually kind of fine right now; if what you're trying to get to is a package that can be pip installed, you can have a reasonably good experience, modulo a few small onboarding hiccups that are well-understood in the community and fairly easy to overcome.
However, it’s still unfortunately hard to get Python code into the hands of users who are not also Python programmers with their own development environments.
My goal here is to document the difficulties themselves to try to provide a snapshot of what happens if you try to get started from scratch today. I think it is useful to record all the snags and inscrutable error messages that you will hit in a row, so we can see what the experience really feels like.
I hope that everyone will find it entertaining:
- Other Mac python programmers might find pieces of trivia useful, and
- Linux users will have fun making fun of the hoops we have to jump through on Apple platforms,
but the main audience is the maintainers of tools like Briefcase and py2app, so they can evaluate the new-user experience holistically and see how much using their tools still feels like this. This necessarily includes the parts of the process that are not actually packaging.
This is why I’m starting from the beginning again, and going through all the stuff that I’ve discussed in previous posts again, to present the whole experience.
Here Goes
So, with no further ado, here is a non-exhaustive list of frustrations that I have encountered in this process:
- Okay. Time to get started. How do I display a GUI at all? Nothing happens when I call a nominally GUI API. Oops: I need my app to exist in an app bundle, which means I need to have a framework build. Time to throw those partially-broken pyenv pythons in the trash; best to use the official python.org build from here on out.
- Bonus Frustration since I'm using AppKit directly: why is my app segfaulting all the time? Oh, target is a weak reference in Objective-C, so if I make a window and put a button in it that points at a Python object, the Python interpreter deallocates it immediately because only the window (which is "nothing" as it's a weakref) is referring to it. I need to start stuffing every Python object that talks to a UI element like a window or a button into a global list (sketched just after this list), or manually calling .retain() on all of them and hoping I don't leak memory.
- Everything seems to be using the default Python Launcher icon, and the app menu says "Python". That wouldn't look too good to end users. I should probably have my own app.
- I’ll skip the part here where the author of a new application might have to investigate py2app, briefcase, pyoxidizer, and pyinstaller separately and try to figure out which one works the best right now. As I said above, I started with py2app and I’m stubborn to a fault, so that is the one I’m going to make work.
- Now I need to set up py2app. Oops, I can't use pyproject.toml any more; time to go back to setup.py.
- Now I built it, and the app is crashing on startup when I click on it. I can't see a traceback anywhere, so I guess I need to do something in the console.
- Wow; the console is an unusable flood of useless garbage. Forget that.
- I guess I need to run it in the terminal somehow. After some googling I figure out it's ./dist/MyApp.app/Contents/Resources/MacOS/MyApp. Aha, okay, I can see the traceback now, and it's … an import error?
- Ugh, py2app isn't actually including all of my code; it's using some magic to figure out which modules are actually used, but it's doing it by traversing import statements, which means I need to put a bunch of fake static import statements for everything that is used indirectly at the top of my app's main script so that it gets found by the build. I experimentally discover half a dozen things that are dynamically imported inside libraries that I use and jam them all in there.
- Okay. Now at least it starts up. The blank app icon is uninspiring, though; time to actually get my own icon in there. Cool, I'll make an icon in my favorite image editor, and save it as… icons must be PNGs, right? Uhh… no, looks like they have to be .icns files. But luckily I can convert the PNG I saved with a simple 12-line shell script that invokes sips and iconutil6.
At this point I have an app bundle which kinda works. But in order to run on anyone else’s computer, I have to code-sign it.
- In order to code-sign anything, I have to have an account with Apple that costs $99 per year, on developer.apple.com.
- The easiest way to get these certificates is to log in to Xcode itself. There's a web portal too, but using it appears to involve a lot more manual management of key material, so, no thanks. This requires the full-fat Xcode.app though, not just the command-line tools that come down when I run xcode-select --install, so, time to wait for an 11GB download.
- Oops, I made the wrong certificate type. Apparently the only right answer here is a "Developer ID Application" certificate.
- Now that I've logged in to Xcode to get the certificate, I need to figure out how to tell my command-line tools about it (for starters, "codesign"). Looks like I need to run security find-identity -v -p codesigning.
. - Time to sign the application’s code.
- The codesign tool has a --deep option which can sign the whole bundle. Great!
- Except, that doesn't work, because Python ships shared libraries in locations that macOS doesn't automatically expect, so I have to manually locate those files and sign them, invoking codesign once for each.
- Also, --deep is deprecated. There's no replacement.
- Logically, it seems like I still need --deep, because it does some poorly-explained stuff with non-code resource files that maybe doesn't happen properly if I don't? Oh well. Let's drop the option and hope for the best.8
- With a few heuristics I think we can find all the relevant files with a little script7 (a rough sketch follows below).
Now my app bundle is signed! Hooray. 12 years ago, I’d be all set. But today I need some additional steps.
- After I sign my app, Apple needs to sign my app (to indicate they've checked it for malware), which is called "notarization".
- In order to be eligible for notarization, I can’t just code-sign my app. I have to code-sign it with entitlements.
- Also I can’t just code sign it with entitlements, I have to sign it with the hardened runtime, or it fails notarization.
- Oops, out of the box, the hardened runtime is incompatible with a bunch of stuff in Python, including cffi and ctypes, because nobody has implemented support for MAP_JIT yet, so it crashes at startup. After some thrashing around I discover that I need a legacy "allow unsigned executable memory" entitlement. I can't avoid importing this, because a bunch of things in py2app's bootstrapping code import things that use ctypes, and probably a bunch of packages which I'm definitely going to need, like cryptography, require cffi directly anyway.
- In order to set up notarization external to Xcode, I need to create an App Password, which is set up at appleid.apple.com, not the developer portal.
- Bonus Frustration since I've been doing this for a few years: originally this used to be even more annoying, as I needed to wait for an email (with altool), and so I couldn't script it directly. Now, at least, the new notarytool (which will shortly be mandatory) has a --wait flag.
- Although the tool is documented under man notarytool, I actually have to run it as xcrun notarytool, even though codesign can be run either directly or via xcrun codesign.
. - Great, we’re ready to zip up our app and submit to Apple. Wait, they’re rejecting it? Why???
- Aah, I need to manually copy and paste the UUID in the console output of xcrun notarytool submit into xcrun notarytool log to get some JSON that has some error messages embedded in it.
to get some JSON that has some error messages embedded in it. - Oh. The bundle contains internal symlinks, so when I zipped it without
the
-y
option, I got a corrupt archive. - Great, resubmitted with
zip -y
. - Oops, just kidding, that only works sometimes. Later, a different submission with a different hash will fail, and I’ll learn that the correct command line is actually
ditto -c -k --sequesterRsrc --keepParent MyApp.app MyApp.app.zip
.- Note that, for extra entertainment value, the position of the archive
itself and directory are reversed on the command line from
zip
(andtar
, and every other archive tool).
- Note that, for extra entertainment value, the position of the archive
itself and directory are reversed on the command line from
notarytool
doesn’t record anything in my app though; it puts the “notarization ticket” on Apple's servers. Apparently, I still need to runstapler
for users to be able to launch it while those servers are inaccessible, like, for example, if they’re offline.- Oops, not
stapler
.xcrun stapler
. Whatever. - Except
notarytool
operates on a zip archive, butstapler
operates on an app bundle. So we have to save the original app bundle, runstapler
on it, then re-archive the whole thing into a new archive.
Hooray! Time to release my great app!
- Whoops, just got a bug report that it crashes immediately on every Intel mac. What’s going on?
- Turns out I'm using a library whose authors distribute both aarch64 and x86_64 wheels; pip will prefer single-architecture wheels even if universal2 wheels are also available, so I've got to somehow get fat binaries put together. Am I going to have to build a huge pile of C code by myself? I thought all these Python hassles would at least let me avoid the C hassles!
- Whew, okay, no need for that: there's an amazing Swiss-army knife for macOS binary wheels, called delocate, that includes a delocate-fuse tool that can fuse two wheels together. So I just need to figure out which binaries are the wrong architecture and somehow install my fixed/fused wheels before building my app with py2app.
- Except, oops, this tool just rewrites the file in-place without even changing its name, so I have to write some janky shell scripts to do the reinstallation (see the sketch a little further below). Ugh.
- OK, now that all that is in place, I just need to re-do all the steps:
  - universal2-ize my virtualenv!
  - build!
  - sign!
  - archive!
  - notarize!
  - wait!!!
  - staple!
  - re-archive!
  - upload!
And we have an application bundle we can ship to users.
It’s just that easy.
As long as I don’t need sandboxing or Mac App Store distribution, of course. That’s a challenge for another day.
So, that was terrible. But what should be happening here?
Some of this is impossible to simplify beyond a certain point - many of the things above are not really about Python, but are about distribution requirements for macOS specifically, and we in the Python community can’t affect operating system vendors’ tooling.
What we can do is build tools that produce clear guidance on what step is required next, handle edge cases on their own, and generally guide users through these complex processes without requiring them to hit weird binary-format or cryptographic-signing errors on their own with no explanation of what to do next.
I do not think that documentation is the answer here. The necessary steps should be discoverable. If you need to go to a website, the tool should use the webbrowser module to open a website. If you need to launch an app, the tool should launch that app.
With Encrust, I am hoping to generalize the solutions that I found while working on this for this one specific slice of the app distribution pipeline — i.e. a macOS desktop application, as distributed independently and not through the mac app store — but other platforms will need the same treatment.
However, even without really changing py2app or any of the existing tooling, we could imagine a tool that would interactively prompt the user for each manual step, automate as much of it as possible, verify that it was performed correctly, and give comprehensible error messages if it was not.
For a lot of users, this full code-signing journey may not be necessary; if you just want to run your code on one or two friends’ computers, telling them to right click, go to ‘open’ and enter their password is not too bad. But it may not even be clear to them what the trade-off is, exactly; it looks like the app is just broken when you download it. The app build pipeline should tell you what the limitations are.
Other parts of this just need bug-fixes to address. py2app specifically, for example, could have a better self-test for its module-collecting behavior, launching an app to make sure it didn’t leave anything out.
Interactive prompts to set up a Homebrew tap, or a Flatpak build, or a Microsoft Store Metro app, might be similarly useful. These all have outside-of-Python required manual steps, and all of them are also amenable to at least partial automation.
Thanks to my patrons on Patreon for supporting this sort of work, including development of Encrust, of Pomodouroboros, of posts like this one and of that offer to sign other people’s apps. If you think this sort of stuff is worthwhile, you might want to consider supporting me over there as well.
- I am not even going to try to describe building a sandboxed, app-store ready application yet. ↩
- At least according to the Steam Hardware Survey, which as of this writing in March of 2023 pegs the current user-base at 54% apple silicon and 46% Intel. The last version I can convince the Internet Archive to give me, from December of 2022, has it closer to 51%/49%, which suggests a transition rate of 1% per month. I suspect that this is pretty generous to Apple Silicon as Steam users would tend to be earlier adopters and more sensitive to performance, but mostly I just don't have any other source of data. ↩
- It is truly remarkable how bad the error reporting from the notarization service is. There are dozens of articles and forum posts around the web like this one where someone independently discovers this failure mode after successfully notarizing a dozen or so binaries and then suddenly being unable to do so any more because one of the bytes in the signature is suddenly not valid UTF-8 or something. ↩
- A lot of this is probably historical baggage; I started with py2app in 2008 or so, and I have been working on these apps in fits and starts for… ugh… 15 years. At some point when things are humming along and there are actual users, a more comprehensive retrofit of the build process might make sense but right now I just want to stop thinking about this. ↩
- If your application isn't open source, or if it requires some porting work, I'm also available for light contract work, but it might take a while to get on my schedule. Feel free to reach out as well, but I am not looking to spend a lot of time doing porting work. ↩
- I find this particular detail interesting; it speaks to the complexity and depth of this problem space that this has been a known issue for several years in Briefcase, but there's just so much other stuff to handle in the release pipeline that it remains open. ↩
- I forgot both .a files and the py2app-included python executable itself here, and had to discover that gap when I signed a different app where that made a difference. ↩
- Thus far, it seems to be working. ↩
Hynek Schlawack
How to Automatically Switch to Rosetta With Fish and Direnv
I love my Apple silicon computer, but having to manually switch to Rosetta-enabled shells for my Intel-only projects was a bummer.
Talk Python to Me
#407: pytest tips and tricks for better testing
If you're like most people, the simplicity and ease of getting started is a big part of pytest's appeal. But beneath that simplicity, there is a lot of power and depth. We have Brian Okken on this episode to dive into his latest pytest tips and tricks for beginners and power users.
Links from the show:
- pytest tips and tricks article: https://pythontest.com/pytest-tips-tricks/
- Getting started with pytest Course: https://training.talkpython.fm/courses/getting-started-with-testing-in-python-using-pytest
- pytest book: https://pythontest.com/pytest-book/
- Python Bytes podcast: https://pythonbytes.fm
- Brian on Mastodon: @brianokken@fosstodon.org
- Hypothesis: https://hypothesis.readthedocs.io/en/latest/
- Hypothesis: Reproducibility: https://hypothesis.readthedocs.io/en/latest/reproducing.html
- Get More Done with the DRY Principle: https://zapier.com/blog/dont-repeat-yourself/
- "The Key" Keyboard: https://stackoverflow.blog/2021/03/31/the-key-copy-paste/
- pytest plugins: https://docs.pytest.org/en/7.1.x/reference/plugin_list.html
- Watch this episode on YouTube: https://www.youtube.com/watch?v=qQ6b7OwT124
- Episode transcripts: https://talkpython.fm/episodes/transcript/407/pytest-tips-and-tricks-for-better-testing
Matt Layman
Locking Down Your Users' Secrets: Django Sessions 101
Django is a powerful and popular web framework that makes it easy to build robust and secure web applications. One of the key features of Django is its ability to manage user sessions, which are essential for many web applications. However, you may be wondering if Django sessions are secure. In this article, we’ll explore the security of Django sessions and see how they can be made even more secure.
March 17, 2023
Stack Abuse
DBSCAN with Scikit-Learn in Python
Introduction
You are working at a consulting company as a data scientist. The project you are currently assigned to has data from students who have recently finished courses about finances. The financial company that conducts the courses wants to understand if there are common factors that influence students to purchase the same courses or to purchase different courses. By understanding those factors, the company can create a student profile, classify each student by profile, and recommend a list of courses.
When inspecting data from different student groups, you've come across three dispositions of points, as in 1, 2 and 3 below:
Notice that in plot 1, there are purple points organized in a half circle, with a mass of pink points inside that circle, a little concentration of orange points outside of that semi-circle, and five gray points that are far from all others.
In plot 2, there is a round mass of purple points, another of orange points, and also four gray points that are far from all the others.
And in plot 3, we can see four concentrations of points, purple, blue, orange, pink, and three more distant gray points.
Now, if you were to choose a model that could understand new student data and determine similar groups, is there a clustering algorithm that could give interesting results to that kind of data?
When describing the plots, we mentioned terms like mass of points and concentration of points, indicating that there are areas in all graphs with greater density. We also referred to circular and semi-circular shapes, which are difficult to identify by drawing a straight line or merely examining the closest points. Additionally, there are some distant points that likely deviate from the main data distribution, introducing more challenges or noise when determining the groups.
A density-based algorithm that can filter out noise, such as DBSCAN (Density-Based Spatial Clustering of Applications with Noise), is a strong choice for situations with denser areas, rounded shapes, and noise.
About DBSCAN
DBSCAN is one of the most cited algorithms in research; it was first published in 1996 in the original DBSCAN paper. In the paper, researchers demonstrate how the algorithm can identify non-linear spatial clusters and handle data with higher dimensions efficiently.
The main idea behind DBSCAN is that there is a minimum number of points that will be within a determined distance or radius from the most "central" cluster point, called the core point. The points within that radius are the neighborhood points, and the points on the edge of that neighborhood are the border points or boundary points. The radius or neighborhood distance is called the epsilon neighborhood, ε-neighborhood, or just ε (the symbol for the Greek letter epsilon).
Additionally, points that are neither core points nor border points — because they exceed the radius for belonging to a determined cluster and also don't have the minimum number of points around them to be core points — are considered noise points.
This means we have three different types of points, namely, core, border and noise. Furthermore, it is important to note that the main idea is fundamentally based on a radius or distance, which makes DBSCAN - like most clustering models - dependent on that distance metric. This metric could be Euclidean, Manhattan, Mahalanobis, and many more. Therefore, it is crucial to choose an appropriate distance metric that considers the context of the data. For instance, if you are using driving distance data from a GPS, it might be interesting to use a metric that takes the street layouts into consideration, such as Manhattan distance.
Note: Since DBSCAN maps the points that constitute noise, it can also be used as an outlier detection algorithm. For instance, if you are trying to determine which bank transactions may be fraudulent and the rate of fraudulent transactions is small, DBSCAN might be a solution to identify those points.
To find the core point, DBSCAN will first select a point at random, map all the points within its ε-neighborhood, and compare the number of neighbors of the selected point to the minimum number of points. If the selected point has an equal number or more neighbors than the minimum number of points, it will be marked as a core point. This core point and its neighborhood points will constitute the first cluster.
The algorithm will then examine each point of the first cluster and see if it has an equal number or more neighbor points than the minimum number of points within ε. If it does, those neighbor points will also be added to the first cluster. This process will continue until the points of the first cluster have fewer neighbors than the minimum number of points within ε. When that happens, the algorithm stops adding points to that cluster, identifies another core point outside of that cluster, and creates a new cluster for that new core point.
DBSCAN will then repeat the first cluster process of finding all points connected to a new core point of the second cluster until there are no more points to be added to that cluster. It will then encounter another core point and create a third cluster, or it will iterate through all the points that it hasn't previously looked at. If these points are at ε distance from a cluster, they are added to that cluster, becoming border points. If they aren't, they are considered noise points.
Advice: There are many rules and mathematical demonstrations involved in the idea behind DBSCAN. If you want to dig deeper, you may want to take a look at the original paper, which is linked above.
It is interesting to know how the DBSCAN algorithm works, although, fortunately, there is no need to code the algorithm yourself, since Python's Scikit-Learn library already has an implementation.
Let's see how it works in practice!
Importing Data for Clustering
To see how DBSCAN works in practice, we will change projects a bit and use a small customer dataset that has the genre, age, annual income, and spending score of 200 customers.
The spending score represents how often a person spends money in a mall on a scale from 1 to 100. In other words, a customer with the lowest score rarely spends money, while a customer with a score of 100 is the highest spender.
Note: You can download the dataset here.
After downloading the dataset, you will see that it is a CSV (comma-separated values) file called shopping-data.csv. We'll load it into a DataFrame using Pandas and store it in the customer_data variable:
import pandas as pd
# Substitute the path_to_file content by the path to your csv file
path_to_file = '../../datasets/dbscan/dbscan-with-python-and-scikit-learn-shopping-data.csv'
customer_data = pd.read_csv(path_to_file)
To take a look at the first five rows of our data, you can execute customer_data.head():
This results in:
CustomerID Genre Age Annual Income (k$) Spending Score (1-100)
0 1 Male 19 15 39
1 2 Male 21 15 81
2 3 Female 20 16 6
3 4 Female 23 16 77
4 5 Female 31 17 40
By examining the data, we can see customer ID numbers, genre, age, incomes in k$, and spending scores. Keep in mind that some or all of these variables will be used in the model. For example, if we were to use Age and Spending Score (1-100) as variables for DBSCAN, which uses a distance metric, it's important to bring them to a common scale to avoid introducing distortions, since Age is measured in years and Spending Score (1-100) has a limited range from 1 to 100. This means that we will perform some kind of data scaling.
We can also check whether the data needs any more preprocessing aside from scaling by seeing if the data types are consistent and verifying whether there are any missing values that need to be treated, by executing Pandas' info() method:
customer_data.info()
This displays:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 200 entries, 0 to 199
Data columns (total 5 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 CustomerID 200 non-null int64
1 Genre 200 non-null object
2 Age 200 non-null int64
3 Annual Income (k$) 200 non-null int64
4 Spending Score (1-100) 200 non-null int64
dtypes: int64(4), object(1)
memory usage: 7.9+ KB
We can observe that there are no missing values, because there are 200 non-null entries for each customer feature. We can also see that only the genre column has text content, as it is a categorical variable, which is displayed as object, and all other features are numeric, of the type int64. Thus, in terms of data type consistency and absence of null values, our data is ready for further analysis.
We can proceed to visualize the data and determine which features would be interesting to use in DBSCAN. After selecting those features, we can scale them.
This customer dataset is the same as the one used in our definitive guide to hierarchical clustering. To learn more about this data, how to explore it, and about distance metrics, you can take a look at Definitive Guide to Hierarchical Clustering with Python and Scikit-Learn!
Visualizing Data
By using Seaborn's pairplot(), we can plot a scatter graph for each combination of features. Since CustomerID is just an identification and not a feature, we will remove it with drop() prior to plotting:
import seaborn as sns
# dropping CustomerID column from data
customer_data = customer_data.drop('CustomerID', axis=1)
sns.pairplot(customer_data);
This outputs:
When looking at the combinations of features produced by pairplot, the graph of Annual Income (k$) with Spending Score (1-100) seems to display around 5 groups of points. This seems to be the most promising combination of features. We can create a list with their names, select them from the customer_data DataFrame, and store the selection in the customer_data variable again for use in our future model.
selected_cols = ['Annual Income (k$)', 'Spending Score (1-100)']
customer_data = customer_data[selected_cols]
After selecting the columns, we can perform the scaling discussed in the previous section. To bring the features to the same scale, or standardize them, we can import Scikit-Learn's StandardScaler, create it, fit our data to calculate its mean and standard deviation, and transform the data by subtracting its mean and dividing by the standard deviation. This can be done in one step with the fit_transform() method:
from sklearn.preprocessing import StandardScaler
ss = StandardScaler() # creating the scaler
scaled_data = ss.fit_transform(customer_data)
The variables are now scaled, and we can examine them by simply printing the content of the scaled_data variable. Alternatively, we can add them to a new scaled_customer_data DataFrame along with column names and use the head() method again:
scaled_customer_data = pd.DataFrame(columns=selected_cols, data=scaled_data)
scaled_customer_data.head()
This outputs:
Annual Income (k$) Spending Score (1-100)
0 -1.738999 -0.434801
1 -1.738999 1.195704
2 -1.700830 -1.715913
3 -1.700830 1.040418
4 -1.662660 -0.395980
This data is ready for clustering! When introducing DBSCAN, we mentioned the minimum number of points and the epsilon. These two values need to be selected prior to creating the model. Let's see how it's done.
Choosing Min. Samples and Epsilon
To choose the minimum number of points for DBSCAN clustering, there is a rule of thumb which states that it has to be equal to or higher than the number of dimensions in the data plus one, as in:
$$
\text{min. points} \geq \text{data dimensions} + 1
$$
The dimensions correspond to the number of columns in the dataframe. Since we are using 2 columns, the min. points should be 2 + 1 = 3 or higher. For this example, let's use 5 min. points.
$$
\text{5 (min. points)} \geq \text{2 (data dimensions)} + 1
$$
Now, to choose the value for ε, there is a method in which a Nearest Neighbors algorithm is employed to find the distances of a predefined number of nearest points for each point. This predefined number of neighbors is the min. points we have just chosen, minus 1. So, in our case, the algorithm will find the 5-1, or 4, nearest points for each point of our data. Those are the k-neighbors, and our k equals 4.
$$
\text{k-neighbors} = \text{min. points} - 1
$$
Advice: to learn more about Nearest Neighbors, read our K Nearest Neighbors algorithm in Python and Scikit-learn guide.
After finding the neighbors, we will sort their distances and plot them with the distances on the y-axis and the points on the x-axis. Looking at the plot, we will find where it resembles the bend of an elbow; the y-axis value at that elbow bend is the suggested ε value.
Note: It is possible that the graph for finding the ε value has one or more "elbow bends", either big or small. When that happens, you can find the candidate values, test them, and choose the one whose results best describe the clusters, either by looking at metrics or at plots.
To perform these steps, we can import the algorithm, fit it to the data, and then extract the distances and indices of each point with the kneighbors() method:
from sklearn.neighbors import NearestNeighbors
import numpy as np
nn = NearestNeighbors(n_neighbors=4) # minimum points -1
nbrs = nn.fit(scaled_customer_data)
distances, indices = nbrs.kneighbors(scaled_customer_data)
Note: Just like with DBSCAN, it is essential to choose a distance metric that suits your data when using the Nearest Neighbors algorithm, as it is also distance-based.
For a list of some metrics and explanations on when to use them, you can take a look at Definitive Guide to Hierarchical Clustering with Python and Scikit-Learn.
After finding the distances, we can sort them in ascending order. Since the distances array's first column holds the distance of each point to itself (meaning all are 0), and the second column contains the smallest real distances, followed by the third column, which has larger distances than the second, and so on, we can pick only the values of the second column and store them in the distances variable:
distances = np.sort(distances, axis=0)
distances = distances[:,1] # Keeping only the nearest-neighbor distances (column 1)
Now that we have our sorted nearest-neighbor distances, we can import matplotlib, plot the distances, and draw a red line where the "elbow bend" is:
import matplotlib.pyplot as plt
plt.figure(figsize=(6,3))
plt.plot(distances)
plt.axhline(y=0.24, color='r', linestyle='--', alpha=0.4) # elbow line
plt.title('Kneighbors distance graph')
plt.xlabel('Data points')
plt.ylabel('Epsilon value')
plt.show();
This is the result:
Notice that when drawing the line, we can read off the ε value; in this case, it is 0.24.
We finally have our minimum points and ε. With both variables, we can create and run the DBSCAN model.
Creating a DBSCAN Model
To create the model, we can import it from Scikit-Learn, create it with ε, which corresponds to the eps argument, and with the minimum points, which corresponds to the min_samples argument. We can then store it in a variable, let's call it dbs, and fit it to the scaled data:
from sklearn.cluster import DBSCAN
# min_samples == minimum points ≥ dataset_dimensions + 1
dbs = DBSCAN(eps=0.24, min_samples=5)
dbs.fit(scaled_customer_data)
Just like that, our DBSCAN model has been created and trained on the data! To extract the results, we access the labels_ property. We can also create a new labels column in the scaled_customer_data dataframe and fill it with the predicted labels:
labels = dbs.labels_
scaled_customer_data['labels'] = labels
scaled_customer_data.head()
This is the final result:
Annual Income (k$) Spending Score (1-100) labels
0 -1.738999 -0.434801 -1
1 -1.738999 1.195704 0
2 -1.700830 -1.715913 -1
3 -1.700830 1.040418 0
4 -1.662660 -0.395980 -1
Observe that we have labels with -1 values; these are the noise points, the ones that don't belong to any cluster. To know how many noise points the algorithm found, we can count how many times the value -1 appears in our labels list:
labels_list = list(scaled_customer_data['labels'])
n_noise = labels_list.count(-1)
print("Number of noise points:", n_noise)
This outputs:
Number of noise points: 62
We already know that 62 points of our original data of 200 points were considered noise. This is a lot of noise, which indicates that perhaps the DBSCAN clustering didn't consider many points as part of a cluster. We will understand what happened soon, when we plot the data.
Initially, when we observed the data, it seemed to have 5 clusters of points. To know how many clusters DBSCAN has formed, we can count the number of labels that are not -1. There are many ways to write that code; here, we have written a for loop, which will also work for data in which DBSCAN has found many clusters:
total_labels = np.unique(labels)
n_labels = 0
for n in total_labels:
if n != -1:
n_labels += 1
print("Number of clusters:", n_labels)
This outputs:
Number of clusters: 6
We can see that the algorithm predicted the data to have 6 clusters, with many noise points. Let's visualize that by plotting it with seaborn's scatterplot:
sns.scatterplot(data=scaled_customer_data,
x='Annual Income (k$)', y='Spending Score (1-100)',
hue='labels', palette='muted').set_title('DBSCAN found clusters');
This results in:
Looking at the plot, we can see that DBSCAN has captured the points which were more densely connected, and points that could be considered part of the same cluster were either noise or considered to form another smaller cluster.
If we highlight the clusters, notice how DBSCAN gets cluster 1 completely, which is the cluster with the least space between points. Then it gets the parts of clusters 0 and 3 where the points are closely together, considering more spaced points as noise. It also considers the points in the lower left half as noise and splits the points in the lower right into 3 clusters, once again capturing clusters 4, 2, and 5 where the points are closer together.
We can start to conclude that DBSCAN was great at capturing the dense areas of the clusters, but not so much at identifying the bigger scheme of the data: the delimitations of the 5 clusters. It would be interesting to test more clustering algorithms on our data. Let's see if a metric corroborates this hypothesis.
Evaluating the Algorithm
To evaluate DBSCAN, we will use the silhouette score, which takes into consideration the distance between points of the same cluster and the distances between clusters.
Note: Currently, most clustering metrics aren't really fitted to be used to evaluate DBSCAN because they aren't based on density. Here, we are using the silhouette score because it is already implemented in Scikit-learn and because it tries to look at cluster shape.
To have a more fitted evaluation, you can use or combine it with the Density-Based Clustering Validation (DBCV) metric, which was designed specifically for density-based clustering. There is an implementation for DBCV available on this GitHub.
First, we can import silhouette_score from Scikit-Learn, then pass it our columns and labels:
from sklearn.metrics import silhouette_score
s_score = silhouette_score(scaled_customer_data, labels)
print(f"Silhouette coefficient: {s_score:.3f}")
This outputs:
Silhouette coefficient: 0.506
The silhouette coefficient ranges from -1 to 1, so a score of 0.506 indicates only moderately dense and well-separated clusters, which is consistent with what we saw in the plot.
Conclusion
DBSCAN Advantages and Disadvantages
DBSCAN is a rather unique clustering algorithm.
If we look at its advantages, it is very good at picking up dense areas in data and points that are far from others. This means that the data doesn't have to have a specific shape and can be surrounded by other points, as long as they are also densely connected.
It requires us to specify the minimum points and ε, but there is no need to specify the number of clusters beforehand, as in K-Means, for instance. It can also be used with very large databases, since it was designed with large spatial databases in mind.
As for its disadvantages, we have seen that it couldn't capture different densities in the same cluster, so it has a hard time with large differences in densities. It is also dependent on the distance metric and scaling of the points. This means that if the data isn't well understood, with differences in scale and with a distance metric that doesn't make sense, it will probably fail to understand it.
DBSCAN Extensions
There are other algorithms, such as Hierarchical DBSCAN (HDBSCAN) and Ordering points to identify the clustering structure (OPTICS), which are considered extensions of DBSCAN.
Both HDBSCAN and OPTICS can usually perform better when there are clusters of varying densities in the data, and they are also less sensitive to the choice of the initial min. points and ε parameters.
Mike Driscoll
Python’s Built-in Functions – The all() Function (Video)
This is the next video in my Python Built-ins Series.
Did you know Python has an all() function? Do you know what to use the all() function for?
Find out today by watching this short video!
More Videos in the series
The post Python’s Built-in Functions – The all() Function (Video) appeared first on Mouse Vs Python.
Python for Beginners
Pandas DataFrame to List in Python
Python lists and dataframes are two of the most used data structures in Python. While we use Python lists to handle sequential data, dataframes are used to handle tabular data. In this article, we will discuss different ways to convert a pandas dataframe to a list in Python.
Convert Pandas DataFrame to a List of Rows
Each row in a pandas dataframe is stored as a series object with column names of the dataframe as the index and the values of the rows as associated values.
To convert a dataframe to a list of rows, we can use the iterrows() method and a for loop. The iterrows() method, when invoked on a dataframe, returns an iterator. The iterator yields each row as a tuple with the row index as its first element and a series containing the row data as its second element. We can iterate through the iterator to access all the rows.
To create a list of rows from the dataframe using the iterator, we will use the following steps.
- First, we will create an empty list to store the rows. Let us name it rowList.
- Next, we will iterate through the rows of the dataframe using the iterrows() method and a for loop.
- While iterating over the rows, we will add them to rowList. For this, we will use the append() method. The append() method, when invoked on the list, takes the current row as its input argument and adds the row to the list.
After execution of the for loop, we will get the output list of rows. You can observe this in the following example.
import pandas as pd
myDicts=[{"Roll":1,"Maths":100, "Physics":80, "Chemistry": 90},
{"Roll":2,"Maths":80, "Physics":100, "Chemistry": 90},
{"Roll":3,"Maths":90, "Physics":80, "Chemistry": 70},
{"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90}]
df=pd.DataFrame(myDicts)
print("The original dataframe is:")
print(df)
rowList=list()
for index,row in df.iterrows():
rowList.append(row)
print("The list of rows is:")
print(rowList)
Output:
The original dataframe is:
Roll Maths Physics Chemistry
0 1 100 80 90
1 2 80 100 90
2 3 90 80 70
3 4 100 100 90
The list of rows is:
[Roll 1
Maths 100
Physics 80
Chemistry 90
Name: 0, dtype: int64, Roll 2
Maths 80
Physics 100
Chemistry 90
Name: 1, dtype: int64, Roll 3
Maths 90
Physics 80
Chemistry 70
Name: 2, dtype: int64, Roll 4
Maths 100
Physics 100
Chemistry 90
Name: 3, dtype: int64]
In this example, you can observe that we have created a list of rows from the dataframe. Note that the elements of the list are series objects, not arrays, representing the rows.
Pandas DataFrame to List of Arrays in Python
Instead of creating a list of rows, we can create a list of arrays containing the values in the rows of the dataframe. For this, we will extract the values of the dataframe using the values attribute. The values attribute of the dataframe contains a 2-D array holding the row values of the dataframe.
Once we get the values from the dataframe, we will convert the array to a list of arrays using the list() function. The list() function takes the values of the array as its input and returns the list of arrays, as shown below.
import pandas as pd
myDicts=[{"Roll":1,"Maths":100, "Physics":80, "Chemistry": 90},
{"Roll":2,"Maths":80, "Physics":100, "Chemistry": 90},
{"Roll":3,"Maths":90, "Physics":80, "Chemistry": 70},
{"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90}]
df=pd.DataFrame(myDicts)
print("The original dataframe is:")
print(df)
rowList=list(df.values)
print("The list of rows is:")
print(rowList)
Output:
The original dataframe is:
Roll Maths Physics Chemistry
0 1 100 80 90
1 2 80 100 90
2 3 90 80 70
3 4 100 100 90
The list of rows is:
[array([ 1, 100, 80, 90]), array([ 2, 80, 100, 90]), array([ 3, 90, 80, 70]), array([ 4, 100, 100, 90])]
In this example, you can observe that we have created a list of arrays from the dataframe.
Convert Pandas DataFrame to a List of Lists
Instead of creating a list of arrays, we can also convert pandas dataframe into a list of lists. For this, we can use two approaches.
Pandas DataFrame to a List of Lists Using iterrows() Method
To convert a dataframe into a list of lists, we will use the following approach.
- First, we will create an empty list to store the output list.
- Next, we will iterate through the rows of the dataframe using the iterrows() method and a for loop. During iteration, we will convert each row into a list before adding it to the output list.
- To convert a row into a list, we will use the tolist() method. The tolist() method, when invoked on a row, returns the list of values in the row. We will add this list to the output list using the append() method.
After execution of the for loop, the pandas dataframe is converted to a list of lists. You can observe this in the following example.
import pandas as pd
myDicts=[{"Roll":1,"Maths":100, "Physics":80, "Chemistry": 90},
{"Roll":2,"Maths":80, "Physics":100, "Chemistry": 90},
{"Roll":3,"Maths":90, "Physics":80, "Chemistry": 70},
{"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90}]
df=pd.DataFrame(myDicts)
print("The original dataframe is:")
print(df)
rowList=list()
for index,row in df.iterrows():
rowList.append(row.tolist())
print("The list of rows is:")
print(rowList)
Output:
The original dataframe is:
Roll Maths Physics Chemistry
0 1 100 80 90
1 2 80 100 90
2 3 90 80 70
3 4 100 100 90
The list of rows is:
[[1, 100, 80, 90], [2, 80, 100, 90], [3, 90, 80, 70], [4, 100, 100, 90]]
Using tolist() Method And The Values Attribute
Instead of using the iterrows() method and the for loop, we can directly convert the pandas dataframe to a list of lists using the values attribute. For this, we will first obtain the values in the dataframe using the values attribute. Next, we will invoke the tolist() method on the values. This will give us the list of lists created from the dataframe. You can observe this in the following example.
import pandas as pd
myDicts=[{"Roll":1,"Maths":100, "Physics":80, "Chemistry": 90},
{"Roll":2,"Maths":80, "Physics":100, "Chemistry": 90},
{"Roll":3,"Maths":90, "Physics":80, "Chemistry": 70},
{"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90}]
df=pd.DataFrame(myDicts)
print("The original dataframe is:")
print(df)
rowList=df.values.tolist()
print("The list of rows is:")
print(rowList)
Output:
The original dataframe is:
Roll Maths Physics Chemistry
0 1 100 80 90
1 2 80 100 90
2 3 90 80 70
3 4 100 100 90
The list of rows is:
[[1, 100, 80, 90], [2, 80, 100, 90], [3, 90, 80, 70], [4, 100, 100, 90]]
Get a List of Column Names From Dataframe
To get a list of column names from a dataframe, you can use the columns attribute. The columns attribute of a dataframe contains an Index object with all the column names. You can observe this in the following example.
import pandas as pd
myDicts=[{"Roll":1,"Maths":100, "Physics":80, "Chemistry": 90},
{"Roll":2,"Maths":80, "Physics":100, "Chemistry": 90},
{"Roll":3,"Maths":90, "Physics":80, "Chemistry": 70},
{"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90}]
df=pd.DataFrame(myDicts)
print("The original dataframe is:")
print(df)
nameList=df.columns
print("The list of column names is:")
print(nameList)
Output:
The original dataframe is:
Roll Maths Physics Chemistry
0 1 100 80 90
1 2 80 100 90
2 3 90 80 70
3 4 100 100 90
The list of column names is:
Index(['Roll', 'Maths', 'Physics', 'Chemistry'], dtype='object')
Alternatively, you can pass the entire dataframe to the list() function. When we pass a dataframe to the list() function, it returns a list containing the column names of the dataframe. You can observe this in the following example.
import pandas as pd
myDicts=[{"Roll":1,"Maths":100, "Physics":80, "Chemistry": 90},
{"Roll":2,"Maths":80, "Physics":100, "Chemistry": 90},
{"Roll":3,"Maths":90, "Physics":80, "Chemistry": 70},
{"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90}]
df=pd.DataFrame(myDicts)
print("The original dataframe is:")
print(df)
nameList=list(df)
print("The list of column names is:")
print(nameList)
Output:
The original dataframe is:
Roll Maths Physics Chemistry
0 1 100 80 90
1 2 80 100 90
2 3 90 80 70
3 4 100 100 90
The list of column names is:
['Roll', 'Maths', 'Physics', 'Chemistry']
Convert Dataframe Column to a List in Python
To convert a dataframe column to a list, you can use the tolist() method, as shown in the following example.
import pandas as pd
myDicts=[{"Roll":1,"Maths":100, "Physics":80, "Chemistry": 90},
{"Roll":2,"Maths":80, "Physics":100, "Chemistry": 90},
{"Roll":3,"Maths":90, "Physics":80, "Chemistry": 70},
{"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90}]
df=pd.DataFrame(myDicts)
print("The original dataframe is:")
print(df)
rollList=df["Roll"].tolist()
print("The list of Roll column is:")
print(rollList)
Output:
The original dataframe is:
Roll Maths Physics Chemistry
0 1 100 80 90
1 2 80 100 90
2 3 90 80 70
3 4 100 100 90
The list of Roll column is:
[1, 2, 3, 4]
In this example, you can observe that we have used the tolist() method to convert a dataframe column to a list.
Conclusion
In this article, we discussed different ways to convert a pandas dataframe to a list in Python. We also discussed how to convert the dataframe to a list of rows as well as a list of lists. To learn more about Python programming, you can read this article on the Dataframe Constructor Not Properly Called error in Pandas. You might also like this article on how to split a string into characters in Python.
I hope you enjoyed reading this article. Stay tuned for more informative articles.
Happy Learning!
The post Pandas DataFrame to List in Python appeared first on PythonForBeginners.com.
Real Python
The Real Python Podcast – Episode #149: Coding With namedtuple & Python's Dynamic Superpowers
Have you explored Python's collections module? Within it, you'll find a powerful factory function called namedtuple(), which provides multiple enhancements over the standard tuple for writing clearer and cleaner code. This week on the show, Christopher Trudeau is here, bringing another batch of PyCoder's Weekly articles and projects.
eGenix.com
PyDDF Python Spring Sprint 2023
This is an announcement for a Python sprint in Düsseldorf, Germany (originally published in German).
Announcement
Python Meeting Spring Sprint 2023 in
Düsseldorf
Saturday, 25.03.2023, 10:00-18:00
Sunday, 26.03.2023, 10:00-18:00
Atos Information Technology GmbH, Am Seestern 1, 40547 Düsseldorf
Information
The Python Meeting Düsseldorf (PyDDF) is organizing a Python sprint weekend with the kind support of Atos Information Technology GmbH. The sprint takes place on the weekend of 25-26.03.2023 at the Atos office, Am Seestern 1, in Düsseldorf. The following topic areas have already been suggested as inspiration:
- Workflows with Ray
- Time tracker that automatically accounts for company break-time rules
- Monitoring for balcony power plants: Prometheus exporter and MQTT
- Distance measurement of Qrovo nodes via MicroPython
- Project manage_django_project (helper to develop Django projects)
Registration, costs, and further information
Everything else, including registration, can be found on the Meetup sprint page:
IMPORTANT: Without registration, we cannot prepare building access, so a spontaneous registration on the day of the sprint will most likely not work. Please be sure to register with your full name via Meetup by Friday, 24.03. at the latest.
Participants should also register in the PyDDF Telegram group, since that is where we coordinate:
About the Python Meeting Düsseldorf
The Python Meeting Düsseldorf is a regular event in Düsseldorf aimed at Python enthusiasts from the region.
Our PyDDF YouTube channel, where we publish videos of the talks after the meetings, offers a good overview of past presentations. The meeting is organized by eGenix.com GmbH, Langenfeld, in cooperation with Clark Consulting & Research, Düsseldorf.
Marc-André Lemburg, eGenix.com
Python Engineering at Microsoft
Introducing the Data Wrangler extension for Visual Studio Code Insiders
We’re excited to announce the launch of Data Wrangler, a revolutionary tool for data scientists and analysts who work with tabular data in Python. Data Wrangler is an extension for VS Code Insiders and the first step towards our vision of simplifying and expediting the data preparation process on Microsoft platforms.
Data preparation, cleaning, and visualization is a time-consuming task for many data scientists, but with Data Wrangler we’ve developed a solution that simplifies this process. Our goal is to make this process more accessible and efficient for everyone, to free up your time to focus on other parts of the data science workflow. To try Data Wrangler today, go to the Extension Marketplace tab in VS Code Insiders and search for “Data Wrangler”. To learn more about Data Wrangler, check out the documentation here: https://aka.ms/datawrangler.
With Data Wrangler, you can seamlessly clean and explore your data in VS Code Insiders. It offers a variety of features that will help you quickly identify and fix errors, inconsistencies, and missing data. You can perform data profiling and data quality checks, visualize data distributions, and easily transform data into the format you need. Plus, Data Wrangler comes with a library of built-in transformations and visualizations, so you can focus on your data, not the code. As you make changes, the tool generates code using open-source Python libraries for the data transformation operations you perform. This means you can write better data preparation programs faster and with fewer errors. The code also keeps Data Wrangler transparent and helps you verify the correctness of the operation as you go.
In a recent study, Python data scientists using the Pandas dataframe library reported spending the majority (~51%) of their time preparing, cleaning and visualizing data for their models (Anaconda State of Data Science Report 2022). This activity is critical to the success of their projects, as poor data quality directly impacts the quality of the predictions made by their models. Furthermore, this activity is not predictable: the industry even calls it exploratory data analysis to capture the fact that it is often highly creative, requiring experimentation, visualization, comparison and iteration. However, despite the activity being creative and iterative, the individual operations are not – they involve writing small code snippets that drop columns, remove missing values, etc. But today there isn't tooling support that makes this easier; in our research with data scientists, we regularly see them searching for and copy-pasting snippets of code from Stack Overflow into their programs.
Data Wrangler Interface
With Data Wrangler, we’ve developed an interactive UI that writes the code for you. As you inspect and visualize your Pandas dataframes using Data Wrangler, generating the code for your desired operations is easy. For instance, if you want to remove a column, you can right-click on the column heading and delete it, and Data Wrangler will generate the Python code to do that. If you want to remove rows containing missing values or substitute them with a computed default value, you can do that directly from the UI. If you want to reformat a categorical column by one-hot encoding it to make it suitable for machine learning algorithms, you can do so with a single command.
Create column from examples
Data scientists often need to create a new derived column from existing columns in their Pandas dataframe, which usually involves writing custom code that can easily become a source of bugs. With Data Wrangler, all you need to do is provide examples of how you want the data in the derived column to look, and PROSE, our AI-powered program synthesis technology (the same technology that powers Microsoft Excel's Flash Fill feature), will write the Python code for you. If you find an error in the results, you can correct it with a new example, and PROSE will rewrite the Python code to produce a better result. You can even modify the generated code yourself.
How to try Data Wrangler
To start using Data Wrangler today in Visual Studio Code Insiders, just download the Data Wrangler extension from the marketplace and visit our getting started page to try it out! You can then launch Data Wrangler from any Pandas dataframe output in a Jupyter Notebook, or by right-clicking any CSV or Parquet file in VS Code and selecting “Open in Data Wrangler”.
This is the first release of Data Wrangler so we are looking for feedback as we iterate on the product. Please provide any product feedback here. If you run into any issues, please file a bug report in our Github repo here. Our plan is to move the extension from VS Code Insiders to VS Code in the near future.
March 16, 2023
PyCharm
PyCharm 2023.1 Release Candidate Is Out!
PyCharm 2023.1 is just around the corner! Check out the fixes and improvements we added to the PyCharm 2023.1 Release Candidate.
To see what has already been added in PyCharm 2023.1 during the early access program, take a look at our EAP blog posts.
The Toolbox App is the easiest way to get the EAP builds and keep both your stable and EAP versions up to date. You can also manually download the EAP builds from our website.

Faster variable value previews for large collections
We optimized the performance of the Special Variable window available in the Python Console and Debug Console. A preview of the calculated variable values is now displayed faster, even for large collections such as arrays, deques, dictionaries, lists, sets, frozensets, and tuples.

To see the full list of variable values, click View as … or Load next to the variable preview.
Instant access to the Python Console and Python Packages tool window
As part of our work to refine the new UI, we’ve put the Python Console icon on the main editor screen so that you can instantly navigate to the console when needed. Next to the Python Console icon, you can now find the icon for the Python Packages tool window, so you can quickly manage project packages.

The Release Candidate also delivers the following fixes:
- Creating a remote Docker interpreter with an ENTRYPOINT defined in a Docker image or Dockerfile no longer leads to an error. [PY-55444]
- Breakpoints are hit as expected when debugging a FastAPI project. [PY-57217]
- Attaching a debugger to the process now works on ARM (macOS and Ubuntu). [PY-44191]
- Virtualenv is now successfully activated on PowerShell in the Terminal. [PY-53890]
- No warnings are shown for undocumented methods of subclasses when the corresponding superclass methods have docstrings. The same applies to an undocumented class that has a documented superclass. [PY-30967]
These are the most important updates for PyCharm 2023.1 Release Candidate. For the full list of improvements, check out the release notes. Share your feedback on the new features in the comments below, on Twitter, or in our issue tracker.
Stack Abuse
Python TypeError: < not supported between instances of str and int
Introduction
In this article, we'll be taking a look at a common Python 3 error: TypeError: '<' not supported between instances of 'str' and 'int'. This error occurs when an attempt is made to compare a string and an integer using the less than (<) operator. We will discuss the reasons behind this error and provide solutions for fixing it. Additionally, we will cover how to resolve this error when dealing with lists, floats, and tuples.
Why You Get This Error
In most languages, you probably know that the less than (<) operator is used for comparison between two values. However, it is important to note that the two values being compared must be of the same type or be implicitly convertible to a common type. For example, two implicitly convertible types would be an integer and a float, since they're both numbers. But in this specific case, we're trying to compare a string and an integer, which are not implicitly convertible.
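A quick demonstration of the difference:
# int vs. float: both are numbers, so Python compares them directly
print(1 < 2.5)  # True

# str vs. int: there is no common type, so Python raises TypeError
try:
    print("1" < 2)
except TypeError as e:
    print(e)  # '<' not supported between instances of 'str' and 'int'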
When you try to compare a string and an integer using one of the comparison operators, Python raises a TypeError because it cannot convert one of the values to a common type.
For instance, consider the following code:
num = 42
text = "hello"

if num < text:
    print("The number is smaller than the text.")  # confused.jpg
In this example, the comparison between num (an integer) and text (a string) using the less than operator will raise the error:
TypeError: '<' not supported between instances of 'str' and 'int'
How to Fix this Error
To fix this error, you need to ensure that both values being compared are of the same type or can be implicitly converted to a common type. In most cases, this means converting the integer to a string or vice versa.
Using a similar example as above, here's how you can resolve the error:
- Convert the integer to a string:
num = 42
text = "46"

if str(num) < text:
    print("The number is smaller than the text.")
Keep in mind that strings are compared lexicographically, not numerically, so for example "9" < "10" is False; only use this approach when string ordering is actually what you want.
- Convert the string to an integer:
num = 42
text = "46"

# Assuming 'text' represents a numeric value
numeric_text = int(text)

if num < numeric_text:
    print("The number is smaller than the text.")
This example works because the string does represent an integer. If, however, we were to try this fix on the example at the beginning of this article, it wouldn't work: Python would raise a ValueError because the given text is not convertible to an integer. So while this fix works in some use-cases, it is not universal.
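If you can't be sure the string is numeric, one defensive pattern (a sketch, not the only option) is to attempt the conversion and handle the failure explicitly:
num = 42
text = "hello"

try:
    numeric_text = int(text)
except ValueError:
    print(f"{text!r} does not represent an integer, skipping the comparison.")
else:
    if num < numeric_text:
        print("The number is smaller than the text.")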
Fixing this Error with Lists
When working with lists, the error may occur when trying to compare an integer (or other primitive types) to a list.
my_list = ["1", "2", "3"]
if my_list > 3:
print("Greater than 3!")
TypeError: '>' not supported between instances of 'list' and 'int'
In this case, the fix really depends on your use-case. When you run into this error, the common problem is that you meant to compare the variable to a single element in the list, not the entire list itself. Therefore, in this case you will want to access a single element in the list to make the comparison. Again, you'll need to make sure the elements are of the same type.
my_list = ["1", "2", "3"]
if int(my_list[1]) > 3:
print("Greater than 3!")
Greater than 3!
Fixing this Error with Floats
When comparing floats and strings, the same error will arise as it's very similar to our first example in this article. To fix this error, you need to convert the float or the string to a common type, just like with integers:
num = 3.14
text = "3.13"

if float(text) < num:  # 3.13 < 3.14, so the message prints
    print("The text as a float is smaller than the number.")
Fixing this Error with Tuples
When working with tuples, just like lists, if you try to compare the entire tuple to a primitive value, you'll run into the TypeError. And just like before, you'll likely want to do the comparison on just a single value of the tuple.
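For example, comparing a single converted element of the tuple:
str_tuple = ("1.2", "3.2", "4.4")
my_float = 6.8

if float(str_tuple[0]) < my_float:
    print("The first element is lower!")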
Another possibility is that you'll want to compare all of the elements of the tuple to a single value, which we've shown below using the built-in all() function:
str_tuple = ("1.2", "3.2", "4.4")
my_float = 6.8
if all(tuple(float(el) < my_float for el in str_tuple)):
print("All are lower!")
In this example, we iterate through all elements in the tuple, convert each one to a float, and compare it to my_float. The built-in all() function then checks that every comparison evaluated to True. This way we're able to make an element-by-element comparison.
Conclusion
In this article, we discussed the TypeError: '<' not supported between instances of 'str' and 'int' error in Python, which can occur when you try to compare a string and an integer using comparison operators. We provided solutions for fixing this error by converting the values to a common type and also covered how to resolve this error when working with lists, floats, and tuples.
By understanding the root cause of this error and applying the appropriate type conversion techniques, you can prevent this error from occurring and ensure your comparisons work as intended.
eGenix.com
Python Meeting Düsseldorf - 2023-03-22
The following announcement is for our regional user group meeting in Düsseldorf, Germany; it was originally published in German.
Announcement
The next Python Meeting Düsseldorf will take place on:
March 22, 2023, 6:00 PM
Room 1, 2nd floor, Bürgerhaus Stadtteilzentrum Bilk
Düsseldorfer Arcaden, Bachstr. 145, 40217 Düsseldorf
Program
Talks registered so far:
Charlie Clark: "A New XML Library: Pugixml"
Marc-Andre Lemburg: "Data Analysis with OpenSearch – Waiting times at the DUS airport"
Arkadius Schuchhardt: "Concurrent data loading by using the thread producer-consumer pattern"
Many Kasiriha: "Introduction to the Robot Framework"
Additional talks are welcome. If you are interested, please get in touch at info@pyddf.de.
Start Time and Location
We meet at 6:00 PM at the Bürgerhaus in the Düsseldorfer Arcaden.
The Bürgerhaus shares its entrance with the swimming pool and is located next to the entrance of the Düsseldorfer Arcaden's underground car park. A large "Schwimm’ in Bilk" logo hangs above the entrance. Behind the door, turn directly left to the two elevators and ride up to the 2nd floor. The entrance to Room 1 is immediately to the left as you step out of the elevator.
>>> Entrance in Google Street View
Corona
The corona restrictions have since been lifted. Caution is still advisable, but is now left up to each individual.
⚠️ Important: Please only register if you are absolutely sure that you will attend. Given the limited number of seats, we have no sympathy for short-notice cancellations or no-shows.
Introduction
The Python Meeting Düsseldorf is a regular event in Düsseldorf aimed at Python enthusiasts from the region.
Our PyDDF YouTube channel, where we publish videos of the talks after each meeting, offers a good overview of past talks. The meeting is organized by eGenix.com GmbH, Langenfeld, in cooperation with Clark Consulting & Research, Düsseldorf.
Format
The Python Meeting Düsseldorf uses a mix of (lightning) talks and open discussion.
Talks can be registered in advance or brought up spontaneously during the meeting. A projector with XGA resolution is available. To register a (lightning) talk, simply send an informal email to info@pyddf.de.
Cost Contribution
The Python Meeting Düsseldorf is organized by Python users for Python users.
Since the meeting room, projector, internet access, and drinks incur costs, we ask attendees for a contribution of EUR 10.00 incl. 19% VAT; pupils and students pay EUR 5.00 incl. 19% VAT.
We ask all attendees to bring the amount in cash.
Registration
Since the rented room can only accommodate 25 people, we ask that you register in advance.
Please register for the meeting via Meetup.
Further Information
You can find further information on the meeting's website:
https://pyddf.de/
Have fun!
Marc-Andre Lemburg, eGenix.com
Matt Layman
Cater Waiter, Template Bugs, and Type Fixes - Building SaaS with Python and Django #155
In this episode, I did another Exercism problem in Python that dug into Python sets. Once the exercise was complete, we went back to the issue list. I debugged and fixed a template error, then spent time improving the types in my Django app.