skip to navigation
skip to content

Planet Python

Last update: May 21, 2019 01:47 PM UTC

May 21, 2019


Red Hat Developers

Use the Kubernetes Python client from your running Red Hat OpenShift pods

Red Hat OpenShift is part of the Cloud Native Computing Foundation (CNCF) Certified Program, ensuring portability and interoperability for your container workloads. This also allows you to use Kubernetes tools to interact with an OpenShift cluster, like kubectl, and you can rest assured that all the APIs you know and love are right there at your fingertips.

The Kubernetes Python client is another great tool for interacting with an OpenShift cluster, allowing you to perform actions on Kubernetes resources with Python code. It also has applications within a cluster. We can configure a Python application running on OpenShift to consume the OpenShift API, and list and create resources. We could then create containerized batch jobs from the running application, or a custom service monitor, for example. It sounds a bit like “OpenShift inception,” using the OpenShift API from services created using the OpenShift API.

In this article, we’ll create a Flask application running on OpenShift. This application will use the Kubernetes Python client to interact with the OpenShift API, list other pods in the project, and display them back to the user.

You’ll need a couple of things to follow along:

Let’s get started!

Setup

I’ve created a template to alleviate a lot of the boilerplate, so let’s clone it down:

git clone https://github.com/shaneboulden/openshift-client-demo
cd openshift-client-demo

You can create a new app on your OpenShift cluster using the provided template and see the application running:

oc new-app openshift_deploy/ocd.yaml

If you do an oc get routes, you’ll be able to see the route that’s been created. For now, if you select the Pods menu item you’ll just get some placeholder text. We’ll fix this shortly 🙂

pods_placeholder

Configure the Kubernetes Python client

Listing pods is trivial once we have our client configured, and, fortunately, we can use a little Kubernetes Python client magic to configure this easily with the correct service account token.

Usually, we’d configure a Kubernetes client using a kubeconfig file, which has the required token and hostname to create API requests. The Kubernetes Python client also provides a method load_incluster_config(), which replaces the kubeconfig file in a running pod, instead using the available environment variables and mount points to find the service account token and build API URLs from the information available within the pod.

There’s another huge benefit to using load_incluster_config()—our code is now portable. We can take this same application to any Kubernetes cluster, assume nothing about hostnames or network addresses, and easily construct API requests using this awesome little method.

Let’s configure our application to use the load_incluster_config() method. First, we need to import the client and config objects, you can verify this in the ocd.py file:

from kubernetes import client, config

We can now use that magic method to configure the client:

config.load_incluster_config()
v1 = client.CoreV1Api()

That’s it! This is all of the code we need to be able to interact with the OpenShift API from running pods.

Use the Kubernetes Downward API

I’m going to introduce something new here, and yes, it’s another “OpenShift-inception” concept. We’re going to use the list_namespaced_pod method to list pod details; you can find all of the methods available in the documentation. To use this method, we need to pass the current namespace (project) to the Kubernetes client object. But wait, how do we get the namespace for our pod, from inside the running pod?

This is where another awesome Kubernetes API comes into play. It’s called the Downward API and allows us to access metadata about our pod from inside the running pod. To expose information from the Downward API to our pod, we can use environment variables. If you look at the template, you’ll see the following in the ‘env’ section:

- name: POD_NAMESPACE
  valueFrom:
    fieldRef:
      apiVersion: v1
      fieldPath: metadata.namespace

Bring it all together

Now let’s get back to our /pods route in the ocd.py file. The last thing we need to do is to pass the namespace of the app to the Kubernetes client. We have our environment variable configured to use the downward API already, so let’s pass it in:

pods = v1.list_namespaced_pod(namespace=os.environ["POD_NAMESPACE"])

Ensure you’re in the top-level project directory (i.e., you can see the README) and start a build from the local directory:

oc start-build openshift-client-demo --from-dir=.

When you next visit the route and select the Pods menu, you’ll be able to see all of the pods for the current namespace:

pods

I hope you’ve enjoyed this short introduction to the Kubernetes Python client. If you want to explore a little deeper, you can look at creating resources. There’s an example here that looks at creating containerized batch jobs from API POSTs.

Share

The post Use the Kubernetes Python client from your running Red Hat OpenShift pods appeared first on Red Hat Developer Blog.

May 21, 2019 01:39 PM UTC


Test and Code

74: Technical Interviews: Preparing For, What to Expect, and Tips for Success - Derrick Mar

In this episode, I talk with Derrick Mar, CTO and co-founder of Pathrise.
This is the episode you need to listen to to get ready for software interviews.

If you or anyone you know has a software interview coming up, this episode will help you both feel more comfortable about the interview before you show up, and give you concrete tips on how to do better during the interview.

Special Guest: Derrick Mar.

Sponsored By:

Support Test & Code - Software Testing, Development, Python

Links:

<p>In this episode, I talk with Derrick Mar, CTO and co-founder of Pathrise.<br> This is the episode you need to listen to to get ready for software interviews.</p> <ul> <li><p>We discuss four aspects of technical interviews that interviewers are looking for:</p> <ul> <li>communication</li> <li>problem solving</li> <li>coding</li> <li>verification</li> </ul></li> <li><p>How to practice for the interview.</p></li> <li><p>Techniques for synchronizing with interviewer and asking for hints.</p></li> <li><p>Even how to ask the recruiter or hiring manager how to prepare for the interview.</p></li> </ul> <p>If you or anyone you know has a software interview coming up, this episode will help you both feel more comfortable about the interview before you show up, and give you concrete tips on how to do better during the interview.</p><p>Special Guest: Derrick Mar.</p><p>Sponsored By:</p><ul><li><a href="http://amzn.to/2E6cYZ9" rel="nofollow">Python Testing with pytest</a>: <a href="http://amzn.to/2E6cYZ9" rel="nofollow">Simple, Rapid, Effective, and Scalable The fastest way to learn pytest. From 0 to expert in under 200 pages.</a></li><li><a href="https://www.patreon.com/testpodcast" rel="nofollow">Patreon Supporters</a>: <a href="https://www.patreon.com/testpodcast" rel="nofollow">Help support the show with as little as $1 per month. Funds help pay for expenses associated with the show.</a></li></ul><p><a href="https://www.patreon.com/testpodcast" rel="payment">Support Test & Code - Software Testing, Development, Python</a></p><p>Links:</p><ul><li><a href="https://testandcode.com/72" title="72: Technical Interview Fixes - April Wensel" rel="nofollow">72: Technical Interview Fixes - April Wensel</a></li><li><a href="https://www.pathrise.com/" title="Pathrise" rel="nofollow">Pathrise</a></li></ul>

May 21, 2019 07:00 AM UTC


codingdirectional

Growth of a Population

Hello and welcome back, in this episode we are going to solve a python related problem in Codewars. Before we start I just want to say that this post is related to python programming, you are welcome to leave your comments below this post if and only if they are related to the below solution, kindly do not leave any comment which has nothing to do with python programming under this article, thank you.

In a small town, the population isat p0 = 1000 at the beginning of a year. The population regularly increases by 2 percent per year and moreover, 50 new inhabitants per year come to live in the town. How many years does the town need to see its population greater or equal to p = 1200inhabitants? You need to round up the percentage part of the equation. Below is the entire solution to this question.

At the end of the first year there will be: 
1000 + 1000 * 0.02 + 50 => 1070 inhabitants

At the end of the 2nd year there will be: 
1070 + 1070 * 0.02 + 50 => 1141 inhabitants (number of inhabitants is an integer)

At the end of the 3rd year there will be:
1141 + 1141 * 0.02 + 50 => 1213

It will need 3 entire years.

So how are we going to turn the above population equation into a function?

def nb_year(p0, percent, aug, p):
    # p0 is the present total population, aug is the number of new inhabitants per year and p is the target population needs to be surpassed
    perc = round(p0 * percent/100)
    total_population = p0 + (perc) + aug
    year = 1
    while(total_population < p):
        perc = round(total_population * percent/100)
        total_population = total_population + perc + aug
        year += 1

    return year

Simple solution, hope you do enjoy this post. We will start a new project soon so stay tuned!

May 21, 2019 03:49 AM UTC


Catalin George Festila

Python 3.7.3 : Use the tweepy to deal with twitter api - part 002.

The tutorial for today is about with python 3.7.3 The development team comes with this intro: An easy-to-use Python library for accessing the Twitter API. You need to have an application on twitter with tokens and key, see here. The first step is to install this python module: C:\Python373\Scripts>pip install tweepy ... Successfully built PySocks Installing collected packages: PySocks, tweepy

May 21, 2019 01:20 AM UTC

May 20, 2019


Andre Roberge

Test - please ignore

Simple test to embed a gist.


That's all.

May 20, 2019 08:40 PM UTC


Codementor

Python's Counter - Part 2

This is the second part of the Counter tutorial sequel. It explores certain behaviour of Counter.

May 20, 2019 03:03 PM UTC


Curtis Miller

What is the probability that in a box of a dozen donuts picked from 14 flavors there’s no more than 3 flavors in the box?

Dave's Donuts offers 14 flavors of donuts (consider the supply of each flavor as being unlimited). The “grab bag” box consists of flavors randomly selected to be in the box, each flavor equally likely for each one of the dozen donuts. What is the probability that at most three flavors are in the grab bag box of a dozen?

May 20, 2019 03:00 PM UTC


Stack Abuse

Overview of Async IO in Python 3.7

Python 3's asyncio module provides fundamental tools for implementing asynchronous I/O in Python. It was introduced in Python 3.4, and with each subsequent minor release, the module has evolved significantly.

This tutorial contains a general overview of the asynchronous paradigm, and how it's implemented in Python 3.7.

Blocking vs Non-Blocking I/O

The problem that asynchrony seeks to resolve is blocking I/O.

By default, when your program accesses data from an I/O source, it waits for that operation to complete before continuing to execute the program.

with open('myfile.txt', 'r') as file:  
    data = file.read()
    # Until the data is read into memory, the program waits here
print(data)  

The program is blocked from continuing its flow of execution while a physical device is accessed, and data is transferred.

Network operations are another common source of blocking:

# pip install --user requests
import requests

req = requests.get('https://www.stackabuse.com/')

#
# Blocking occurs here, waiting for completion of an HTTPS request
#

print(req.text)  

In many cases, the delay caused by blocking is negligible. However, blocking I/O scales very poorly. If you need to wait for 1010 file reads or network transactions, performance will suffer.

Multiprocessing, Threading, and Asynchrony

Strategies for minimizing the delays of blocking I/O fall into three major categories: multiprocessing, threading, and asynchrony.

Multiprocessing

Multiprocessing is a form of parallel computing: instructions are executed in an overlapping time frame on multiple physical processors or cores. Each process spawned by the kernel incurs an overhead cost, including an independently-allocated chunk of memory (heap).

Python implements parallelism with the multiprocessing module.

The following is an example of a Python 3 program that spawns four child processes, each of which exhibits a random, independent delay. The output shows the process ID of each child, the system time before and after each delay, and the current and peak memory allocation at each step.

from multiprocessing import Process  
import os, time, datetime, random, tracemalloc

tracemalloc.start()  
children = 4    # number of child processes to spawn  
maxdelay = 6    # maximum delay in seconds

def status():  
    return ('Time: ' + 
        str(datetime.datetime.now().time()) +
        '\t Malloc, Peak: ' +
        str(tracemalloc.get_traced_memory()))

def child(num):  
    delay = random.randrange(maxdelay)
    print(f"{status()}\t\tProcess {num}, PID: {os.getpid()}, Delay: {delay} seconds...")
    time.sleep(delay)
    print(f"{status()}\t\tProcess {num}: Done.")

if __name__ == '__main__':  
    print(f"Parent PID: {os.getpid()}")
    for i in range(children):
        proc = Process(target=child, args=(i,))
        proc.start()

Output:

Parent PID: 16048  
Time: 09:52:47.014906    Malloc, Peak: (228400, 240036)     Process 0, PID: 16051, Delay: 1 seconds...  
Time: 09:52:47.016517    Malloc, Peak: (231240, 240036)     Process 1, PID: 16052, Delay: 4 seconds...  
Time: 09:52:47.018786    Malloc, Peak: (231616, 240036)     Process 2, PID: 16053, Delay: 3 seconds...  
Time: 09:52:47.019398    Malloc, Peak: (232264, 240036)     Process 3, PID: 16054, Delay: 2 seconds...  
Time: 09:52:48.017104    Malloc, Peak: (228434, 240036)     Process 0: Done.  
Time: 09:52:49.021636    Malloc, Peak: (232298, 240036)     Process 3: Done.  
Time: 09:52:50.022087    Malloc, Peak: (231650, 240036)     Process 2: Done.  
Time: 09:52:51.020856    Malloc, Peak: (231274, 240036)     Process 1: Done.  

Threading

Threading is an alternative to multiprocessing, with benefits and downsides.

Threads are independently scheduled, and their execution may occur within an overlapping time period. Unlike multiprocessing, however, threads exist entirely in a single kernel process, and share a single allocated heap.

Python threads are concurrent — multiple sequences of machine code are executed in overlapping time frames. But they are not parallel — execution does not occur simultaneously on multiple physical cores.

The primary downsides to Python threading are memory safety and race conditions. All child threads of a parent process operate in the same shared memory space. Without additional protections, one thread may overwrite a shared value in memory without other threads being aware of it. Such data corruption would be disastrous.

To enforce thread safety, CPython implementations use a global interpreter lock (GIL). The GIL is a mutex mechanism that prevents multiple threads from executing simultaneously on Python objects. Effectively, this means that only one thread runs at any given time.

Here's the threaded version of the multiprocessing example from the previous section. Notice that very little has changed: multiprocessing.Process is replaced with threading.Thread. As indicated in the output, everything happens in a single process, and the memory footprint is significantly smaller.

from threading import Thread  
import os, time, datetime, random, tracemalloc

tracemalloc.start()  
children = 4    # number of child threads to spawn  
maxdelay = 6    # maximum delay in seconds

def status():  
    return ('Time: ' + 
        str(datetime.datetime.now().time()) +
        '\t Malloc, Peak: ' +
        str(tracemalloc.get_traced_memory()))

def child(num):  
    delay = random.randrange(maxdelay)
    print(f"{status()}\t\tProcess {num}, PID: {os.getpid()}, Delay: {delay} seconds...")
    time.sleep(delay)
    print(f"{status()}\t\tProcess {num}: Done.")

if __name__ == '__main__':  
    print(f"Parent PID: {os.getpid()}")
    for i in range(children):
        thr = Thread(target=child, args=(i,))
        thr.start()

Output:

Parent PID: 19770  
Time: 10:44:40.942558    Malloc, Peak: (9150, 9264)     Process 0, PID: 19770, Delay: 3 seconds...  
Time: 10:44:40.942937    Malloc, Peak: (13989, 14103)       Process 1, PID: 19770, Delay: 5 seconds...  
Time: 10:44:40.943298    Malloc, Peak: (18734, 18848)       Process 2, PID: 19770, Delay: 3 seconds...  
Time: 10:44:40.943746    Malloc, Peak: (23959, 24073)       Process 3, PID: 19770, Delay: 2 seconds...  
Time: 10:44:42.945896    Malloc, Peak: (26599, 26713)       Process 3: Done.  
Time: 10:44:43.945739    Malloc, Peak: (26741, 27223)       Process 0: Done.  
Time: 10:44:43.945942    Malloc, Peak: (26851, 27333)       Process 2: Done.  
Time: 10:44:45.948107    Malloc, Peak: (24639, 27475)       Process 1: Done.  

Asynchrony

Asynchrony is an alternative to threading for writing concurrent applications. Asynchronous events occur on independent schedules, "out of sync" with one another, entirely within a single thread.

Unlike threading, in asynchronous programs the programmer controls when and how voluntary preemption occurs, making it easier to isolate and avoid race conditions.

Introduction to the Python 3.7 asyncio Module

In Python 3.7, asynchronous operations are provided by the asyncio module.

High-Level vs Low-Level asyncio API

Asyncio components are divided into high-level APIs (for writing programs), and low-level APIs (for writing libraries or frameworks based on asyncio).

Every asyncio program can be written using only the high-level APIs. If you're not writing a framework or library, you never need to touch the low-level stuff.

With that said, let's look at the core high-level APIs, and discuss the core concepts.

Coroutines

In general, a coroutine (short for cooperative subroutine) is a function designed for voluntary preemptive multitasking: it proactively yields to other routines and processes, rather than being forcefully preempted by the kernel. The term "coroutine" was coined in 1958 by Melvin Conway (of "Conway's Law" fame), to describe code that actively facilitates the needs of other parts of a system.

In asyncio, this voluntary preemption is called awaiting.

Awaitables, Async, and Await

Any object which can be awaited (voluntarily preempted by a coroutine) is called an awaitable.

The await keyword suspends the execution of the current coroutine, and calls the specified awaitable.

In Python 3.7, the three awaitable objects are coroutine, task, and future.

An asyncio coroutine is any Python function whose definition is prefixed with the async keyword.

async def my_coro():  
    pass

An asyncio task is an object that wraps a coroutine, providing methods to control its execution, and query its status. A task may be created with asyncio.create_task(), or asyncio.gather().

An asyncio future is a low-level object that acts as a placeholder for data that hasn't yet been calculated or fetched. It can provide an empty structure to be filled with data later, and a callback mechanism that is triggered when the data is ready.

A task inherits all but two of the methods available to a future, so in Python 3.7 you never need to create a future object directly.

Event Loops

In asyncio, an event loop controls the scheduling and communication of awaitable objects. An event loop is required to use awaitables. Every asyncio program has at least one event loop. It's possible to have multiple event loops, but multiple event loops are strongly discouraged in Python 3.7.

A reference to the currently-running loop object is obtained by calling asyncio.get_running_loop().

Sleeping

The asyncio.sleep(delay) coroutine blocks for delay seconds. It's useful for simulating blocking I/O.

import asyncio

async def main():  
    print("Sleep now.")
    await asyncio.sleep(1.5)
    print("OK, wake up!")

asyncio.run(main())  
Initiating the Main Event Loop

The canonical entrance point to an asyncio program is asyncio.run(main()), where main() is a top-level coroutine.

import asyncio

async def my_coro(arg):  
    "A coroutine."  
    print(arg)

async def main():  
    "The top-level coroutine."
    await my_coro(42)

asyncio.run(main())  

Calling asyncio.run() implicitly creates and runs an event loop. The loop object has many useful methods, including loop.time(), which returns a float representing the current time, as measured by the loop's internal clock.

Note: The asyncio.run() function cannot be called from within an existing event loop. Therefore, it is possible that you see errors if you're running the program within a supervising environment, such as Anaconda or Jupyter, which is running an event loop of its own. The example programs in this section and the following sections should be run directly from the command line by executing the python file.

The following program prints lines of text, blocking for one second after each line until the last.

import asyncio

async def my_coro(delay):  
    loop = asyncio.get_running_loop()
    end_time = loop.time() + delay
    while True:
        print("Blocking...")
        await asyncio.sleep(1)
        if loop.time() > end_time:
            print("Done.")
            break

async def main():  
    await my_coro(3.0)

asyncio.run(main())  

Output:

Blocking...  
Blocking...  
Blocking...  
Done.  
Tasks

A task is an awaitable object that wraps a coroutine. To create and immediately schedule a task, you can call the following:

asyncio.create_task(coro(args...))  

This will return a task object. Creating a task tells the loop, "go ahead and run this coroutine as soon as you can."

If you await a task, execution of the current coroutine is blocked until that task is complete.

import asyncio

async def my_coro(n):  
    print(f"The answer is {n}.")

async def main():  
    # By creating the task, it's scheduled to run 
    # concurrently, at the event loop's discretion.
    mytask = asyncio.create_task(my_coro(42))

    # If we later await the task, execution stops there
    # until the task is complete. If the task is already
    # complete before it is awaited, nothing is awaited. 
    await mytask

asyncio.run(main())  

Output:

The answer is 42.  

Tasks have several useful methods for managing the wrapped coroutine. Notably, you can request that a task be canceled by calling the task's .cancel() method. The task will be scheduled for cancellation on the next cycle of the event loop. Cancellation is not guaranteed: the task may complete before that cycle, in which case the cancellation does not occur.

Gathering Awaitables

Awaitables can be gathered as a group, by providing them as a list argument to the built-in coroutine asyncio.gather(awaitables).

The asyncio.gather() returns an awaitable representing the gathered awaitables, and therefore must be prefixed with await.

If any element of awaitables is a coroutine, it is immediately scheduled as a task.

Gathering is a convenient way to schedule multiple coroutines to run concurrently as tasks. It also associates the gathered tasks in some useful ways:

Example: Async Web Requests with aiohttp

The following example illustrates how these high-level asyncio APIs can be implemented. The following is a modified version, updated for Python 3.7, of Scott Robinson's nifty asyncio example. His program leverages the aiohttp module to grab the top posts on Reddit, and output them to the console.

Make sure that you have aiohttp module installed before you run the script below. You can download the module via the following pip command:

$ pip install --user aiohttp
import sys  
import asyncio  
import aiohttp  
import json  
import datetime

async def get_json(client, url):  
    async with client.get(url) as response:
        assert response.status == 200
        return await response.read()

async def get_reddit_top(subreddit, client, numposts):  
    data = await get_json(client, 'https://www.reddit.com/r/' + 
        subreddit + '/top.json?sort=top&t=day&limit=' +
        str(numposts))

    print(f'\n/r/{subreddit}:')

    j = json.loads(data.decode('utf-8'))
    for i in j['data']['children']:
        score = i['data']['score']
        title = i['data']['title']
        link = i['data']['url']
        print('\t' + str(score) + ': ' + title + '\n\t\t(' + link + ')')

async def main():  
    print(datetime.datetime.now().strftime("%A, %B %d, %I:%M %p"))
    print('---------------------------')
    loop = asyncio.get_running_loop()  
    async with aiohttp.ClientSession(loop=loop) as client:
        await asyncio.gather(
            get_reddit_top('python', client, 3),
            get_reddit_top('programming', client, 4),
            get_reddit_top('asyncio', client, 2),
            get_reddit_top('dailyprogrammer', client, 1)
            )

asyncio.run(main())  

If you run the program multiple times, you'll see that the order of the output changes. That's because the JSON requests are displayed as they're received, which is dependent on the server's response time, and the intermediate network latency. On a Linux system, you can observe this in action by running the script prefixed with (e.g.) watch -n 5, which will refresh the output every 5 seconds:

Other High-level APIs

Hopefully, this overview gives you a solid foundation of how, when, and why to use asyncio. Other high-level asyncio APIs, not covered here, include:

Conclusion

Keep in mind that even if your program doesn't require asynchrony for performance reasons, you can still use asyncio if you prefer writing within the asynchronous paradigm. I hope this overview gives you a solid understanding of how, when, and why to begin using use asyncio.

May 20, 2019 02:10 PM UTC


Real Python

Unicode & Character Encodings in Python: A Painless Guide

Handling character encodings in Python or any other language can at times seem painful. Places such as Stack Overflow have thousands of questions stemming from confusion over exceptions like UnicodeDecodeError and UnicodeEncodeError. This tutorial is designed to clear the Exception fog and illustrate that working with text and binary data in Python 3 can be a smooth experience. Python’s Unicode support is strong and robust, but it takes some time to master.

This tutorial is different because it’s not language-agnostic but instead deliberately Python-centric. You’ll still get a language-agnostic primer, but you’ll then dive into illustrations in Python, with text-heavy paragraphs kept to a minimum. You’ll see how to use concepts of character encodings in live Python code.

By the end of this tutorial, you’ll:

Character encoding and numbering systems are so closely connected that they need to be covered in the same tutorial or else the treatment of either would be totally inadequate.

Note: This article is Python 3-centric. Specifically, all code examples in this tutorial were generated from a CPython 3.7.2 shell, although all minor versions of Python 3 should behave (mostly) the same in their treatment of text.

If you’re still using Python 2 and are intimidated by the differences in how Python 2 and Python 3 treat text and binary data, then hopefully this tutorial will help you make the switch.

Free Bonus: Click here to get access to a chapter from Python Tricks: The Book that shows you Python's best practices with simple examples you can apply instantly to write more beautiful + Pythonic code.

What’s a Character Encoding?

There are tens if not hundreds of character encodings. The best way to start understanding what they are is to cover one of the simplest character encodings, ASCII.

Whether you’re self-taught or have a formal computer science background, chances are you’ve seen an ASCII table once or twice. ASCII is a good place to start learning about character encoding because it is a small and contained encoding. (Too small, as it turns out.)

It encompasses the following:

So what is a more formal definition of a character encoding?

At a very high level, it’s a way of translating characters (such as letters, punctuation, symbols, whitespace, and control characters) to integers and ultimately to bits. Each character can be encoded to a unique sequence of bits. Don’t worry if you’re shaky on the concept of bits, because we’ll get to them shortly.

The various categories outlined represent groups of characters. Each single character has a corresponding code point, which you can think of as just an integer. Characters are segmented into different ranges within the ASCII table:

Code Point Range Class
0 through 31 Control/non-printable characters
32 through 64 Punctuation, symbols, numbers, and space
65 through 90 Uppercase English alphabet letters
91 through 96 Additional graphemes, such as [ and \
97 through 122 Lowercase English alphabet letters
123 through 126 Additional graphemes, such as { and |
127 Control/non-printable character (DEL)

The entire ASCII table contains 128 characters. This table captures the complete character set that ASCII permits. If you don’t see a character here, then you simply can’t express it as printed text under the ASCII encoding scheme.

Code Point Character (Name) Code Point Character (Name)
0 NUL (Null) 64 @
1 SOH (Start of Heading) 65 A
2 STX (Start of Text) 66 B
3 ETX (End of Text) 67 C
4 EOT (End of Transmission) 68 D
5 ENQ (Enquiry) 69 E
6 ACK (Acknowledgment) 70 F
7 BEL (Bell) 71 G
8 BS (Backspace) 72 H
9 HT (Horizontal Tab) 73 I
10 LF (Line Feed) 74 J
11 VT (Vertical Tab) 75 K
12 FF (Form Feed) 76 L
13 CR (Carriage Return) 77 M
14 SO (Shift Out) 78 N
15 SI (Shift In) 79 O
16 DLE (Data Link Escape) 80 P
17 DC1 (Device Control 1) 81 Q
18 DC2 (Device Control 2) 82 R
19 DC3 (Device Control 3) 83 S
20 DC4 (Device Control 4) 84 T
21 NAK (Negative Acknowledgment) 85 U
22 SYN (Synchronous Idle) 86 V
23 ETB (End of Transmission Block) 87 W
24 CAN (Cancel) 88 X
25 EM (End of Medium) 89 Y
26 SUB (Substitute) 90 Z
27 ESC (Escape) 91 [
28 FS (File Separator) 92 \
29 GS (Group Separator) 93 ]
30 RS (Record Separator) 94 ^
31 US (Unit Separator) 95 _
32 SP (Space) 96 `
33 ! 97 a
34 " 98 b
35 # 99 c
36 $ 100 d
37 % 101 e
38 & 102 f
39 ' 103 g
40 ( 104 h
41 ) 105 i
42 * 106 j
43 + 107 k
44 , 108 l
45 - 109 m
46 . 110 n
47 / 111 o
48 0 112 p
49 1 113 q
50 2 114 r
51 3 115 s
52 4 116 t
53 5 117 u
54 6 118 v
55 7 119 w
56 8 120 x
57 9 121 y
58 : 122 z
59 ; 123 {
60 < 124 |
61 = 125 }
62 > 126 ~
63 ? 127 DEL (delete)

The string Module

Python’s string module is a convenient one-stop-shop for string constants that fall in ASCII’s character set.

Here’s the core of the module in all its glory:

# From lib/python3.7/string.py

whitespace = ' \t\n\r\v\f'
ascii_lowercase = 'abcdefghijklmnopqrstuvwxyz'
ascii_uppercase = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
ascii_letters = ascii_lowercase + ascii_uppercase
digits = '0123456789'
hexdigits = digits + 'abcdef' + 'ABCDEF'
octdigits = '01234567'
punctuation = r"""!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~"""
printable = digits + ascii_letters + punctuation + whitespace

Most of these constants should be self-documenting in their identifier name. We’ll cover what hexdigits and octdigits are shortly.

You can use these constants for everyday string manipulation:

>>>
>>> import string

>>> s = "What's wrong with ASCII?!?!?"
>>> s.rstrip(string.punctuation)
'What's wrong with ASCII'

Note: string.printable includes all of string.whitespace. This disagrees slightly with another method for testing whether a character is considered printable, namely str.isprintable(), which will tell you that none of {'\v', '\n', '\r', '\f', '\t'} are considered printable.

The subtle difference is because of definition: str.isprintable() considers something printable if “all of its characters are considered printable in repr().”

A Bit of a Refresher

Now is a good time for a short refresher on the bit, the most fundamental unit of information that a computer knows.

A bit is a signal that has only two possible states. There are different ways of symbolically representing a bit that all mean the same thing:

Our ASCII table from the previous section uses what you and I would just call numbers (0 through 127), but what are more precisely called numbers in base 10 (decimal).

You can also express each of these base-10 numbers with a sequence of bits (base 2). Here are the binary versions of 0 through 10 in decimal:

Decimal Binary (Compact) Binary (Padded Form)
0 0 00000000
1 1 00000001
2 10 00000010
3 11 00000011
4 100 00000100
5 101 00000101
6 110 00000110
7 111 00000111
8 1000 00001000
9 1001 00001001
10 1010 00001010

Notice that as the decimal number n increases, you need more significant bits to represent the character set up to and including that number.

Here’s a handy way to represent ASCII strings as sequences of bits in Python. Each character from the ASCII string gets pseudo-encoded into 8 bits, with spaces in between the 8-bit sequences that each represent a single character:

>>>
>>> def make_bitseq(s: str) -> str:
...     if not s.isascii():
...         raise ValueError("ASCII only allowed")
...     return " ".join(f"{ord(i):08b}" for i in s)

>>> make_bitseq("bits")
'01100010 01101001 01110100 01110011'

>>> make_bitseq("CAPS")
'01000011 01000001 01010000 01010011'

>>> make_bitseq("$25.43")
'00100100 00110010 00110101 00101110 00110100 00110011'

>>> make_bitseq("~5")
'01111110 00110101'

Note: .isascii() was introduced in Python 3.7.

The f-string f"{ord(i):08b}" uses Python’s Format Specification Mini-Language, which is a way of specifying formatting for replacement fields in format strings:

This trick is mainly just for fun, and it will fail very badly for any character that you don’t see present in the ASCII table. We’ll discuss how other encodings fix this problem later on.

We Need More Bits!

There’s a critically important formula that’s related to the definition of a bit. Given a number of bits, n, the number of distinct possible values that can be represented in n bits is 2n:

def n_possible_values(nbits: int) -> int:
    return 2 ** nbits

Here’s what that means:

There’s a corollary to this formula: given a range of distinct possible values, how can we find the number of bits, n, that is required for the range to be fully represented? What you’re trying to solve for is n in the equation 2n = x (where you already know x).

Here’s what that works out to:

>>>
>>> from math import ceil, log

>>> def n_bits_required(nvalues: int) -> int:
...     return ceil(log(nvalues) / log(2))

>>> n_bits_required(256)
8

The reason that you need to use a ceiling in n_bits_required() is to account for values that are not clean powers of 2. Say you need to store a character set of 110 characters total. Naively, this should take log(110) / log(2) == 6.781 bits, but there’s no such thing as 0.781 bits. 110 values will require 7 bits, not 6, with the final slots being unneeded:

>>>
>>> n_bits_required(110)
7

All of this serves to to prove one concept: ASCII is, strictly speaking, a 7-bit code. The ASCII table that you saw above contains 128 code points and characters, 0 through 127 inclusive. This requires 7 bits:

>>>
>>> n_bits_required(128)  # 0 through 127
7
>>> n_possible_values(7)
128

The issue with this is that modern computers don’t store much of anything in 7-bit slots. They traffic in units of 8 bits, conventionally known as a byte.

Note: Throughout this tutorial, I assume that a byte refers to 8 bits, as it has since the 1960s, rather than some other unit of storage. You are free to call this an octet if you prefer.

This means that the storage space used by ASCII is half-empty. If it’s not clear why this is, think back to the decimal-to-binary table from above. You can express the numbers 0 and 1 with just 1 bit, or you can use 8 bits to express them as 00000000 and 00000001, respectively.

You can express the numbers 0 through 3 with just 2 bits, or 00 through 11, or you can use 8 bits to express them as 00000000, 00000001, 00000010, and 00000011, respectively. The highest ASCII code point, 127, requires only 7 significant bits.

Knowing this, you can see that make_bitseq() converts ASCII strings into a str representation of bytes, where every character consumes one byte:

>>>
>>> make_bitseq("bits")
'01100010 01101001 01110100 01110011'

ASCII’s underutilization of the 8-bit bytes offered by modern computers led to a family of conflicting, informalized encodings that each specified additional characters to be used with the remaining 128 available code points allowed in an 8-bit character encoding scheme.

Not only did these different encodings clash with each other, but each one of them was by itself still a grossly incomplete representation of the world’s characters, regardless of the fact that they made use of one additional bit.

Over the years, one character encoding mega-scheme came to rule them all. However, before we get there, let’s talk for a minute about numbering systems, which are a fundamental underpinning of character encoding schemes.

Covering All the Bases: Other Number Systems

In the discussion of ASCII above, you saw that each character maps to an integer in the range 0 through 127.

This range of numbers is expressed in decimal (base 10). It’s the way that you, me, and the rest of us humans are used to counting, for no reason more complicated than that we have 10 fingers.

But there are other numbering systems as well that are especially prevalent throughout the CPython source code. While the “underlying number” is the same, all numbering systems are just different ways of expressing the same number.

If I asked you what number the string "11" represents, you’d be right to give me a strange look before answering that it represents eleven.

However, this string representation can express different underlying numbers in different numbering systems. In addition to decimal, the alternatives include the following common numbering systems:

But what does it mean for us to say that, in a certain numbering system, numbers are represented in base N?

Here is the best way that I know of to articulate what this means: it’s the number of fingers that you’d count on in that system.

If you want a much fuller but still gentle introduction to numbering systems, Charles Petzold’s Code is an incredibly cool book that explores the foundations of computer code in detail.

One way to demonstrate how different numbering systems interpret the same thing is with Python’s int() constructor. If you pass a str to int(), Python will assume by default that the string expresses a number in base 10 unless you tell it otherwise:

>>>
>>> int('11')
11
>>> int('11', base=10)  # 10 is already default
11
>>> int('11', base=2)  # Binary
3
>>> int('11', base=8)  # Octal
9
>>> int('11', base=16)  # Hex
17

There’s a more common way of telling Python that your integer is typed in a base other than 10. Python accepts literal forms of each of the 3 alternative numbering systems above:

Type of Literal Prefix Example
n/a n/a 11
Binary literal 0b or 0B 0b11
Octal literal 0o or 0O 0o11
Hex literal 0x or 0X 0x11

All of these are sub-forms of integer literals. You can see that these produce the same results, respectively, as the calls to int() with non-default base values. They’re all just int to Python:

>>>
>>> 11
11
>>> 0b11  # Binary literal
3
>>> 0o11  # Octal literal
9
>>> 0x11  # Hex literal
17

Here’s how you could type the binary, octal, and hexadecimal equivalents of the decimal numbers 0 through 20. Any of these are perfectly valid in a Python interpreter shell or source code, and all work out to be of type int:

Decimal Binary Octal Hex
0 0b0 0o0 0x0
1 0b1 0o1 0x1
2 0b10 0o2 0x2
3 0b11 0o3 0x3
4 0b100 0o4 0x4
5 0b101 0o5 0x5
6 0b110 0o6 0x6
7 0b111 0o7 0x7
8 0b1000 0o10 0x8
9 0b1001 0o11 0x9
10 0b1010 0o12 0xa
11 0b1011 0o13 0xb
12 0b1100 0o14 0xc
13 0b1101 0o15 0xd
14 0b1110 0o16 0xe
15 0b1111 0o17 0xf
16 0b10000 0o20 0x10
17 0b10001 0o21 0x11
18 0b10010 0o22 0x12
19 0b10011 0o23 0x13
20 0b10100 0o24 0x14

It’s amazing just how prevalent these expressions are in the Python Standard Library. If you want to see for yourself, navigate to wherever your lib/python3.7/ directory sits, and check out the use of hex literals like this:

$ grep -nri --include "*\.py" -e "\b0x" lib/python3.7

This should work on any Unix system that has grep. You could use "\b0o" to search for octal literals or “\b0b” to search for binary literals.

What’s the argument for using these alternate int literal syntaxes? In short, it’s because 2, 8, and 16 are all powers of 2, while 10 is not. These three alternate number systems occasionally offer a way for expressing values in a computer-friendly manner. For example, the number 65536 or 216, is just 10000 in hexadecimal, or 0x10000 as a Python hexadecimal literal.

Enter Unicode

As you saw, the problem with ASCII is that it’s not nearly a big enough set of characters to accommodate the world’s set of languages, dialects, symbols, and glyphs. (It’s not even big enough for English alone.)

Unicode fundamentally serves the same purpose as ASCII, but it just encompasses a way, way, way bigger set of code points. There are a handful of encodings that emerged chronologically between ASCII and Unicode, but they are not really worth mentioning just yet because Unicode and one of its encoding schemes, UTF-8, has become so predominantly used.

Think of Unicode as a massive version of the ASCII table—one that has 1,114,112 possible code points. That’s 0 through 1,114,111, or 0 through 17 * (216) - 1, or 0x10ffff hexadecimal. In fact, ASCII is a perfect subset of Unicode. The first 128 characters in the Unicode table correspond precisely to the ASCII characters that you’d reasonably expect them to.

In the interest of being technically exacting, Unicode itself is not an encoding. Rather, Unicode is implemented by different character encodings, which you’ll see soon. Unicode is better thought of as a map (something like a dict) or a 2-column database table. It maps characters (like "a", "¢", or even "ቈ") to distinct, positive integers. A character encoding needs to offer a bit more.

Unicode contains virtually every character that you can imagine, including additional non-printable ones too. One of my favorites is the pesky right-to-left mark, which has code point 8207 and is used in text with both left-to-right and right-to-left language scripts, such as an article containing both English and Arabic paragraphs.

Note: The world of character encodings is one of many fine-grained technical details over which some people love to nitpick about. One such detail is that only 1,111,998 of the Unicode code points are actually usable, due to a couple of archaic reasons.

Unicode vs UTF-8

It didn’t take long for people to realize that all of the world’s characters could not be packed into one byte each. It’s evident from this that modern, more comprehensive encodings would need to use multiple bytes to encode some characters.

You also saw above that Unicode is not technically a full-blown character encoding. Why is that?

There is one thing that Unicode doesn’t tell you: it doesn’t tell you how to get actual bits from text—just code points. It doesn’t tell you enough about how to convert text to binary data and vice versa.

Unicode is an abstract encoding standard, not an encoding. That’s where UTF-8 and other encoding schemes come into play. The Unicode standard (a map of characters to code points) defines several different encodings from its single character set.

UTF-8 as well as its lesser-used cousins, UTF-16 and UTF-32, are encoding formats for representing Unicode characters as binary data of one or more bytes per character. We’ll discuss UTF-16 and UTF-32 in a moment, but UTF-8 has taken the largest share of the pie by far.

That brings us to a definition that is long overdue. What does it mean, formally, to encode and decode?

Encoding and Decoding in Python 3

Python 3’s str type is meant to represent human-readable text and can contain any Unicode character.

The bytes type, conversely, represents binary data, or sequences of raw bytes, that do not intrinsically have an encoding attached to it.

Encoding and decoding is the process of going from one to the other:

Encode versus decodeEncoding vs decoding (Image: Real Python)

In .encode() and .decode(), the encoding parameter is "utf-8" by default, though it’s generally safer and more unambiguous to specify it:

>>>
>>> "résumé".encode("utf-8")
b'r\xc3\xa9sum\xc3\xa9'
>>> "El Niño".encode("utf-8")
b'El Ni\xc3\xb1o'

>>> b"r\xc3\xa9sum\xc3\xa9".decode("utf-8")
'résumé'
>>> b"El Ni\xc3\xb1o".decode("utf-8")
'El Niño'

The results of str.encode() is a bytes object. Both bytes literals (such as b"r\xc3\xa9sum\xc3\xa9") and the representations of bytes permit only ASCII characters.

This is why, when calling "El Niño".encode("utf-8"), the ASCII-compatible "El" is allowed to be represented as it is, but the n with tilde is escaped to "\xc3\xb1". That messy-looking sequence represents two bytes, 0xc3 and 0xb1 in hex:

>>>
>>> " ".join(f"{i:08b}" for i in (0xc3, 0xb1))
'11000011 10110001'

That is, the character ñ requires two bytes for its binary representation under UTF-8.

Note: If you type help(str.encode), you’ll probably see a default of encoding='utf-8'. Be careful about excluding this and just using "résumé".encode(), because the default may be different in Windows prior to Python 3.6.

Python 3: All-In on Unicode

Python 3 is all-in on Unicode and UTF-8 specifically. Here’s what that means:

There is one other property that is more nuanced, which is that the default encoding to the built-in open() is platform-dependent and depends on the value of locale.getpreferredencoding():

>>>
>>> # Mac OS X High Sierra
>>> import locale
>>> locale.getpreferredencoding()
'UTF-8'

>>> # Windows Server 2012; other Windows builds may use UTF-16
>>> import locale
>>> locale.getpreferredencoding()
'cp1252'

Again, the lesson here is to be careful about making assumptions when it comes to the universality of UTF-8, even if it is the predominant encoding. It never hurts to be explicit in your code.

One Byte, Two Bytes, Three Bytes, Four

A crucial feature is that UTF-8 is a variable-length encoding. It’s tempting to gloss over what this means, but it’s worth delving into.

Think back to the section on ASCII. Everything in extended-ASCII-land demands at most one byte of space. You can quickly prove this with the following generator expression:

>>>
>>> all(len(chr(i).encode("ascii")) == 1 for i in range(128))
True

UTF-8 is quite different. A given Unicode character can occupy anywhere from one to four bytes. Here’s an example of a single Unicode character taking up four bytes:

>>>
>>> ibrow = "🤨"
>>> len(ibrow)
1
>>> ibrow.encode("utf-8")
b'\xf0\x9f\xa4\xa8'
>>> len(ibrow.encode("utf-8"))
4

>>> # Calling list() on a bytes object gives you
>>> # the decimal value for each byte
>>> list(b'\xf0\x9f\xa4\xa8')
[240, 159, 164, 168]

This is a subtle but important feature of len():

The table below summarizes what general types of characters fit into each byte-length bucket:

Decimal Range Hex Range What’s Included Examples
0 to 127 "\u0000" to "\u007F" U.S. ASCII "A", "\n", "7", "&"
128 to 2047 "\u0080" to "\u07FF" Most Latinic alphabets* "ę", "±", "ƌ", "ñ"
2048 to 65535 "\u0800" to "\uFFFF" Additional parts of the multilingual plane (BMP)** "ത", "ᄇ", "ᮈ", "‰"
65536 to 1114111 "\U00010000" to "\U0010FFFF" Other*** "𝕂", "𐀀", "😓", "🂲",

*Such as English, Arabic, Greek, and Irish
**A huge array of languages and symbols—mostly Chinese, Japanese, and Korean by volume (also ASCII and Latin alphabets)
***Additional Chinese, Japanese, Korean, and Vietnamese characters, plus more symbols and emojis

Note: In the interest of not losing sight of the big picture, there is an additional set of technical features of UTF-8 that aren’t covered here because they are rarely visible to a Python user.

For instance, UTF-8 actually uses prefix codes that indicate the number of bytes in a sequence. This enables a decoder to tell what bytes belong together in a variable-length encoding, and lets the first byte serve as an indicator of the number of bytes in the coming sequence.

Wikipedia’s UTF-8 article does not shy away from technical detail, and there is always the official Unicode Standard for your reading enjoyment as well.

What About UTF-16 and UTF-32?

Let’s get back to two other encoding variants, UTF-16 and UTF-32.

The difference between these and UTF-8 is substantial in practice. Here’s an example of how major the difference is with a round-trip conversion:

>>>
>>> letters = "αβγδ"
>>> rawdata = letters.encode("utf-8")
>>> rawdata.decode("utf-8")
'αβγδ'
>>> rawdata.decode("utf-16")  # 😧
'뇎닎돎듎'

In this case, encoding four Greek letters with UTF-8 and then decoding back to text in UTF-16 would produce a text str that is in a completely different language (Korean).

Glaringly wrong results like this are possible when the same encoding isn’t used bidirectionally. Two variations of decoding the same bytes object may produce results that aren’t even in the same language.

This table summarizes the range or number of bytes under UTF-8, UTF-16, and UTF-32:

Encoding Bytes Per Character (Inclusive) Variable Length
UTF-8 1 to 4 Yes
UTF-16 2 to 4 Yes
UTF-32 4 No

One other curious aspect of the UTF family is that UTF-8 will not always take up less space than UTF-16. That may seem mathematically counterintuitive, but it’s quite possible:

>>>
>>> text = "記者 鄭啟源 羅智堅"
>>> len(text.encode("utf-8"))
26
>>> len(text.encode("utf-16"))
22

The reason for this is that the code points in the range U+0800 through U+FFFF (2048 through 65535 in decimal) take up three bytes in UTF-8 versus only two in UTF-16.

I’m not by any means recommending that you jump aboard the UTF-16 train, regardless of whether or not you operate in a language whose characters are commonly in this range. Among other reasons, one of the strong arguments for using UTF-8 is that, in the world of encoding, it’s a great idea to blend in with the crowd.

Not to mention, it’s 2019: computer memory is cheap, so saving 4 bytes by going out of your way to use UTF-16 is arguably not worth it.

Python’s Built-In Functions

You’ve made it through the hard part. Time to use what you’ve seen thus far in Python.

Python has a group of built-in functions that relate in some way to numbering systems and character encoding:

These can be logically grouped together based on their purpose:

Here’s a more detailed look at each of these nine functions:

Function Signature Accepts Return Type Purpose
ascii() ascii(obj) Varies str ASCII only representation of an object, with non-ASCII characters escaped
bin() bin(number) number: int str Binary representation of an integer, with the prefix "0b"
bytes() bytes(iterable_of_ints)

bytes(s, enc[, errors])

bytes(bytes_or_buffer)

bytes([i])
Varies bytes Coerce (convert) the input to bytes, raw binary data
chr() chr(i) i: int

i>=0

i<=1114111
str Convert an integer code point to a single Unicode character
hex() hex(number) number: int str Hexadecimal representation of an integer, with the prefix "0x"
int() int([x])

int(x, base=10)
Varies int Coerce (convert) the input to int
oct() oct(number) number: int str Octal representation of an integer, with the prefix "0o"
ord() ord(c) c: str

len(c) == 1
int Convert a single Unicode character to its integer code point
str() str(object=’‘)

str(b[, enc[, errors]])
Varies str Coerce (convert) the input to str, text

You can expand the section below to see some examples of each function.

ascii() gives you an ASCII-only representation of an object, with non-ASCII characters escaped:

>>>
>>> ascii("abcdefg")
"'abcdefg'"

>>> ascii("jalepeño")
"'jalepe\\xf1o'"

>>> ascii((1, 2, 3))
'(1, 2, 3)'

>>> ascii(0xc0ffee)  # Hex literal (int)
'12648430'

bin() gives you a binary representation of an integer, with the prefix "0b":

>>>
>>> bin(0)
'0b0'

>>> bin(400)
'0b110010000'

>>> bin(0xc0ffee)  # Hex literal (int)
'0b110000001111111111101110'

>>> [bin(i) for i in [1, 2, 4, 8, 16]]  # `int` + list comprehension
['0b1', '0b10', '0b100', '0b1000', '0b10000']

bytes() coerces the input to bytes, representing raw binary data:

>>>
>>> # Iterable of ints
>>> bytes((104, 101, 108, 108, 111, 32, 119, 111, 114, 108, 100))
b'hello world'

>>> bytes(range(97, 123))  # Iterable of ints
b'abcdefghijklmnopqrstuvwxyz'

>>> bytes("real 🐍", "utf-8")  # String + encoding
b'real \xf0\x9f\x90\x8d'

>>> bytes(10)
b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'

>>> bytes.fromhex('c0 ff ee')
b'\xc0\xff\xee'

>>> bytes.fromhex("72 65 61 6c 70 79 74 68 6f 6e")
b'realpython'

chr() converts an integer code point to a single Unicode character:

>>>
>>> chr(97)
'a'

>>> chr(7048)
'ᮈ'

>>> chr(1114111)
'\U0010ffff'

>>> chr(0x10FFFF)  # Hex literal (int)
'\U0010ffff'

>>> chr(0b01100100)  # Binary literal (int)
'd'

hex() gives the hexadecimal representation of an integer, with the prefix "0x":

>>>
>>> hex(100)
'0x64'

>>> [hex(i) for i in [1, 2, 4, 8, 16]]
['0x1', '0x2', '0x4', '0x8', '0x10']

>>> [hex(i) for i in range(16)]
['0x0', '0x1', '0x2', '0x3', '0x4', '0x5', '0x6', '0x7',
 '0x8', '0x9', '0xa', '0xb', '0xc', '0xd', '0xe', '0xf']

int() coerces the input to int, optionally interpreting the input in a given base:

>>>
>>> int(11.0)
11

>>> int('11')
11

>>> int('11', base=2)
3

>>> int('11', base=8)
9

>>> int('11', base=16)
17

>>> int(0xc0ffee - 1.0)
12648429

>>> int.from_bytes(b"\x0f", "little")
15

>>> int.from_bytes(b'\xc0\xff\xee', "big")
12648430

ord() converts a single Unicode character to its integer code point:

>>>
>>> ord("a")
97

>>> ord("ę")
281

>>> ord("ᮈ")
7048

>>> [ord(i) for i in "hello world"]
[104, 101, 108, 108, 111, 32, 119, 111, 114, 108, 100]

str() coerces the input to str, representing text:

>>>
>>> str("str of string")
'str of string'

>>> str(5)
'5'

>>> str([1, 2, 3, 4])  # Like [1, 2, 3, 4].__str__(), but use str()
'[1, 2, 3, 4]'

>>> str(b"\xc2\xbc cup of flour", "utf-8")
'¼ cup of flour'

>>> str(0xc0ffee)
'12648430'

Python String Literals: Ways to Skin a Cat

Rather than using the str() constructor, it’s commonplace to type a str literally:

>>>
>>> meal = "shrimp and grits"

That may seem easy enough. But the interesting side of things is that, because Python 3 is Unicode-centric through and through, you can “type” Unicode characters that you probably won’t even find on your keyboard. You can copy and paste this right into a Python 3 interpreter shell:

>>>
>>> alphabet = 'αβγδεζηθικλμνξοπρςστυφχψ'
>>> print(alphabet)
αβγδεζηθικλμνξοπρςστυφχψ

Besides placing the actual, unescaped Unicode characters in the console, there are other ways to type Unicode strings as well.

One of the densest sections of Python’s documentation is the portion on lexical analysis, specifically the section on string and bytes literals. Personally, I had to read this section about one, two, or maybe nine times for it to really sink in.

Part of what it says is that there are up to six ways that Python will allow you to type the same Unicode character.

The first and most common way is to type the character itself literally, as you’ve already seen. The tough part with this method is finding the actual keystrokes. That’s where the other methods for getting and representing characters come into play. Here’s the full list:

Escape Sequence Meaning How To Express "a"
"\ooo" Character with octal value ooo "\141"
"\xhh" Character with hex value hh "\x61"
"\N{name}" Character named name in the Unicode database "\N{LATIN SMALL LETTER A}"
"\uxxxx" Character with 16-bit (2-byte) hex value xxxx "\u0061"
"\Uxxxxxxxx" Character with 32-bit (4-byte) hex value xxxxxxxx "\U00000061"

Here’s some proof and validation of the above:

>>>
>>> (
...     "a" ==
...     "\x61" == 
...     "\N{LATIN SMALL LETTER A}" ==
...     "\u0061" ==
...     "\U00000061"
... )
True

Now, there are two main caveats:

  1. Not all of these forms work for all characters. The hex representation of the integer 300 is 0x012c, which simply isn’t going to fit into the 2-hex-digit escape code "\xhh". The highest code point that you can squeeze into this escape sequence is "\xff" ("ÿ"). Similarly for "\ooo", it will only work up to "\777" ("ǿ").

  2. For \xhh, \uxxxx, and \Uxxxxxxxx, exactly as many digits are required as are shown in these examples. This can throw you for a loop because of the way that Unicode tables conventionally display the codes for characters, with a leading U+ and variable number of hex characters. They key is that Unicode tables most often do not zero-pad these codes.

For instance, if you consult unicode-table.com for information on the Gothic letter faihu (or fehu), "𐍆", you’ll see that it is listed as having the code U+10346.

How do you put this into "\uxxxx" or "\Uxxxxxxxx"? Well, you can’t fit it in "\uxxxx" because it’s a 4-byte character, and to use "\Uxxxxxxxx" to represent this character, you’ll need to left-pad the sequence:

>>>
>>> "\U00010346"
'𐍆'

This also means that the "\Uxxxxxxxx" form is the only escape sequence that is capable of holding any Unicode character.

Note: Here’s a short function to convert strings that look like "U+10346" into something Python can work with. It uses str.zfill():

>>>
>>> def make_uchr(code: str):
...     return chr(int(code.lstrip("U+").zfill(8), 16))
>>> make_uchr("U+10346")
'𐍆'
>>> make_uchr("U+0026")
'&'

Other Encodings Available in Python

So far, you’ve seen four character encodings:

  1. ASCII
  2. UTF-8
  3. UTF-16
  4. UTF-32

There are a ton of other ones out there.

One example is Latin-1 (also called ISO-8859-1), which is technically the default for the Hypertext Transfer Protocol (HTTP), per RFC 2616. Windows has its own Latin-1 variant called cp1252.

Note: ISO-8859-1 is still very much present out in the wild. The requests library follows RFC 2616 “to the letter” in using it as the default encoding for the content of an HTTP/HTTPS response. If the word “text” is found in the Content-Type header, and no other encoding is specified, then requests will use ISO-8859-1.

The complete list of accepted encodings is buried way down in the documentation for the codecs module, which is part of Python’s Standard Library.

There’s one more useful recognized encoding to be aware of, which is "unicode-escape". If you have a decoded str and want to quickly get a representation of its escaped Unicode literal, then you can specify this encoding in .encode():

>>>
>>> alef = chr(1575)  # Or "\u0627"
>>> alef_hamza = chr(1571)  # Or "\u0623"
>>> alef, alef_hamza
('ا', 'أ')
>>> alef.encode("unicode-escape")
b'\\u0627'
>>> alef_hamza.encode("unicode-escape")
b'\\u0623'

You Know What They Say About Assumptions…

Just because Python makes the assumption of UTF-8 encoding for files and code that you generate doesn’t mean that you, the programmer, should operate with the same assumption for external data.

Let’s say that again because it’s a rule to live by: when you receive binary data (bytes) from a third party source, whether it be from a file or over a network, the best practice is to check that the data specifies an encoding. If it doesn’t, then it’s on you to ask.

All I/O happens in bytes, not text, and bytes are just ones and zeros to a computer until you tell it otherwise by informing it of an encoding.

Here’s an example of where things can go wrong. You’re subscribed to an API that sends you a recipe of the day, which you receive in bytes and have always decoded using .decode("utf-8") with no problem. On this particular day, part of the recipe looks like this:

>>>
>>> data = b"\xbc cup of flour"

It looks as if the recipe calls for some flour, but we don’t know how much:

>>>
>>> data.decode("utf-8")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbc in position 0: invalid start byte

Uh oh. There’s that pesky UnicodeDecodeError that can bite you when you make assumptions about encoding. You check with the API host. Lo and behold, the data is actually sent over encoded in Latin-1:

>>>
>>> data.decode("latin-1")
'¼ cup of flour'

There we go. In Latin-1, every character fits into a single byte, whereas the “¼” character takes up two bytes in UTF-8 ("\xc2\xbc").

The lesson here is that it can be dangerous to assume the encoding of any data that is handed off to you. It’s usually UTF-8 these days, but it’s the small percentage of cases where it’s not that will blow things up.

If you really do need to abandon ship and guess an encoding, then have a look at the chardet library, which uses methodology from Mozilla to make an educated guess about ambiguously encoded text. That said, a tool like chardet should be your last resort, not your first.

Odds and Ends: unicodedata

We would be remiss not to mention unicodedata from the Python Standard Library, which lets you interact with and do lookups on the Unicode Character Database (UCD):

>>>
>>> import unicodedata

>>> unicodedata.name("€")
'EURO SIGN'
>>> unicodedata.lookup("EURO SIGN")
'€'

Wrapping Up

In this article, you’ve decoded the wide and imposing subject of character encoding in Python.

You’ve covered a lot of ground here:

Now, go forth and encode!

Resources

For even more detail about the topics covered here, check out these resources:

The Python docs have two pages on the subject:


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

May 20, 2019 02:00 PM UTC


Catalin George Festila

Python 3.7.3 : The google-cloud-vision python module - part 002.

I used Windows 8.1 and python 3.7.3 version. The first step is to install the python module. C:\Python373\Scripts>pip install --upgrade google-cloud-visionYou can see another tutorial about this python module here. Let's test the python module. C:\Python373>python Python 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 22:22:05) [MSC v.1916 64 bit (AMD6 4)] on win32 Type "help", "copyright", "credits" or

May 20, 2019 01:20 PM UTC


Abhijeet Pal

Adding Pagination With Django

While working on a modern web application quite often you will need to paginate the app be it for better user experience or performance. Fortunately, Django comes with built-in pagination classes for managing paginating data of your application.

In this article, we will go through the pagination process with class-based views and function based views in Django.

Prerequisite

For the sake of this tutorial I am using a blog application  – Github repo

The above project is made on Python 3.7, Django 2.1 and Bootstrap 4.3. This is a very basic blog application displaying a list of posts on the homepage but when the number of posts increases we need to split them up.

Recommended Article:  Building A Blog Application With Django

Adding Pagination Using Class-Based-Views [ ListView ]

The Django ListView class comes with built-in support for pagination so all we need to do is take advantage of it. Pagination is controlled by the GET parameter that controls which page to show.

First, open the views.py file of your app.

from django.views import generic
from .models import Post


class PostList(generic.ListView):
    queryset = Post.objects.filter(status=1).order_by('-created_on')
    template_name = 'index.html'

class PostDetail(generic.DetailView):
    model = Post
    template_name = 'post_detail.html'

Now in the PostList view we will introduce a new attribute paginate_by which takes an integer specifying how many objects should be displayed per page. If this is given, the view will paginate objects with paginate_by objects per page. The view will expect either a page query string parameter (via request.GET) or a page variable specified in the URLconf.

class PostList(generic.ListView):
    queryset = Post.objects.filter(status=1).order_by('-created_on')
    template_name = 'index.html'
    paginate_by = 3

Now our posts are paginated by 3 posts a page.

Next, to see the pagination in action, we need to edit the template which for this application is the index.html file paste the below snippet.

{% if is_paginated %}
  <nav aria-label="Page navigation conatiner"></nav>
  <ul class="pagination justify-content-center">
    {% if page_obj.has_previous %}
    <li><a href="?page={{ page_obj.previous_page_number }}" class="page-link">&laquo; PREV </a></li>
    {% endif %}
    {% if page_obj.has_next %}
    <li><a href="?page={{ page_obj.next_page_number }}" class="page-link"> NEXT &raquo;</a></li>

    {% endif %}
  </ul>
  </nav>
</div>
{% endif %}

Note that we are using Bootstrap 4.3 for this project, if you are using any other frontend framework you may change the classes.

Now run the server and visit http://127.0.0.1:8000/ you should see the page navigation buttons below the posts.

Adding pagination with class based views in django

 

Adding Pagination Using Function-Based-Views

Equivalent function based view for the above PosList class would be.

def PostList(request):
    object_list = Post.objects.filter(status=1).order_by('-created_on')
    paginator = Paginator(object_list, 3)  # 3 posts in each page
    page = request.GET.get('page')
    try:
        post_list = paginator.page(page)
    except PageNotAnInteger:
            # If page is not an integer deliver the first page
        post_list = paginator.page(1)
    except EmptyPage:
        # If page is out of range deliver last page of results
        post_list = paginator.page(paginator.num_pages)
    return render(request,
                  'index.html',
                  {'page': page,
                   'post_list': post_list})

So in the view, we instantiate the Paginator class with the number of objects to be displayed on each page i.e 3. Then we have the request.GET.get('page') parameter which returns the current page number. The page() method is used to obtain the objects from the desired page number. Below that we have two exception statements for PageNotAnInteger and EmptyPage both are subclasses of InvalidPage finally at the end we are rendering out the HTML file.

Now in your templates paste the below snippet.

{% if post_list.has_other_pages %}
  <nav aria-label="Page navigation conatiner"></nav>
  <ul class="pagination justify-content-center">
    {% if post_list.has_previous %}
    <li><a href="?page={{ post_list.previous_page_number }}" class="page-link">&laquo; PREV </a></li>
    {% endif %}
    {% if post_list.has_next %}
    <li><a href="?page={{ post_list.next_page_number }}" class="page-link"> NEXT &raquo;</a></li>
  </ul>
  </nav>
  {% endif %}

Save the files and run the server you should see the NEXT button below the post list.

Adding Pagination Using Function-Based-Views

The post Adding Pagination With Django appeared first on Django Central.

May 20, 2019 11:39 AM UTC


EuroPython Society

EuroPython 2019: Conference and training ticket sale opens today

europython:

We will be starting the EuroPython 2019 conference and training ticket sales

today (Monday) at 12:00 CEST.

image

https://ep2019.europython.eu/registration/buy-tickets/


Only 300 training tickets available

After the rush to the early-bird tickets last week (we sold more than 290 tickets in 10 minutes), we expect a rush to the regular and training tickets this week as well.

We only have 300 training tickets available, so if you want to attend the training days, please consider getting your ticket soon.

Available ticket types

We will have the following ticket types available:

  • regular conference tickets - admission to the conference days (July 10-12) and sprints (July 13-14)
  • training tickets - admission to the training days (July 8-9)
  • combined tickets - admission to training, conference and sprint days (July 8-14)

Please see our registration page for full details on the available tickets.

As reminder, here’s the conference layout:

  • Monday & Tuesday, July 8 & 9: Trainings, Beginners’ Day and other workshops
  • Wednesday–Friday, July 10–12: Conference talks, keynotes & exhibition
  • Saturday & Sunday, July 13 & 14: Sprints

Combined Tickets

These are a new ticket type we are introducing for EuroPython 2019, to simplify purchase and check-in at the conference for attendees who want to attend the complete EuroPython 2019 week with a single ticket.

To make the ticket more attractive, we are granting a small discount compared to purchasing training and conference tickets separately.


Enjoy,

EuroPython 2019 Team
https://ep2019.europython.eu/
https://www.europython-society.org/

May 20, 2019 07:47 AM UTC


EuroPython

EuroPython 2019: Conference and training ticket sale opens today

We will be starting the EuroPython 2019 conference and training ticket sales

today (Monday) at 12:00 CEST.

image

https://ep2019.europython.eu/registration/buy-tickets/


Only 300 training tickets available

After the rush to the early-bird tickets last week (we sold more than 290 tickets in 10 minutes), we expect a rush to the regular and training tickets this week as well.

We only have 300 training tickets available, so if you want to attend the training days, please consider getting your ticket soon.

Available ticket types

We will have the following ticket types available:

Please see our registration page for full details on the available tickets.

As reminder, here’s the conference layout:

Combined Tickets

These are a new ticket type we are introducing for EuroPython 2019, to simplify purchase and check-in at the conference for attendees who want to attend the complete EuroPython 2019 week with a single ticket.

To make the ticket more attractive, we are granting a small discount compared to purchasing training and conference tickets separately.


Enjoy,

EuroPython 2019 Team
https://ep2019.europython.eu/
https://www.europython-society.org/

May 20, 2019 07:14 AM UTC


Mike Driscoll

PyDev of the Week: Adrienne Tacke

This week we welcome Adrienne Tacke (@AdrienneTacke) as our PyDev of the Week! Adrienne is the author of Coding for Kids: Python: Learn to Code with 50 Awesome Games and Activities and her book came out earlier this year. You can see what Adrienne is up to on Instagram or via her website. Let’s take some time to get to know her better!

Can you tell us a little about yourself (hobbies, education, etc):

I’m a software engineer in Las Vegas and have a degree in Management Information Systems from UNLV. I’ve worked in the education and healthcare industries and now focus on building awesome things in the fintech space. I love learning new languages (spoken and programming), eating every dessert imaginable, traveling the world with my husband, and finding ways to encourage more young girls and women to try out a career as a software engineer.

Why did you start using Python?

I truly began using the language when I started writing my book Coding for Kids: Python. It was a simple language that allowed me to focus on the programming concepts versus the syntax which is important for anyone new to coding.

What other programming languages do you know and which is your favorite?

My main languages are C# and JavaScript. I use these everyday for work. My first language was vb.net *blushes*. I don’t really have a favorite language, but I do strongly believe in some principles. These include separation of concerns, reusability, human-readable > machine-readable code, and clean code and design.

What projects are you working on now?

I have several projects that involve me teaching something related to software engineering. I can’t wait to share more in the coming months! As far as code, I just (FINALLY) updated my website which has been in dire need of a refresh. And the project I’m most excited about: I am starting the migration for our company’s client-side architecture to React!

Which Python libraries are your favorite (core or 3rd party)?

I’m don’t have any favorites yet as I’ve barely scratched the surface of what’s available. That said, I have tried TensorFlow and Luminoth and absolutely love their potential. I hope to work with them more in a future weekend project!

How did your book come about?

I was approached by a publisher with this opportunity through my Instagram account! I share peeks into my career, serve as an example of a feminine software engineer, and more importantly, share educational posts and mini-tutorials. One of my series is called #DontBeAfraidOfTheTerminal where I explain common and widely used commands in an approachable way. I think they liked they way I taught technical topics, so they asked me to write this book!

What sorts of things did you learn while writing your book?

It is very difficult to reduce topics down into a manageable and kid-friendly way! Writing this book for this type of audience (kids or someone with absolutely no programming experience) was a welcome challenge as it required me to rephrase and refine how I explain things work in programming.

Do you have advice for other aspiring authors?

If you have to re-read something you wrote, start over or find a way to make the sentence flow better. More importantly, just start! The sooner you get your ideas into a draft, the faster you can refine and polish it to become a valuable piece of work. 🙂

Is there anything else you’d like to say?

I just want to remind everyone that there is no developer uniform. Knowledge is power and if you know your stuff, it shouldn’t matter what you look like!

Thanks for doing the interview, Adrienne!

The post PyDev of the Week: Adrienne Tacke appeared first on The Mouse Vs. The Python.

May 20, 2019 05:05 AM UTC


Podcast.__init__

Hardware Hacking Made Easy With CircuitPython

Learning to program can be a frustrating process, because even the simplest code relies on a complex stack of other moving pieces to function. When working with a microcontroller you are in full control of everything so there are fewer concepts that need to be understood in order to build a functioning project. CircuitPython is a platform for beginner developers that provides easy to use abstractions for working with hardware devices. In this episode Scott Shawcroft explains how the project got started, how it relates to MicroPython, some of the cool ways that it is being used, and how you can get started with it today. If you are interested in playing with low cost devices without having to learn and use C then give this a listen and start tinkering!

Summary

Learning to program can be a frustrating process, because even the simplest code relies on a complex stack of other moving pieces to function. When working with a microcontroller you are in full control of everything so there are fewer concepts that need to be understood in order to build a functioning project. CircuitPython is a platform for beginner developers that provides easy to use abstractions for working with hardware devices. In this episode Scott Shawcroft explains how the project got started, how it relates to MicroPython, some of the cool ways that it is being used, and how you can get started with it today. If you are interested in playing with low cost devices without having to learn and use C then give this a listen and start tinkering!

Announcements

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register.
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email hosts@podcastinit.com)
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
  • Your host as usual is Tobias Macey and today I’m interviewing Scott Shawcroft about CircuitPython, the easiest way to program microcontrollers

Interview

  • Introductions
  • How did you get introduced to Python?
  • Can you start by explaining what CircuitPython is and how the project got started?
    • I understand that you work at Adafruit and I know that a number of their products support CircuitPython. What other runtimes do you support?
  • Microcontrollers have typically been the domain of C because of the resource and performance constraints. What are the benefits of using Python to program hardware devices?
  • With the wide availability of powerful computing platforms, what are the benefits of experimenting with microcontrollers and their peripherals?
  • I understand that CircuitPython is a friendly fork of MicroPython. What have you changed in your version?
    • How do you structure your development to avoid conflicts with the upstream project?
    • What are some changes that you have contributed back to MicroPython?
  • What are some of the features of CircuitPython that make it easier for users to interact with sensors, motors, etc.?
  • CircuitPython provides an easy on-ramp for experimenting with hardware projects. Is there a point where a user will outgrow it and need to move to a different language or framework?
  • What are some of the most interesting/innovative/unexpected projects that you have seen people build using CircuitPython?
    • Are there any cases of someone building and shipping a production grade project in CircuitPython?
  • What have been some of the most interesting/challenging/unexpected aspects of building and maintaining CircuitPython?
  • What is in store for the future of the project?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

May 20, 2019 02:29 AM UTC


Wingware Blog

Selecting Logical Units of Python Code in Wing

In this issue of Wing Tips we take a look at quickly selecting Python code in logical units, which can make some editing tasks easier.

Select More and Select Less

The easiest way to select code from the keyboard, starting from the current selection or caret position, is to repeatedly press Ctrl-Up (the Control key together with the up arrow key). Wing selects more and more code, working outward in logical units, as follows:

/images/blog/quick-select/quick-select-1.gif

Press Ctrl-Up repeatedly to select increasingly larger units of Python code

If you select too much, pressing Ctrl-Down instead reduces the selection size again:

/images/blog/quick-select/quick-select-2.gif

Press Ctrl-Down repeatedly to return to selecting smaller units of Python code

Select Statement, Block or Scope

Wing also provides commands for selecting the current, previous, or next Statement (a single logical line of code that may span multiple physical lines), Block (a contiguous section of code at same indent level, without any blank lines), or Scope (an entire function, method, or class).

Here's an example using several of these commands:

/images/blog/quick-select/quick-select-3.gif

Execute "Select Statement", then "Select Block", "Select Scope", and finally "Select Next Scope"

Adding Key Bindings

If you plan to use these commands, you will probably want to bind them to keys using the User Interface > Keyboard > Custom Key Bindings preference for the following commands:

select-statement
next-statement
previous-statement
select-block
next-block
previous-block
select-scope
next-scope
previous-scope

Since free key combinations are often in short supply, you may want to make use of a multi-key sequence in your bindings. For example, pressing Ctrl-\ followed by B results in the binding Ctrl-\ B:

/images/blog/quick-select/quick-select-4.gif

Add a key binding "Ctrl-\ B" for "select-block"

This only works if Ctrl-\ is not itself already a binding. Any other free key combination (and not only Ctrl-\) can be used as the starting key in the sequence.



That's it for now! We'll be back next week with more Wing Tips for Wing Python IDE.

May 20, 2019 01:00 AM UTC

May 19, 2019


Codementor

Python's Counter - Part 1

This article provides an introduction to Python's in-built 'Counter' tool, and is part of a series of articles on some often ignored, or undiscovered, in-built python libraries.

May 19, 2019 01:41 PM UTC


Weekly Python StackOverflow Report

(clxxviii) stackoverflow python report

These are the ten most rated questions at Stack Overflow last week.
Between brackets: [question score / answers count]
Build date: 2019-05-19 10:39:17 GMT


  1. Slice a list based on an index and items behind it in Python - [20/16]
  2. pylint protection again self-assignment - [14/2]
  3. Filter a data-frame and add a new column according to the given condition - [11/5]
  4. Understanding Python syntax in lists vs series - [9/3]
  5. How to subset row of condition with some of N rows before the condition meet , more faster than my code? - [8/1]
  6. How to generate a time-ordered uid in Python? - [7/2]
  7. Creating an order-preserving multi-value dict for Django - [7/1]
  8. Update elements of dataframe by applying function involving same row elements - [6/6]
  9. Construct new column with first row of a groupby with two columns - Pandas - [6/4]
  10. Do JavaScript classes have a method equivalent to Python classes' __call__? - [6/3]

May 19, 2019 10:39 AM UTC

May 18, 2019


Python Software Foundation

The 2019 Python Language Summit

The Python Language Summit is a small gathering of Python language implementers, both the core developers of CPython and alternative Pythons, held on the first day of PyCon. The summit features short presentations from Python developers and community members, followed by longer discussions. The 2019 summit is the first held since Guido van Rossum stepped down as Benevolent Dictator for Life, replaced by a five-member Steering Council.

LWN.net covered the proceedings from 2015 to 2018; this year the PSF has chosen to feature summit coverage on its own blog, written by A. Jesse Jiryu Davis.

Individual writeups will be linked from this page as they are posted.

Sessions:

Lightning Talks, Round 1 (pre-selected lightning talks)


Writing Stdlib C Extension Modules in Python? Jukka Lehtosalo and Michael Sullivan

Async REPL and Async-Exec, Mattias Bussonnier

The Night’s Watch Is Fixing the CIs in the Darkness for You, Pablo Galindo Salgado

Asyncio and the Case for Re-Entrancy, Jason Fried

Optimising CPython, or Not, Mark Shannon

The Night’s Watch Is Fixing the CIs in the Darkness for You, Pablo Galindo Salgado

Black Under github.com/python, Łukasz Langa

Half-Hour Sessions


Python on Other Platforms, Russell Keith-Magee

Time Zones in the Standard Library, Paul Ganssle

Batteries Included, but They’re Leaking, Amber Brown

History of CircuitPython, Scott Shawcroft

Dynamic Extension Module Objects, Petr Viktorin

PEP 581 / PEP 588, Mariatta Wijaya

Feedback on Mentoring Python Contributors, Victor Stinner

Lightning Talks, Round 2 (on-site signup)


SSL Module Updates, Christian Heimes

Let’s Argue About Clinic, Larry Hastings

The C-API, Eric Snow

Python in the Windows Store, Steve Dower

Bors: How the Rust Team Avoids Pablo’s Problems, Nathaniel Smith

Mypyc for Stdlib: Extended Discussion, Michael Sullivan

Python Core Sprints at Bloomberg in September, Pablo Galindo

Status of the Stable ABI, Victor Stinner

Cognitive Encapsulation: Anchor of Working Together, Yarko Tymciurak

Thanks to Bloomberg Engineering for sponsoring the summit.




May 18, 2019 11:40 PM UTC

Scott Shawcroft: History of CircuitPython



Scott Shawcroft is a freelance software engineer working full time for Adafruit, an open source hardware company that manufactures electronics that are easy to assemble and program. Shawcroft leads development of CircuitPython, a Python interpreter for small devices.

The presentation began with a demo of Adafruit’s Circuit Playground Express, a two-inch-wide circular board with a microcontroller, ten RGB lights, a USB port, and other components. Shawcroft connected the board to his laptop with a USB cable and it appeared as a regular USB drive with a source file called code.py. He edited the source file on his laptop to dim the brightness of the board’s lights. When he saved the file, the board automatically reloaded the code and the lights dimmed. “So that's super quick,” said Shawcroft. “I just did the demo in three minutes.”

Read more 2019 Python Language Summit coverage.

CircuitPython Is Optimized For Learning Electronics

The history of CircuitPython begins with MicroPython, a Python interpreter written from scratch for embedded systems by Damien George starting in 2013. Three years later, Adafruit hired Shawcroft to port MicroPython to the SAMD21 chip they use on many of their boards. Shawcroft’s top priority was serial and USB support for Adafruit’s boards, and then to implement communication with a variety of sensors. “The more hardware you can support externally,” he said, “the more projects people can build.”

As Shawcroft worked with MicroPython’s hardware APIs, he found them ill-fitting for Adafruit’s goals. MicroPython customizes its hardware APIs for each chip family to provide speed and flexibility for hardware experts. Adafruit’s audience, however, is first-time coders. Shawcroft said, “Our goal is to focus on the first five minutes someone has ever coded.”

To build a Python for Adafruit’s needs, Shawcroft forked MicroPython and created a new project, CircuitPython. In his Language Summit talk, he emphasized it is a “friendly fork”: both projects are MIT-licensed and share improvements in both directions. In contrast to MicroPython’s hardware APIs that vary by chip, CircuitPython has one hardware API, allowing Adafruit to write one set of libraries for them all.

MicroPython has a distinct standard library that differs from CPython’s: for example, its time functions are in a module named utime with a different feature set from the standard time module. It also ships modules with features not found in CPython’s standard library, such as advanced filesystem management features. In CircuitPython, Shawcroft removed the nonstandard features and modules. This change helps new coders ramp smoothly from CircuitPython on a microcontroller to CPython on a full-size computer, and it makes Adafruit’s libraries reusable on CPython itself.

Another motive for forking was to create a separate community for CircuitPython. In the original MicroPython project’s community, Shawcroft said, “There are great folks, and there's some not-so-great folks.” The CircuitPython community welcomes beginners, publishes documentation suitable for them, and maintains standards of conduct that are safe for minors.

Audience members were curious about CircuitPython’s support for Python 3.8 and beyond. When Damien George began MicroPython he targeted Python 3.4 compliance, which CircuitPython inherits. Shawcroft said that MicroPython has added some newer Python features, and decisions about more language features rest with Damien George.

Minimal Barrier To Entry



Photo courtesy of Adafruit.

Shawcroft aims to remove all roadblocks for beginners to be productive with CircuitPython. As he demonstrated, CircuitPython auto-reloads and runs code when the user saves it; there are two more user experience improvements in the latest release. First, serial output is shown on a connected display, so a program like print("hello world") will have visible output even before the coder learns how to control LEDs or other observable effects.

Second, error messages are now translated into nine languages, and Shawcroft encourages anyone with language skills to contribute more. Guido van Rossum and A. Jesse Jiryu Davis were excited to see these translations and suggested contributing them to CPython. Shawcroft noted that the existing translations are MIT-licensed and can be ported; however, the translations do not cover all the messages yet, and CircuitPython cannot show messages in non-Latin characters such as Chinese. Chinese fonts are several megabytes of characters, so the size alone presents an unsolved problem.

Later this year, Shawcroft will add Bluetooth support for coders to connect their phone or tablet to an Adafruit board and enjoy the same quick edit-refresh cycle there. Touchscreens will require a different sort of code editor, perhaps more like EduBlocks. Despite the challenges, Shawcroft echoed Russell Keith-Magee’s insistence on the value of mobile platforms: “My nieces, they have tablets and phones. They do not have laptops.”

Shawcroft’s sole request for the core developers was to keep new language features simple, with few special cases. First, because each new CPython feature must be reimplemented in MicroPython and CircuitPython, and special cases make this work thorny. Second, because complex logic translates into large code size, and the space for code on microcontrollers is minuscule.


May 18, 2019 10:58 PM UTC

Amber Brown: Batteries Included, But They're Leaking



Amber Brown of the Twisted project shared her criticisms of the Python standard library. This proved to be the day’s most controversial talk; Guido van Rossum stormed from the room during Q & A.

Read more 2019 Python Language Summit coverage.

Applications Need More Than The Standard Library

Python claims to ship with batteries included, but according to Brown, without external packages it is only “marginally useful.” For example, asyncio requires external libraries to connect to a database or to speak HTTP. Brown asserted that there were many such dependencies from the standard library to PyPI: typing works best with mypy, the ssl module requires a monkeypatch to connect to non-ASCII domain names, datetime needs pytz, and six is non-optional for writing code for Python 2 and 3.

Other standard library modules are simply inferior to alternatives on PyPI. The http.client documentation advises readers to use Requests, and the datetime module is confusing compared to its competitors such as arrow, dateutil, and moment.

Poor Quality, Lagging Features, And Obsolete Code


“Python's batteries are leaking,” said Brown. She thinks that some bugs in the standard library will never be fixed. And even when bugs are fixed, PyPI libraries like Twisted cannot assume they run on the latest Python, so they must preserve their bug workarounds forever.

There are many modules that few applications use, but there is no method to install a subset of the standard library. Brown called out the XML parser and tkinter in particular for making the standard library larger and harder to build, burdening all programmers for the sake of a few. As Russell Keith-Magee had described earlier in the day, the size of the standard library makes it difficult for PyBee to run Python on constrained devices. Brown also noted that some standard library modules were optimized in C for Python 3, but had to be reimplemented in pure Python for PyPy to support them.

Brown identified new standard library features that were “too little, too late,” leaving users to depend on backports to use those features in Python 2. For example, socket.sendmsg was added only recently, meaning Twisted must ship its own C extension to use sendmsg in Python 2. Although Python 2 is nearly at its end of life, this only holds for the core developers, according to Brown, and for users, Red Hat and other distributors will keep Python 2 alive “until the goddam end of time.” Brown also mentioned that some itertools code is shown as examples in the documentation instead of shipped as functions in the itertools module.

Guido van Rossum, sitting at the back of the room, interrupted at this moment, “Can you keep to one topic? I'm sorry but this is just one long winding rant. What is your point?” Brown responded that her point was that there are a multitude of problems in the standard library.

Standard Library Modules Crowd Out Innovation


Brown’s most controversial opinion, in her own estimation, is that adding modules to the standard library stifles innovation, by discouraging programmers from using or contributing to competing PyPI packages. Ever since asyncio was announced she has had to explain why Twisted is still worthwhile, and now that data classes are in the standard library Hynek Schlawack must defend his attrs package. Even as standard library modules crowd out other projects, they lag behind them. According to Brown, “the standard library is where code sometimes goes to die,” because it is difficult and slow to contribute code there. She acknowledged recent improvements, from Mariatta Wijaya’s efforts in particular, but Python is still harder to contribute to than PyPI packages.

“So I know a lot of this is essentially a rant,” she concluded, “but it's fully intended to be.”

Discussion


Nick Coghlan interpreted Brown’s proposal as generalizing the “ensurepip” model to ensure some packages are always available but can be upgraded separately from the standard library, and he thought this was reasonable.

Van Rossum was less convinced. He asked again, “Amber, what is your point?” Brown said her point was to move asyncio to PyPI, along with most new feature development. “We should embrace PyPI,” she exhorted. Some ecosystems such as Javascript rely too much on packages, she conceded, but there are others like Rust that have small standard libraries and high-quality package repositories. She thinks that Python should move farther in that direction.

Van Rossum argued instead that if the Twisted team wants the ecosystem to evolve, they should stop supporting older Python versions and force users to upgrade. Brown acknowledged this point, but said half of Twisted users are still on Python 2 and it is difficult to abandon them. The debate at this point became personal for Van Rossum, and he left angrily.

Nathaniel Smith commented, “I'm noticing some tension here.” He guessed that Brown and the core team were talking past each other because the core team had different concerns from other Python programmers. Brown went further adding that because few Python core developers are also major library maintainers, library authors’ complaints are devalued or ignored.

The remaining core developers continued the technical discussion. Barry Warsaw said that the core team had discussed deprecating modules in the standard library, or creating slim distributions with a subset of it, but that it required a careful design. Others objected that slimming down the standard library risked breaking downstream code, or making work for programmers in enterprises that trust the standard library but not PyPI.

Pablo Galindo Salgado was concerned that moving modules from the standard library to PyPI would create an explosion of configurations to test, but in Brown’s opinion, “We are already living that life.” Some Linux and Python distributions have selectively backported features and fixes, leading to a much more complex set of configurations than the core team realizes.

May 18, 2019 09:45 PM UTC


Evennia

Creating Evscaperoom, part 1

Over the last month (April-May 2019) I have taken part in the Mud Coder's Guild Game Jam "Enter the (Multi-User) Dungeon". This year the theme for the jam was One Room.

The result was Evscaperoom, an text-based multi-player "escape-room" written in Python using the Evennia MU* creation system. You can play it from that link in your browser or MU*-client of choice. If you are so inclined, you can also vote for it here in the jam (don't forget to check out the other entries while you're at it).

This little series of (likely two) dev-blog entries will try to recount the planning and technical aspects of the Evscaperoom. This is also for myself - I'd better write stuff down now while it's still fresh in my mind!

Inception 

When I first heard about the upcoming game-jam's theme of One Room, an 'escape room' was the first thing that came to mind, not the least because I just recently got to solve my my own first real-world escape-room as a gift on my birthday. 

If you are not familiar with escape-rooms, the premise is simple - you are locked into a room and have to figure out a way to get out of it by solving practical puzzles and finding hidden clues in the room. 

While you could create such a thing in your own bedroom (and there are also some one-use board game variants), most escape-rooms are managed by companies selling this as an experience for small groups. You usually have one hour to escape and if you get stuck you can press a button (or similar) to get a hint.

I thought making a computer escape-room. Not only can you do things in the computer that you cannot do in the real world, restricting the game to a single room limits so that it's conceivable to actually finish the damned thing in a month. 

A concern I had was that everyone else in the jam surely must have went for the same obvious idea. In the end that was not an issue at all though.


Basic premises
 
I was pretty confident that I would technically be able to create the game in time (not only is Python and Evennia perfect for this kind of fast experimentation and prototyping, I know the engine very well). But that's not enough; I had to first decide on how the thing should actually play. Here are the questions I had to consider:

Room State 

 An escape room can be seen as going through multiple states as puzzles are solved. For example, you may open a cabinet and that may open up new puzzles to solve. This is fine in a single-player game, but how to handle it in a multi-player environment?

My first thought was that each object may have multiple states and that players could co-exist in the same room, seeing different states at the same time. I really started planning for this. It would certainly be possible to implement.

But in the end I considered how a real-world escape-room works - people in the same room solves it together. For there to be any meaning with multi-player, they must share the room state.

So what I went with was a solution where players can create their own room or join an existing one. Each such room is generated on the fly (and filled with objects etc) and will change as players solve it. Once complete and/or everyone leaves, the room is deleted along with all objects in it. Clean and tidy.

So how to describe these states? I pictured that these would be described as normal Python modules with a start- and end function that initialized each state and cleaned it up when a new state was started. In the beginning I pictured these states as being pretty small (like one state to change one thing in the room). In the end though, the entire Evscaperoom fits in 12 state modules. I'll describe them in more detail in the second part of this post. 

Accessibility and "pixel-hunting" in text

When I first started writing descriptions I didn't always note which objects where interactive. It's a very simple and tempting puzzle to add - mention an object as part of a larger description and let the player figure out that it's something they can interact with. This practice is sort-of equivalent to pixel-hunting in graphical games - sweeping with the mouse across the screen until you find that little spot on the screen that you can do something with.

Problem is, pixel-hunting's not really fun. You easily get stuck and when you eventually find out what was blocking you, you don't really feel clever but only frustrated. So I decided that I should clearly mark every object that people could interact with and focus puzzles on better things.


In fact, in the end I made it an option:

Option menu ('quit' to return)   1: ( ) No item markings (hard mode)  2: ( ) Items marked as item (with color)  3: (*) Items are marked as [item] (screenreader friendly)  4: ( ) Screenreader mode

As part of this I had to remind myself never to use colors only when marking important information: Visually impaired people with screen readers will simply miss that. Not to mention that some just disable colors in their clients.

So while I personally think option 2 above is the most visually pleasing, Evscaperoom defaults to the third option. It should should start everyone off on equal footing. Evennia has a screen-reader mode out of the box, but I moved it into the menu here for easy access.

Inventory and collaboration

In a puzzle-game, you often find objects and combine them with other things. Again, this is simple to do in a single-player game: Players just pick things up and use them later.

But in a multi-player game this offers a huge risk: players that pick up something important and then log off. The remaining players in that room would then be stuck in an unsolvable room - and it would be very hard for them to know this.

In principle you could try to 'clean' player inventories when they leave, but not only does it add complexity, there is another issue with players picking things up: It means that the person first to find/pick up the item is the only one that can use it and look at it. Others won't have access until the first player gives it up. Trusting that to anonymous players online is not a good idea.

So in the end I arrived at the following conclusions:
As a side-effect of this I also set a limit to the kind of puzzles I would allow:

Focusing on objects

So without inventory system, how do you interact with objects? A trademark of any puzzle is using one object with another and also to explore things closer to find clues. I turned to graphical adventure games for inspiration:

Hovering with mouse over lens object offers action
Secret of Monkey Island ©1990 LucasArts. Image from old-games.com

A common way to operate on an object in traditional adventure games is to hover the mouse over it and then select the action you want to apply to it. In later (3D) games you might even zoom in of the object and rotate it around with your mouse to see if there are some clues to be had.

While Evennia and modern UI clients may allow you to use the mouse to select objects, I wanted this to work the traditional MUD-way, by inserting commands. So I decided that you as a player would be in one of two states:
A small stone fireplace sits in the middle of the wall opposite the [door]. On the chimney hangs a small oil [painting] of a man. Hanging over the hearth is a black [cauldron]. The piles of [ashes] below are cold.  (It looks like fireplace may be suitable to [climb].)

In the example above, the fireplace points out other objects you could also focus on, whereas the last parenthesis includes one or more "actions" that you can perform on the fireplace only when you have it focused. 

This ends up pretty different from most traditional MUD-style inputs. When I first released this to the public, I found people logged off after their first examine. It turned out that they couldn't figure out how to leave the focus mode. So they just assumed the thing was buggy and quit instead. Of course it's mentioned if you care to write help, but this is clearly one step too many for such an important UI concept. 

So I ended up adding the header above that always reminds you. And since then I've not seen any confusion over how the focus mode works.

For making it easy to focus on things, I also decided that each room would only ever have one object named a particular thing. So there is for example only one single object in the game named "key" that you can focus on. 

Communication

I wanted players to co-exist in the same room so that they could collaborate on solving it. This meant communication must be possible. I pictured people would want to point things out and talk to each other.

In my first round of revisions I had a truckload of individual emotes; you could

      point at target

 for example. In the end I just limited it to  

     say/shout/whisper <message>

and 

     emote <whatever>

And seeing what people actually use, this is more than enough (say alone is probably 99% of what people need, really). I had a notion that the shout/whisper could be used in a puzzle later but in the end I decided that communication commands should be strictly between players and not have anything to do with the puzzles.

I removed all other interaction: There is no fighting and without an inventory or requirement to collaborate on puzzles, there is no need for other interactions than to communicate.

First version you didn't even see what the others did, but eventually I added so that you at least saw what other players were focusing on at the moment (and of course if some major thing was solved/found).

In the end I don't even list characters as objects in the room (you have to use the who command to see who's in there with you).

Listing of commands available in the Evscaperoom (output of the help-command in game)
The main help command output.
Story

It's very common for this type of game to have a dangerous or scary theme. Things like "get out before the bomb explodes", "save the space ship before the engines overheat", "flee the axe murderer before he comes back" etc). I'm no stranger to dark themes, but for this I wanted something friendlier and brighter, maybe with a some dark undercurrents here and there.

My Jester character is someone I've not only depicted in art, but she's also an old RP character and literary protagonist of mine. Who else would find it funny to lock someone into a room only to provide crazy puzzles and hints for them to get out again? So my flimsy 'premise' was this: 


The village Jester wants to win the pie eating contest. You are one of her most dangerous opponents. She tricked you to her cabin and now you are locked in! If you don't get out in time, she'll get to eat all those pies on her own and surely win!

That's it - this became the premise from which the entire game flowed. I quickly decided that it to be a very "small-scale" story: no life-or-death situation, no saving of the world. The drama takes place in a small village with an "adversary" that doesn't really want to hurt you, but only to eat more pies than you.

From this, the way to offer hints came naturally - just eat a slice of "hintberry pie" the jester made (she even encourage you to eat it). It gives you a hint but is also very filling. So if you eat too much, how will you beat her in the contest later, even if you do get out?

To further the rustic and friendly tone I made sure the story took place on a warm summer day. Many descriptions describe sunshine, chirping birds and the smell of pie. I aimed at letting the text point out quirky and slightly comedic tone of the puzzles the Jester left behind. The player also sometimes gets teased by the game when doing things that does not make sense.

I won't go into the story further here - it's best if you experience it yourself. Let's just say that the village has some old secrets. And and the Jester has her own ways of doing things and of telling a story. The game has multiple endings and so far people have drawn very different conclusions in the end.

Scoring

Most often in escape rooms, final score is determined by the time and the number of hints used. I do keep the latter - for every pie you eat, you get a penalty on your final score.

As for time - this background story would fit very well with a time limit (get out in X time, after which the pie-eating contest will start!). But from experience with other online text-based games I decided against this. Not only should a player be able to take a break, they may also want to wait for a friend to leave and come back etc. 

But more importantly, I want players to explore and read all my carefully crafted descriptions! So I'd much rather prefer they take their time and reward them for being thorough. 

So in the end I give specific scores for actions throughout the game instead. Most points are for doing things that drive the story forward, such as using something or solving a puzzle. But a significant portion of the score comes from turning every stone and trying everything out. The nice side-effect of this is that even if you know exactly how to solve everything and rush through the game you will still not end up with a perfect score. 

The final score, adjusted by hints is then used to determine if you make it in time to the contest and how you fare. This means that if you explore carefully you have a "buffer" of points so eating a few pies may still land you a good result in the end.
 

First sketch

I really entered the game 'building' aspect with no real notion of how the Jester's cabin should look nor which puzzles should be in it. I tried to write things down beforehand but it didn't really work for me. 

So in the end I decided "let's just put a lot of interesting stuff in the room and then I'll figure out how they interact with each other". I'm sure this is different from game-maker to game-maker. But for me, this process worked perfectly. 


Scribbles on my notebook, sketching up the room's main items
My first, very rough, sketch of the Jester's cabin

The above, first sketch ended up being what I used, although many of the objects mentioned never ended up in the final game and some things switched places. I did some other sketches too, but they'd be spoilers so I won't show them here ...


The actual game logic

The Evscaperoom principles outlined above deviate quite a bit from the traditional MU* style of game play. 

While Evennia provides everything for database management, in-game objects, commands, networking and other resources, the specifics of your game is something you need to make yourself - and you have the full power of Python to do it!

So for the first three days of the jam I used Evennia to build the custom game logic needed to provide the evscaperoom style of game play. I also made the tools I needed to quickly create the game content (which then took me the rest of the jam to make). 

In part 2 of this blog post I will cover the technical details of the Evscaperoom I built. I'll also go through some issues I ran into and conclusions I drew. I'll link to that from here when it's available!

May 18, 2019 08:52 PM UTC


Abhijeet Pal

Python Program To Print Numbers From 1 to 10 Using While Loop

Problem Definition

Create a Python program to print numbers from 1 to 10 using a while loop.

Solution

In programming, Loops are used to repeat a block of code until a specific condition is met. The While loop loops through a block of code as long as a specified condition is true.

To Learn more about working of While Loops read:

Program

i = 1
while(i<=10):
    print(i)
    i += 1

Output

1
2
3
4
5
6
7
8
9
10

The post Python Program To Print Numbers From 1 to 10 Using While Loop appeared first on Django Central.

May 18, 2019 07:48 AM UTC

Python Program To Print Numbers From 1 to 10 Using For Loop

Problem Definition

Create a Python program to print numbers from 1 to 10 using a for loop.

Solution

In programming, Loops are used to repeat a block of code until a specific condition is met. A for loop is a repetition control structure that allows you to efficiently write a loop that needs to execute a specific number of times.

Also, we are going to use one of Python’s built-in function range(). This function is extensively used in loops to control the number of types loop have to run. In simple words range is used to generate a sequence between the given values.

For a better understanding of these Python, concepts it is recommended to read the following articles.

Program

for i in range(1, 11):
    print(i)

Output

1
2
3
4
5
6
7
8
9
10

Explanation

The for loop prints the number from 1 to 10 using the range() function here i is a temporary variable which is iterating over numbers from 1 to 10.

It’s worth mentioning that similar to list indexing in range starts from 0 which means range( j )will print sequence till ( j-1) hence the output doesn’t include 6.

The post Python Program To Print Numbers From 1 to 10 Using For Loop appeared first on Django Central.

May 18, 2019 07:28 AM UTC


Programming Ideas With Jake

Better Unbound Python Descriptors

I've been making unbound attributes be way more complicated that need be. Let's look at how to simplify it.

May 18, 2019 05:00 AM UTC