
Planet Python

Last update: June 24, 2018 07:47 PM

June 24, 2018


Stefan Behnel

Cython for web frameworks

I'm excited to see the Python web community pick up Cython more and more to speed up their web frameworks.

uvloop, a fast drop-in replacement for asyncio, has been around for a while now; it's mostly written in Cython as a wrapper around libuv. The Falcon web framework optionally compiles itself with Cython, while keeping support for PyPy as a plain Python package. New projects like Vibora show that it pays off to design a framework for both (Flask-like) simplicity and (native) speed from the ground up, leveraging Cython for the critical parts. Quote of the day:

"300.000 req/sec is a number comparable to Go's built-in web server (I'm saying this based on a rough test I made some years ago). Given that Go is designed to do exactly that, this is really impressive. My kudos to your choice to use Cython." – Reddit user 'beertown'.

Alex Orlov gave a talk at PyCon US 2017 about using Cython for more efficient code, in which he mentioned the possibility of speeding up the Django URL dispatcher by 3x, simply by compiling the module as-is.

Especially in async frameworks, minimising the time spent in processing (i.e. outside of the I/O-Loop) is critical for the overall responsiveness and performance. Anton Caceres and I presented fast async code with Cython at EuroPython 2016, showing how to speed up async coroutines by compiling and optimising them.

In order to minimise the processing time on the server, many template engines use native accelerators in one way or another, and writing those in Cython (instead of C/C++) is a huge boost in terms of maintenance (and probably also speed). But several engines also generate Python code from a templating language, and those templates tend to be more static than not (they are rarely generated at runtime themselves). Therefore, compiling the generated template code, or even better, directly targeting Cython with the code generation instead of just plain Python, has the potential to speed up template processing a lot. For example, Cython has very fast support for PEP-498 f-strings and even transforms some '%'-formatting patterns into them to speed them up (also in code that requires backwards compatibility with older Python versions). That alone can easily make a difference, as can the faster function and method calls and looping code that Cython generates.
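As a rough illustration (plain Python shown here, not Cython; the names are mine), the two formatting styles below produce identical output, and Cython can rewrite some '%' patterns into the generally faster f-string form automatically:

```python
# A sketch in plain Python: '%'-formatting and the equivalent
# PEP 498 f-string produce identical results. Cython applies this
# kind of rewrite to some '%' patterns behind the scenes.
name, count = "world", 3
old_style = "Hello %s, you have %d messages" % (name, count)
new_style = f"Hello {name}, you have {count} messages"
assert old_style == new_style  # same output, different mechanism
```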

I'm sure there's way more to come and I'm happily looking forward to all those cool developments in the web area that we are only just starting to see appear.

June 24, 2018 10:04 AM


Weekly Python StackOverflow Report

(cxxxi) stackoverflow python report

These are the ten highest-rated questions on Stack Overflow last week.
Between brackets: [question score / answers count]
Build date: 2018-06-24 06:26:59 GMT


  1. Why does `None is None is None` return True? - [36/3]
  2. Is there a better way to format these 3 integers (hours, mins, secs) to `00:00:00`? - [16/0]
  3. Why does numpy.median scale so well? - [11/1]
  4. Replace patterns of a list of any type of object similar to .replace for strings - [9/4]
  5. Expand a dict containing list items into a list of dict pairs - [8/7]
  6. Pandas dataframe to dict of dict - [8/2]
  7. Sorting a dictionary with multiple sized values - [7/2]
  8. what is the difference between 'import a.b as b' and 'from a import b' in python - [6/3]
  9. mypy, type hint: Union[float, int] -> is there a Number type? - [6/1]
  10. Best way to add pandas DataFrame column to row - [6/1]

June 24, 2018 06:27 AM

June 22, 2018


Python Bytes

#83 from __future__ import braces

June 22, 2018 08:00 AM


qutebrowser development blog

qutebrowser v1.3.3 released (security update!)

I've just released qutebrowser v1.3.3, which fixes an XSS vulnerability on the qute://history page (:history).

qutebrowser is a keyboard-driven browser with a vim-like, minimalistic interface. It's written using PyQt and is cross-platform.

The vulnerability allowed websites to inject HTML into the page via a crafted title tag …

June 22, 2018 12:04 AM

June 21, 2018


Trey Hunner

How to make an iterator in Python

I wrote an article some time ago on the iterator protocol that powers Python’s for loops. One thing I left out of that article was how to make your own iterators.

In this article I’m going to discuss why you’d want to make your own iterators and then show you how to do so.

What is an iterator?

First let’s quickly address what an iterator is. For a much more detailed explanation, consider watching my Loop Better talk or reading the article based on the talk.

An iterable is anything you’re able to loop over.

An iterator is the object that does the actual iterating.

You can get an iterator from any iterable by calling the built-in iter function on the iterable.

>>> favorite_numbers = [6, 57, 4, 7, 68, 95]
>>> iter(favorite_numbers)
<list_iterator object at 0x7fe8e5623160>

You can use the built-in next function on an iterator to get the next item from it (you’ll get a StopIteration exception if there are no more items).

>>> favorite_numbers = [6, 57, 4, 7, 68, 95]
>>> my_iterator = iter(favorite_numbers)
>>> next(my_iterator)
6
>>> next(my_iterator)
57
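To see the StopIteration behaviour mentioned above, we can exhaust a short iterator (a quick sketch, not from the original article):

```python
# Once an iterator runs out of items, next() raises StopIteration.
my_iterator = iter([6, 57])
assert next(my_iterator) == 6
assert next(my_iterator) == 57
try:
    next(my_iterator)       # nothing left to give back
    exhausted = False
except StopIteration:
    exhausted = True        # the iterator signalled it is done
```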

There’s one more rule about iterators that makes everything interesting: iterators are also iterables and their iterator is themselves. I explain the consequences of that more fully in that Loop Better talk I mentioned above.
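We can verify that rule directly (a small sketch): calling iter on an iterator hands back the very same object, unlike calling it on a list.

```python
# An iterator is its own iterator: iter() returns the same object.
favorite_numbers = [6, 57, 4]
my_iterator = iter(favorite_numbers)
assert iter(my_iterator) is my_iterator            # iterator -> itself
assert iter(favorite_numbers) is not favorite_numbers  # list -> new iterator
```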

Why make an iterator?

Iterators allow you to make an iterable that computes its items as it goes, which means you can make iterables that are lazy: they don’t determine what their next item is until you ask them for it.

Using an iterator instead of a list, set, or another iterable data structure can sometimes allow us to save memory. For example, we can use itertools.repeat to create an iterable that provides 100 million 4’s to us:

>>> from itertools import repeat
>>> lots_of_fours = repeat(4, times=100_000_000)

This iterator takes up 56 bytes of memory on my machine:

>>> import sys
>>> sys.getsizeof(lots_of_fours)
56

An equivalent list of 100 million 4’s takes up many megabytes of memory:

>>> lots_of_fours = [4] * 100_000_000
>>> import sys
>>> sys.getsizeof(lots_of_fours)
800000064

While iterators can save memory, they can also save time. For example, if you wanted to print out just the first line of a 10 gigabyte log file, you could do this:

>>> print(next(open('giant_log_file.txt')))
This is the first line in a giant file

File objects in Python are implemented as iterators. As you loop over a file, data is read into memory one line at a time. If we instead used the readlines method to store all lines in memory, we might run out of system memory.

So iterators can save us memory, and sometimes they can save us time as well.

Additionally, iterators have abilities that other iterables don’t. For example, their laziness can be used to make iterables that have an unknown length. In fact, you can even make infinitely long iterators.

For example, the itertools.count utility will give us an iterator that will provide every number from 0 upward as we loop over it:

>>> from itertools import count
>>> for n in count():
...     print(n)
...
0
1
2
(this goes on forever)

That itertools.count object is essentially an infinitely long iterable. And it’s implemented as an iterator.

Making an iterator: the object-oriented way

So we’ve seen that iterators can save us memory, save us CPU time, and unlock new abilities to us.

Let’s make our own iterators. We’ll start by re-inventing the itertools.count iterator object.

Here’s an iterator implemented using a class:

class Count:

    """Iterator that counts upward forever."""

    def __init__(self, start=0):
        self.num = start

    def __iter__(self):
        return self

    def __next__(self):
        num = self.num
        self.num += 1
        return num

This class has an initializer that initializes our current number to 0 (or whatever is passed in as the start). The things that make this class usable as an iterator are the __iter__ and __next__ methods.

When an object is passed to the str built-in function, its __str__ method is called. When an object is passed to the len built-in function, its __len__ method is called.

>>> numbers = [1, 2, 3]
>>> str(numbers), numbers.__str__()
('[1, 2, 3]', '[1, 2, 3]')
>>> len(numbers), numbers.__len__()
(3, 3)

Calling the built-in iter function on an object will attempt to call its __iter__ method. Calling the built-in next function on an object will attempt to call its __next__ method.

The iter function is supposed to return an iterator, so our __iter__ method must return an iterator. But our object is an iterator itself, so it should return itself. That’s why our Count object returns self from its __iter__ method: it is its own iterator.

The next function is supposed to return the next item in our iterator or raise a StopIteration exception when there are no more items. We’re returning the current number and incrementing the number so it’ll be larger during the next __next__ call.

We can manually loop over our Count iterator class like this:

>>> c = Count()
>>> next(c)
0
>>> next(c)
1

We could also loop over our Count object using a for loop, as with any other iterable:

>>> for n in Count():
...     print(n)
...
0
1
2
(this goes on forever)

This object-oriented approach to making an iterator is cool, but it’s not the usual way that Python programmers make iterators. Usually when we want an iterator, we make a generator.

Generators: the easy way to make an iterator

The easiest way to make your own iterator in Python is to create a generator.

There are two ways to make generators in Python.

Given this list of numbers:

>>> favorite_numbers = [6, 57, 4, 7, 68, 95]

We can make a generator that will lazily provide us with all the squares of these numbers like this:

>>> def square_all(numbers):
...     for n in numbers:
...         yield n**2
...
>>> squares = square_all(favorite_numbers)

Or we can make the same generator like this:

>>> squares = (n**2 for n in favorite_numbers)

The first one is called a generator function and the second one is called a generator expression.

Both of these generator objects work the same way. They both have a type of generator and they’re both iterators that provide squares of the numbers in our numbers list.

>>> type(squares)
<class 'generator'>
>>> next(squares)
36
>>> next(squares)
3249

We’re going to talk about both of these approaches to making a generator, but first let’s talk about terminology.

The word “generator” is used in quite a few ways in Python:

  1. a generator function is a function that has one or more yield statements in it
  2. a generator object (often just called a generator) is the iterator you get back when you call a generator function
  3. a generator expression is a comprehension-like syntax that also evaluates to a generator object

With that terminology out of the way, let’s take a look at each one of these things individually. We’ll look at generator functions first.

Generator functions

Generator functions are distinguished from plain old functions by the fact that they have one or more yield statements.

Normally when you call a function, its code is executed:

>>> def gimme4_please():
...     print("Let me go get that number for you.")
...     return 4
...
>>> num = gimme4_please()
Let me go get that number for you.
>>> num
4

But if the function has a yield statement in it, it isn’t a typical function anymore. It’s now a generator function, meaning it will return a generator object when called. That generator object can be looped over to execute it until a yield statement is hit:

>>> def gimme4_later_please():
...     print("Let me go get that number for you.")
...     yield 4
...
>>> get4 = gimme4_later_please()
>>> get4
<generator object gimme4_later_please at 0x7f78b2e7e2b0>
>>> num = next(get4)
Let me go get that number for you.
>>> num
4

The mere presence of a yield statement turns a function into a generator function. If you see a function and there’s a yield, you’re working with a different animal. It’s a bit odd, but that’s the way generator functions work.

Okay let’s look at a real example of a generator function. We’ll make a generator function that does the same thing as our Count iterator class we made earlier.

def count(start=0):
    num = start
    while True:
        yield num
        num += 1

Just like with our Count iterator class, we can manually loop over the generator we get back from calling count:

>>> c = count()
>>> next(c)
0
>>> next(c)
1

And we can loop over this generator object using a for loop, just like before:

>>> for n in count():
...     print(n)
...
0
1
2
(this goes on forever)

But this function is considerably shorter than our Count class we created before.

Generator expressions

Generator expressions are a list comprehension-like syntax that allows us to make a generator object.

Let’s say we have a list comprehension that filters empty lines from a file and strips newlines from the end:

lines = [
    line.rstrip('\n')
    for line in poem_file
    if line != '\n'
]

We could create a generator instead of a list by turning the square brackets of that comprehension into parentheses:

lines = (
    line.rstrip('\n')
    for line in poem_file
    if line != '\n'
)

Just as our list comprehension gave us a list back, our generator expression gives us a generator object back:

>>> type(lines)
<class 'generator'>
>>> next(lines)
' This little bag I hope will prove'
>>> next(lines)
'To be not vainly made--'

Generator expressions use a shorter inline syntax compared to generator functions. They’re not as powerful though.

If you can write your generator function in this form:

def get_a_generator(some_iterable):
    for item in some_iterable:
        if some_condition(item):
            yield item

Then you can replace it with a generator expression:

def get_a_generator(some_iterable):
    return (
        item
        for item in some_iterable
        if some_condition(item)
    )

If you can’t write your generator function in that form, then you can’t create a generator expression to replace it.
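For example (a sketch of mine, not from the article), a generator function that carries state between yields, like a running total, doesn’t fit that map/filter shape, so it has no generator expression equivalent:

```python
def running_totals(numbers):
    """Yield the running total of the numbers seen so far."""
    total = 0  # state carried between yields: a genexp can't do this
    for n in numbers:
        total += n
        yield total

totals = list(running_totals([1, 2, 3, 4]))
# totals is now [1, 3, 6, 10]
```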

Generator expressions vs generator functions

You can think of generator expressions as the list comprehensions of the generator world.

If you’re not familiar with list comprehensions, I recommend reading my article on list comprehensions in Python. I note in that article that you can copy-paste your way from a for loop to a list comprehension.

You can also copy-paste your way from a generator function to a function that returns a generator expression.

Generator expressions are to generator functions as list comprehensions are to a simple for loop with an append and a condition.

Generator expressions are so similar to comprehensions, that you might even be tempted to say generator comprehension instead of generator expression. That’s not technically the correct name, but if you say it everyone will know what you’re talking about. Ned Batchelder actually proposed that we should all start calling generator expressions generator comprehensions and I tend to agree that this would be a clearer name.

So what’s the best way to make an iterator?

To make an iterator you could create an iterator class, a generator function, or a generator expression. Which way is the best way though?

Generator expressions are very succinct, but they’re not nearly as flexible as generator functions. Generator functions are flexible, but if you need to attach extra methods or attributes to your iterator object, you’ll probably need to switch to using an iterator class.

I’d recommend reaching for generator expressions the same way you reach for list comprehensions. If you’re doing a simple mapping or filtering operation, a generator expression is a great solution. If you’re doing something a bit more sophisticated, you’ll likely need a generator function.

I’d recommend using generator functions the same way you’d use for loops that append to a list. Everywhere you’d see an append method, you’d often see a yield statement instead.

And I’d say that you should almost never create an iterator class. If you find you need an iterator class, try to write a generator function that does what you need and see how it compares to your iterator class.

Generators can help when making iterables too

You’ll see iterator classes in the wild, but there’s rarely a good opportunity to write your own.

While it’s rare to create your own iterator class, it’s not unusual to make your own iterable class. And iterable classes require an __iter__ method that returns an iterator. Since generators are the easy way to make an iterator, we can use a generator function or a generator expression to create our __iter__ methods.

For example here’s an iterable that provides x-y coordinates:

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __iter__(self):
        yield self.x
        yield self.y

Note that our Point class here creates an iterable when called (not an iterator). That means our __iter__ method must return an iterator. The easiest way to create an iterator is by making a generator function, so that’s just what we did.

We stuck yield in our __iter__ to make it into a generator function, and now our Point class can be looped over, just like any other iterable.

>>> p = Point(1, 2)
>>> x, y = p
>>> print(x, y)
1 2
>>> list(p)
[1, 2]

Generator functions are a natural fit for creating __iter__ methods on your iterable classes.
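The article shows the generator function form; as a sketch (the class name and data here are mine, not from the article), __iter__ can equally return a generator expression, since a generator expression is itself an iterator:

```python
class EvenNumbers:
    """Iterable over the even numbers in a collection (hypothetical example)."""
    def __init__(self, numbers):
        self.numbers = list(numbers)
    def __iter__(self):
        # A generator expression is an iterator, so it's a valid
        # return value for __iter__.
        return (n for n in self.numbers if n % 2 == 0)

evens = list(EvenNumbers([1, 2, 3, 4, 5, 6]))
# evens is now [2, 4, 6]
```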

Generators are the way to make iterators

Dictionaries are the typical way to make a mapping in Python. Functions are the typical way to make a callable object in Python. Likewise, generators are the typical way to make an iterator in Python.

So when you’re thinking “it sure would be nice to implement an iterable that lazily computes things as it’s looped over,” think of iterators.

And when you’re considering how to create your own iterator, think of generator functions and generator expressions.

June 21, 2018 11:00 PM


Wallaroo Labs

Implementing Time Windowing in an Evented Streaming System

Hi there! Welcome to the second and final installment of my trending twitter hashtags example series. In part 1, we covered the basic dataflow and logic of the application. In part 2, we are going to take a look at how windowing for the “trending” aspect of our application is implemented. When implementing any sort of “trending” application, what we are really doing is implementing some kind of windowing. That is, for some duration of time, we want to know what was popular, what was “trending” during that period of time.

June 21, 2018 04:49 PM


Andre Roberge

Javascript tools for Python hobbyists

I am just a hobbyist, a Python enthusiast who has, over the course of many years, written what is now a relatively big Javascript program (close to 20,000 lines of code so far). If you like Python and the Pythonic way of programming, and find yourself writing more Javascript code than you'd like for a fun side-project meant as a hobby, you may find some merit in the approach I use. I wish I could have read something like this blog post when I started my project, or even just a few years ago, when I did a major rewrite and started using some of the tools described in this post.

If you are a professional programmer, you can just stop reading as you know much more than I do, and you surely have a better, more efficient and cutting edge way of doing things right now - and you will likely use yet a different way next year, if not next month. So, you would likely find my advice to look the same as my site: dated and not using the latest and coolest techniques - in short, for you, not worth looking at. ;-)

Summary:

Warning

This blog post is long. I've attempted to provide enough details for you to determine in each case if my use-case corresponds to yours and thus if and when my recommendation might make sense for you and your project.

The context

I started working on Reeborg's World many years ago. The first version was created as a desktop program (rur-ple) in 2004. My first primitive attempt at a web version was done around 2007. Over the years I have worked on it, tools and libraries have come, evolved, and gone, replaced by better ones. As programming is only a hobby that I work on when I have some free time, I cannot afford to change the set of tools I use every year to follow the latest trend.

I've started working on the current version when using color gradients for buttons and menu bars was the latest and coolest thing - well before the current flat UI became the norm.


Admittedly, my site looks dated - but since I do not have enough time to add all the new ideas for functional improvements I want to make, investing time to modernize the look is not a priority.

The Javascript code I wrote is split over many files and has become a tangled mess - in spite of some occasional attempts at reorganizing the code, including a near-complete rewrite a few years ago.


Some of the complexity is required, as I want to make it easier for would-be collaborators to add new programming languages or paradigms for learners [1] or additional human language support [2]. However, it is likely that some of this tangled mess could be simplified with a significant effort.

In addition, there is more to Reeborg's World than a single site; there is also a basic programming tutorial available in three languages [3] with additional languages in the works, a Teacher's Guide [4], API documentation for advanced features [5], and more [6]. Each of these acts almost like an independent project pulling me in different directions.

In order to preserve my sanity, as my project slowly evolves I need some constancy and simplicity in the tools I use.

Using a well-supported library

Unlike the situation with Python, which comes "batteries included", there is no standard library for Javascript. Using a library means choosing between various alternatives, and communities.

When I started this project, the main problem facing people writing Javascript code was browser incompatibilities.  There was one obvious solution: use jQuery.  Nowadays, it is most likely no longer needed for that purpose, but that was not the case back then.

I also knew that I wanted the ability to have floating windows for additional menus and dialogs. After examining a few choices, I settled on jQuery UI, since there was good documentation for it and an active community ... and I was already using jQuery which meant a smaller footprint than some other alternatives.

Libraries like jQuery and jQuery UI can be included with a link to a CDN (Content Delivery Network), which can reduce the load on the server where my project lives. I can also link to a specific version of these libraries, which means that I do not have to update code that depends on them (except if security issues are discovered).

10 years later, both libraries are still alive and well and I haven't needed to make any significant changes to any code that uses them.

Use npm for installing Javascript tools

npm is described as both a package manager for Javascript and the world's largest software repository. I use it to install the various Javascript tools I use (like tape, browserify, jsdoc, etc., which I describe below).

I do not use it to install Javascript libraries (big or small) called by my own code. From what I can tell, the "best/most common" practice in the Javascript world is to make use of tons of modules found in the npm repository, some of which are simply one line of code. Requiring a single module can mean in reality that the project depends on dozens of other modules, none of them vetted - unlike the Python standard library. An upgrade to a single module can result in a bug affecting hundreds of other modules ... For example, one developer broke Node, Babel and thousands of projects with 11 lines of JavaScript.

When I resume working on my project after months of inactivity, I never have to worry about how any change to any such third party module could require updating my code.  (Yes, there are most likely ways to mitigate such problems, but I prefer to avoid them in the first place.)

There are alternatives to npm (such as yarn, and others), but, from what I can tell, they do not offer any advantages when it comes to installing Javascript tools - a task that is performed very rarely for a given project.

Use npm to manage your workflow

When reading about Javascript, I most often saw either gulp or grunt mentioned as tools to automate tasks. From what I read, it seemed they were essential for doing any serious Javascript development. Each of them had its own way of doing things ... and it was not easy for me to see which would be the best fit. In the various posts I read about gulp vs grunt, npm was never mentioned as an alternative.

However, as I learned more about npm, I found that, together with a very simple batch file, it could do all the automation that I needed in a very, very simple way, by defining "scripts" in a file named package.json. Chaining tasks with npm scripts is a simple matter of "piping" them (with the | character). Since I had already installed npm, it became an easy choice.

Use browserify to concatenate all your Javascript files

Once my Javascript code became much too long to fit into a single file, I broke it up into various files. With Python, I would have used an import statement in individual files to take care of dependencies. With Javascript, the only method that I knew of at the beginning of my project (10 years ago) was to add individual links in my html file. As the number of Javascript files increased, it became difficult to ensure that files were inserted in the proper order so that dependencies were taken care of ... In fact, it soon became almost impossible.

This required a major rewrite. Fortunately, by the time I had to do this, some standardized ways of handling dependencies had emerged. The simplest was to use something like

 require("module_a.js");

at the top of, say module_b.js, and use some tools to concatenate the javascript files, ensuring that proper dependencies were taken care of.  The simplest tool I found for this purpose is browserify, originally created, as far as I can tell, by James Halliday.

browserify can be installed using npm.

Use tape for unit testing

Sigh ... I find testing boring ... But, as my project grew larger, it became necessary to write some tests.

When I did a search on testing tools/frameworks for Javascript, I most often saw mentions of Chai, Jasmine, Mocha, QUnit and Sinon. A recent search yields a few more potential candidates like Cucumber, Karma, etc.

The Javascript world seems to really, really like so-called Behaviour Driven Development, where writing tests can mean writing code like:

tea.should.have.property('flavors').with.lengthOf(3);

If.I.wanted.to.write.code.that.read.like.English.I.would.likely.use.Cobol.

It is only by accident that I came across tape as a testing framework that felt "right" to me. I like my tests to look like my code. With Python, I would use assert statements to ensure that a function produces the correct result. My favourite unit testing framework for Python is, not surprisingly, pytest.

From what I have seen, tape is the closest Javascript testing framework to pytest.  Here's an actual example where I test some code which is expected to raise/throw an exception/error:

test('add_wall: invalid orientation', function (assert) {
    assert.plan(2);
    try {
        RUR.add_wall("n", 1, 2, true);
    } catch (e) {
        assert.ok(e.reeborg_shouts, "reeborg_shouts");
        assert.equal(e.name, "ReeborgError", "error name ok");
    }
    assert.end();
});

I make use of "assert.plan()" to ensure that the number of assertions tested matches my expectations.

It was only after I had used tape for a while that I found out that it was also written by James Halliday.

tape can be installed using npm.

Use faucet for formatting of unit test results

Tape's output is in the TAP format (Test Anything Protocol) which, by default, is extremely verbose. Most often, it is recommended to pipe the results into formatters which produce more readable results. 

Depending on what I am doing, I use different formatters, some more verbose than others. After trying out about a dozen formatters, I now use faucet by default.  faucet can be installed using npm and has been written by, ... you guessed it, James Halliday.

Use QUnit for integration testing

Unit tests are fine, but they miss problems arising from putting all the code together. I used different strategies to do integration testing, all of which seemed to create almost as many problems as they solved, until I stumbled upon a very easy way that just works for me. Using a Python script, I take the single html file for my site, put all the code inside an html div with display set to none, insert some qunit code and my own tests, and let everything run.

Optional: use Madge for identifying circular dependencies

To help identify potential problems with circular dependencies, I use madge, which can be installed with npm.

There is one remaining circular dependency in my code, which I silence by not inserting a require() call in one of my modules: when the site is initialized, I want to draw a default version of the world, which I do by calling functions in the drawing module when loading some images. Later, when calling the drawing module, I do need the definitions found in the module where I load the images. I could get rid of the dependency at the cost of duplicating some code ... but since the initializing of the site and the execution of user-entered code are done in separate phases, the circular dependency does not cause any problems.

Optional: use Dependo to identify any overlooked module


The image of the tangled mess of modules shown above was created using dependo. As I was refactoring code and adding various require() statements, dependo was helpful in identifying any module not included, either because it had been accidentally forgotten or because it had become irrelevant. dependo can also be installed using npm.

Optional: use JSDoc for creating an API

While I do not particularly like it, as I cannot figure out how to extend it to address my particular needs, I found jsdoc useful for producing an API reference for people wanting to use advanced features in creating unusual programming tasks (aka "worlds"). When I started using it, there did not seem to be any easy way to use Sphinx to create such an API reference. I gather that this might no longer be the case ... but it would likely require too much effort to make the change at this point.

jsdoc can also be installed using npm.

Use jshint instead of jslint

A linter can often be useful in identifying potential or real problems with some code. When I started working on this project, the only linter I knew was jslint. jshint is friendlier to use and more configurable, and is my preferred choice. And, you guessed it, jshint can be installed using npm.

Last thoughts

There might very well be other tools that would be better for your own projects but, if you love Python and find yourself not overly enthusiastic at the thought of adopting the Javascript way when working on a project that requires Javascript, you might find that the tools I use match more closely the way you do things with Python.  Or not.



[1] Currently, programs can be written in Python, Javascript, using blockly, or in Python using a REPL.

[2] Language support can mean one of two things: either the programming library for users (like using "avance()" in French as the equivalent of "move()" in English), or the UI, or both. Currently, French and English are implemented for both, while Korean and Polish are only available for the UI. Work is underway to provide Chinese support for both.

[3] The tutorial can be found here; you can change the default language using the side-bar on the right. The repository is at https://github.com/aroberge/reeborg-docs. The tutorial is currently available in French, English and Korean, with additional languages in the works.

[4] https://github.com/aroberge/reeborg-howto is a site aimed at creators of advanced tasks for Reeborg's World. It has very little content currently, but more will be migrated from https://github.com/aroberge/reeborg-world-creation, which was written as an online book (a format I found to be unsatisfactory).

[5] https://github.com/aroberge/reeborg-api is a documentation site for the API that creators of advanced tasks can use. 






June 21, 2018 01:19 PM


Dan Crosta

Flask-PyMongo, Back from the Dead

Sprouting seeds homesteading.com

Long ago, when I worked at MongoDB I created Flask-PyMongo to make it easy for programmers using Flask to use the database. Fast forward almost 8 years, during which time I wasn't a consistent user of either Flask or MongoDB, and Flask-PyMongo has fallen into disrepair.

MongoDB, PyMongo, and Flask have moved on, and Flask-PyMongo hasn't been kept up to date. There are more than twice as many forks as pull requests, a GitHub ratio I'm not proud of. Fortunately, the future for Flask-PyMongo is bright.

False Starts and New Beginnings

At PyCon US 2017, I first had the idea to restore Flask-PyMongo. PyCon always has this effect on me, but sadly the effect is often short lived. In 2017, I got as far as mentioning plans for a 2.0 release, but did not go into any detail, nor begin to make any progress toward that goal.

This year at PyCon 2018, I once again had the urge to work on Flask-PyMongo. I had the same conversation with Jesse, decided, again, to jump from 0.x to 2.0, and even came up with the same technical plan as the previous year (about which more below). All without realizing that I had been down this road once before.

I can now confidently say that Flask-PyMongo 2.0 is (soon to be) a real thing, and it will set the stage for easier maintainability into the future and a better experience for users and contributors. Flask-PyMongo 2.0 will be released in early July, and pre-release versions are available today.

What's Changing

Flask-PyMongo 2.0 is not backwards compatible!

A lot of the historical problems with Flask-PyMongo have centered on the confusing and difficult configuration system. Originally, I envisioned that users would want configuration abstracted from PyMongo itself, and created a system where you could set Flask configurations for MONGO_HOST, MONGO_PORT, and MONGO_DBNAME, and be off to the races. For a while this worked, and many users seemed to like it. Unfortunately, there are quite a lot of configuration options for PyMongo, so the list of configurations grew. Worse, PyMongo and MongoDB are under active development, and gain and lose features over time. Attempts to make Flask-PyMongo version-agnostic added tremendous complexity to the configuration system, and evidently frustrated many users over the years.

In any event, it turns out that there's a better way to configure PyMongo -- with MongoDB URIs. Most hosted PyMongo services already provide configuration information in exactly this format. Going forward in 2.0, MongoDB URIs are the preferred configuration method for Flask-PyMongo. Flask-PyMongo will only look for or respect a single Flask configuration variable, MONGO_URI.

If you prefer, you may also pass positional and keyword arguments directly to Flask-PyMongo, which will be passed through to the underlying PyMongo MongoClient object.
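A minimal sketch of the two configuration styles described above, assuming Flask-PyMongo 2.0 (the database name and host are placeholders):

```python
from flask import Flask
from flask_pymongo import PyMongo

app = Flask(__name__)

# Preferred in 2.0: a single MongoDB URI in Flask configuration
app.config["MONGO_URI"] = "mongodb://localhost:27017/example_db"
mongo = PyMongo(app)

# Alternatively, pass the URI (and any MongoClient keyword arguments)
# directly; they are forwarded to the underlying MongoClient.
mongo2 = PyMongo(app, uri="mongodb://localhost:27017/example_db", connect=False)
```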

Flask-PyMongo no longer supports configuring multiple instances via Flask configuration. If you wish to use multiple Flask-PyMongo instances, you must configure at least some of them using a URI or direct argument passing.

Flask-PyMongo 2.0 also clarifies which versions of Flask, PyMongo, MongoDB, and Python are supported. For Flask and PyMongo, it supports "recent" versions -- those with releases in the preceding 3 years (give or take). For MongoDB, we follow the MongoDB server support policy, and support versions that are not end-of-life. For Python, we support 2.7 for as long as it is supported by the CPython core maintainers, and the most recent 3 versions of the 3.x series. For an exact list of supported versions and combinations, see the build matrix.

What You Should Do

If you are a Flask-PyMongo user and you are using the 0.x series, you should immediately pin a particular version. Flask-PyMongo 2.0 is not backwards compatible, so you should take steps to ensure that you don't accidentally break your application.

If you are already using a URI for Flask-PyMongo configuration, or if that is an easy change for you, I would appreciate if you could upgrade, test compatibility, and report any issues on GitHub. You can install Flask-PyMongo 2.0 pre-releases with pip install --pre flask-pymongo. You may also want to follow the general discussion and release notices in issue #110.

I also hope for Flask-PyMongo to be a place that supports Flask and MongoDB with more than just connection assistance. Please suggest ideas and propose contributions!

June 21, 2018 01:00 PM


py.CheckIO

Design Patterns. Part 1

design patterns

Well-structured code with a well-thought-out architecture is your goal, but the usual books and articles seem too confusing? Then this article is for you! We’ve used the simplest analogies to describe two classic design patterns: Abstract Factory and Strategy. The article also includes Python code examples of pattern implementations and links to coding challenges, so you can practice right away and understand how the patterns work once and for all.

June 21, 2018 09:43 AM

June 20, 2018


NumFOCUS

Ethical Algorithms — Notes from the DISC Unconference

The post Ethical Algorithms — Notes from the DISC Unconference appeared first on NumFOCUS.

June 20, 2018 08:02 PM


Curtis Miller

Learn Basic Python and scikit-learn Machine Learning Hands-On with My Course: Training Your Systems with Python Statistical Modelling

In this course I cover statistics and machine learning topics. The course assumes little knowledge about what statistics or machine learning involves. I touch lightly on the theory of statistics and machine learning to motivate the tasks performed in the videos.

June 20, 2018 05:10 PM


PyCharm

PyCharm 2018.2 EAP 4

We’re now in our fourth installment of a pretty big 2018.2 Early Access Program cycle. Lots to take a look at by downloading EAP 4 from our website.

New in PyCharm 2018.2 EAP 4

Pipenv support

We know many of you have been waiting for this for a long time, so here you go: Pipenv is supported in PyCharm 2018.2. There is still a lot of work before we finally release stable PyCharm 2018.2 so your input with bug reports or suggestions is very welcome in our issue tracker.

Currently supported Pipenv-related features in PyCharm:

pytest-bdd Support

In this EAP we introduce initial support for pytest-bdd. To enable pytest-bdd support, open the BDD settings dialog (File | Settings/Preferences | Languages & Frameworks | BDD) and select pytest-bdd from the Preferred BDD framework list. We’re continuing to work on pytest-bdd support, so your input is much appreciated.

More details on pytest-bdd support in PyCharm

Type hints validation

Any time you’re applying type hints, PyCharm checks if the type is used correctly. If there is a usage error, the corresponding warning is shown and the recommended action is suggested.

Learn more about type hints validation in PyCharm

New Front-End Development Functionality

As you might already know, PyCharm bundles all features available in WebStorm, a front-end development IDE by JetBrains. PyCharm EAP 4 adds several WebStorm EAP features:

PyCharm 2018.2 EAP 4 Release Notes

Interested?

Download this EAP from our website. Alternatively, you can use the JetBrains Toolbox App to stay up to date throughout the entire EAP.

If you’re on Ubuntu 16.04 or later, you can use snap to get PyCharm EAP, and stay up to date. You can find the installation instructions on our website.

PyCharm 2018.2 is in development during the EAP phase, therefore not all new features are available yet. More features will be added in the coming weeks. As PyCharm 2018.2 is pre-release software, it is not as stable as the release versions. Furthermore, we may decide to change and/or drop certain features as the EAP progresses.

All EAP versions will ship with a built-in EAP license, which means that these versions are free to use for 30 days after the day that they are built. As EAPs are released weekly, you’ll be able to use PyCharm Professional Edition EAP for free for the duration of the EAP program, as long as you upgrade at least once every 30 days.

June 20, 2018 04:12 PM


Real Python

Operators and Expressions in Python

After finishing our previous tutorial on Python variables in this series, you should now have a good grasp of creating and naming Python objects of different types. Let’s do some work with them!

Here’s what you’ll learn in this tutorial: You’ll see how calculations can be performed on objects in Python. By the end of this tutorial, you will be able to create complex expressions by combining objects and operators.

Don't miss the follow-up tutorial: Click here to join the Real Python Newsletter and you'll know when the next installment comes out.

In Python, operators are special symbols that designate that some sort of computation should be performed. The values that an operator acts on are called operands.

Here is an example:

>>> a = 10
>>> b = 20
>>> a + b
30

In this case, the + operator adds the operands a and b together. An operand can be either a literal value or a variable that references an object:

>>> a = 10
>>> b = 20
>>> a + b - 5
25

A sequence of operands and operators, like a + b - 5, is called an expression. Python supports many operators for combining data objects into expressions. These are explored below.

Arithmetic Operators

The following table lists the arithmetic operators supported by Python:

Operator Example Meaning Result
+ (unary) +a Unary Positive a
- (unary) -a Unary Negation Value equal to a but opposite in sign
+ (binary) a + b Addition Sum of a and b
- (binary) a - b Subtraction b subtracted from a
* a * b Multiplication Product of a and b
/ a / b Division Quotient when a is divided by b.
The result always has type float.
% a % b Modulus Remainder when a is divided by b
// a // b Floor Division (also called Integer Division) Quotient when a is divided by b, rounded down to the nearest whole number
** a ** b Exponentiation a raised to the power of b

(Unary positive doesn’t really do anything. It mostly exists for the sake of completeness, to complement unary negation.)

Here are some examples of these operators in use:

>>> a = 4
>>> b = 3
>>> +a
4
>>> -b
-3
>>> a + b
7
>>> a - b
1
>>> a * b
12
>>> a / b
1.3333333333333333
>>> a % b
1
>>> a ** b
64

The result of standard division (/) is always a float, even if the dividend is evenly divisible by the divisor:

>>> 10 / 5
2.0
>>> type(10 / 5)
<class 'float'>

When the result of floor division (//) is positive, it is as though the fractional portion were truncated, leaving only the integer portion. When the result is negative, it is rounded down to the next smallest (more negative) integer:

>>> 10 / 4
2.5
>>> 10 // 4
2
>>> 10 // -4
-3
>>> -10 // 4
-3
>>> -10 // -4
2
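Floor division and the modulo operator are tied together by an identity worth knowing: for any nonzero b, a equals (a // b) * b + (a % b). A quick check, using the built-in divmod(), which returns both results at once:

```python
# The identity a == (a // b) * b + (a % b) holds for negative operands too
for a, b in [(10, 4), (-10, 4), (10, -4), (-10, -4)]:
    assert a == (a // b) * b + a % b

print(divmod(10, 4))   # (2, 2): quotient and remainder in one call
print(divmod(-10, 4))  # (-3, 2): the remainder takes the sign of b
```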

Note, by the way, that in a REPL session, you can display the value of an expression by just typing it in at the >>> prompt without print(), the same as you can with a literal value or variable:

>>> 25
25
>>> x = 4
>>> y = 6
>>> x
4
>>> y
6
>>> x * 25 + y
106

Comparison Operators

Operator Example Meaning Result
== a == b Equal to True if the value of a is equal to the value of b
False otherwise
!= a != b Not equal to True if a is not equal to b
False otherwise
< a < b Less than True if a is less than b
False otherwise
<= a <= b Less than or equal to True if a is less than or equal to b
False otherwise
> a > b Greater than True if a is greater than b
False otherwise
>= a >= b Greater than or equal to True if a is greater than or equal to b
False otherwise

Here are examples of the comparison operators in use:

>>> a = 10
>>> b = 20
>>> a == b
False
>>> a != b
True
>>> a <= b
True
>>> a >= b
False

>>> a = 30
>>> b = 30
>>> a == b
True
>>> a <= b
True
>>> a >= b
True

Comparison operators are typically used in Boolean contexts like conditional and loop statements to direct program flow, as you will see later.

Equality Comparison on Floating-Point Values

Recall from the earlier discussion of floating-point numbers that the value stored internally for a float object may not be precisely what you’d think it would be. For that reason, it is poor practice to compare floating-point values for exact equality. Consider this example:

>>> x = 1.1 + 2.2
>>> x == 3.3
False

Yikes! The internal representations of the addition operands are not exactly equal to 1.1 and 2.2, so you cannot rely on x to compare exactly to 3.3.

The preferred way to determine whether two floating-point values are “equal” is to compute whether they are close to one another, given some tolerance. Take a look at this example:

>>> tolerance = 0.00001
>>> x = 1.1 + 2.2
>>> abs(x - 3.3) < tolerance
True

abs() returns absolute value. If the absolute value of the difference between the two numbers is less than the specified tolerance, they are close enough to one another to be considered equal.
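The standard library provides this comparison ready-made: math.isclose() (available since Python 3.5) uses a relative tolerance by default and also accepts an absolute tolerance via its abs_tol parameter:

```python
import math

x = 1.1 + 2.2

# Default: relative tolerance of 1e-09
print(math.isclose(x, 3.3))                    # True

# Equivalent of the manual check above: an absolute tolerance
print(math.isclose(x, 3.3, abs_tol=0.00001))   # True
```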

Logical Operators

The logical operators not, or, and and modify and join together expressions evaluated in Boolean context to create more complex conditions.

Logical Expressions Involving Boolean Operands

As you have seen, some objects and expressions in Python actually are of Boolean type. That is, they are equal to one of the Python objects True or False. Consider these examples:

>>> x = 5
>>> x < 10
True
>>> type(x < 10)
<class 'bool'>

>>> t = x > 10
>>> t
False
>>> type(t)
<class 'bool'>

>>> callable(x)
False
>>> type(callable(x))
<class 'bool'>

>>> t = callable(len)
>>> t
True
>>> type(t)
<class 'bool'>

In the examples above, x < 10, callable(x), and t are all Boolean objects or expressions.

Interpretation of logical expressions involving not, or, and and is straightforward when the operands are Boolean:

Operator Example Meaning
not not x True if x is False
False if x is True
(Logically reverses the sense of x)
or x or y True if either x or y is True
False otherwise
and x and y True if both x and y are True
False otherwise

Take a look at how they work in practice below.

“not” and Boolean Operands

>>> x = 5
>>> not x < 10
False
>>> not callable(x)
True

Operand Value Logical Expression Value
x < 10 True not x < 10 False
callable(x) False not callable(x) True

“or” and Boolean Operands

>>> x = 5
>>> x < 10 or callable(x)
True
>>> x < 0 or callable(x)
False

Operand Value Operand Value Logical Expression Value
x < 10 True callable(x) False x < 10 or callable(x) True
x < 0 False callable(x) False x < 0 or callable(x) False

“and” and Boolean Operands

>>> x = 5
>>> x < 10 and callable(x)
False
>>> x < 10 and callable(len)
True

Operand Value Operand Value Logical Expression Value
x < 10 True callable(x) False x < 10 and callable(x) False
x < 10 True callable(len) True x < 10 and callable(len) True

Evaluation of Non-Boolean Values in Boolean Context

Many objects and expressions are not equal to True or False. Nonetheless, they may still be evaluated in Boolean context and determined to be “truthy” or “falsy.”

So what is true and what isn’t? As a philosophical question, that is outside the scope of this tutorial!

But in Python, it is well-defined. All the following are considered false when evaluated in Boolean context:

- The Boolean value False itself
- Any value that is numerically zero (0, 0.0, 0.0+0.0j)
- An empty string
- An object of a built-in composite data type which is empty (see below)
- The special value denoted by the Python keyword None

Virtually any other object built into Python is regarded as true.

You can determine the “truthiness” of an object or expression with the built-in bool() function. bool() returns True if its argument is truthy and False if it is falsy.

Numeric Value

A zero value is false.
A non-zero value is true.

>>> print(bool(0), bool(0.0), bool(0.0+0j))
False False False

>>> print(bool(-3), bool(3.14159), bool(1.0+1j))
True True True

String

An empty string is false.
A non-empty string is true.

>>> print(bool(''), bool(""), bool(""""""))
False False False

>>> print(bool('foo'), bool(" "), bool(''' '''))
True True True

Built-In Composite Data Object

Python provides built-in composite data types called list, tuple, dict, and set. These are “container” types that contain other objects. An object of one of these types is considered false if it is empty and true if it is non-empty.

The examples below demonstrate this for the list type. (Lists are defined in Python with square brackets.)

For more information on the list, tuple, dict, and set types, see the upcoming tutorials.

>>> type([])
<class 'list'>
>>> bool([])
False

>>> type([1, 2, 3])
<class 'list'>
>>> bool([1, 2, 3])
True

The “None” Keyword

None is always false:

>>> bool(None)
False

Logical Expressions Involving Non-Boolean Operands

Non-Boolean values can also be modified and joined by not, or, and and. The result depends on the “truthiness” of the operands.

“not” and Non-Boolean Operands

Here is what happens for a non-Boolean value x:

If x is not x is
“truthy” False
“falsy” True

Here are some concrete examples:

>>> x = 3
>>> bool(x)
True
>>> not x
False

>>> x = 0.0
>>> bool(x)
False
>>> not x
True

“or” and Non-Boolean Operands

This is what happens for two non-Boolean values x and y:

If x is x or y is
truthy x
falsy y

Note that in this case, the expression x or y does not evaluate to either True or False, but instead to one of either x or y:

>>> x = 3
>>> y = 4
>>> x or y
3

>>> x = 0.0
>>> y = 4.4
>>> x or y
4.4

Even so, it is still the case that the expression x or y will be truthy if either x or y is truthy, and falsy if both x and y are falsy.

“and” and Non-Boolean Operands

Here’s what you’ll get for two non-Boolean values x and y:

If x is x and y is
“truthy” y
“falsy” x
>>> x = 3
>>> y = 4
>>> x and y
4

>>> x = 0.0
>>> y = 4.4
>>> x and y
0.0

As with or, the expression x and y does not evaluate to either True or False, but instead to one of either x or y. x and y will be truthy if both x and y are truthy, and falsy otherwise.

Compound Logical Expressions and Short-Circuit Evaluation

So far, you have seen expressions with only a single or or and operator and two operands:

x or y
x and y

Multiple logical operators and operands can be strung together to form compound logical expressions.

Compound “or” Expressions

Consider the following expression:

x1 or x2 or x3 or … xn

This expression is true if any of the xi are true.

In an expression like this, Python uses a methodology called short-circuit evaluation, also called McCarthy evaluation in honor of computer scientist John McCarthy. The xi operands are evaluated in order from left to right. As soon as one is found to be true, the entire expression is known to be true. At that point, Python stops and no more terms are evaluated. The value of the entire expression is that of the xi that terminated evaluation.

To help demonstrate short-circuit evaluation, suppose that you have a simple “identity” function f() that behaves as follows:

(You will see how to define such a function in the upcoming tutorial on Functions.)
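The original tutorial supplies its own definition; a sketch that reproduces the behavior seen in the calls below would be:

```python
def f(arg):
    """Hypothetical 'identity' function: report the call, then return the argument."""
    print('-> f({0}) = {0}'.format(arg))
    return arg
```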

Several example calls to f() are shown below:

>>> f(0)
-> f(0) = 0
0

>>> f(False)
-> f(False) = False
False

>>> f(1.5)
-> f(1.5) = 1.5
1.5

Because f() simply returns the argument passed to it, we can make the expression f(arg) be truthy or falsy as needed by specifying a value for arg that is appropriately truthy or falsy. Additionally, f() displays its argument to the console, which visually confirms whether or not it was called.

Now, consider the following compound logical expression:

>>> f(0) or f(False) or f(1) or f(2) or f(3)
-> f(0) = 0
-> f(False) = False
-> f(1) = 1
1

The interpreter first evaluates f(0), which is 0. A numeric value of 0 is false. The expression is not true yet, so evaluation proceeds left to right. The next operand, f(False), returns False. That is also false, so evaluation continues.

Next up is f(1). That evaluates to 1, which is true. At that point, the interpreter stops because it now knows the entire expression to be true. 1 is returned as the value of the expression, and the remaining operands, f(2) and f(3), are never evaluated. You can see from the display that the f(2) and f(3) calls do not occur.

Compound “and” Expressions

A similar situation exists in an expression with multiple and operators:

x1 and x2 and x3 and … xn

This expression is true if all the xi are true.

In this case, short-circuit evaluation dictates that the interpreter stop evaluating as soon as any operand is found to be false, because at that point the entire expression is known to be false. Once that is the case, no more operands are evaluated, and the falsy operand that terminated evaluation is returned as the value of the expression:

>>> f(1) and f(False) and f(2) and f(3)
-> f(1) = 1
-> f(False) = False
False

>>> f(1) and f(0.0) and f(2) and f(3)
-> f(1) = 1
-> f(0.0) = 0.0
0.0

In both examples above, evaluation stops at the first term that is false—f(False) in the first case, f(0.0) in the second case—and neither the f(2) nor f(3) call occurs. False and 0.0, respectively, are returned as the value of the expression.

If all the operands are truthy, they all get evaluated and the last (rightmost) one is returned as the value of the expression:

>>> f(1) and f(2.2) and f('bar')
-> f(1) = 1
-> f(2.2) = 2.2
-> f(bar) = bar
'bar'

Idioms That Exploit Short-Circuit Evaluation

There are some common idiomatic patterns that exploit short-circuit evaluation for conciseness of expression.

Avoiding an Exception

Suppose you have defined two variables a and b, and you want to know whether (b / a) > 0:

>>> a = 3
>>> b = 1
>>> (b / a) > 0
True

But you need to account for the possibility that a might be 0, in which case the interpreter will raise an exception:

>>> a = 0
>>> b = 1
>>> (b / a) > 0
Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    (b / a) > 0
ZeroDivisionError: division by zero

You can avoid an error with an expression like this:

>>> a = 0
>>> b = 1
>>> a != 0 and (b / a) > 0
False

When a is 0, a != 0 is false. Short-circuit evaluation ensures that evaluation stops at that point. (b / a) is not evaluated, and no error is raised.

In fact, you can be even more concise than that. When a is 0, the expression a by itself is falsy. There is no need for the explicit comparison a != 0:

>>> a = 0
>>> b = 1
>>> a and (b / a) > 0
0

Selecting a Default Value

Another idiom involves selecting a default value when a specified value is zero or empty. For example, suppose you want to assign a variable s to the value contained in another variable called string. But if string is empty, you want to supply a default value.

Here is a concise way of expressing this using short-circuit evaluation:

s = string or '<default_value>'

If string is non-empty, it is truthy, and the expression string or '<default_value>' will be true at that point. Evaluation stops, and the value of string is returned and assigned to s:

>>> string = 'foo bar'
>>> s = string or '<default_value>'
>>> s
'foo bar'

On the other hand, if string is an empty string, it is falsy. Evaluation of string or '<default_value>' continues to the next operand, '<default_value>', which is returned and assigned to s:

>>> string = ''
>>> s = string or '<default_value>'
>>> s
'<default_value>'
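One caveat worth noting (an observation, not from the tutorial itself): or falls back on any falsy value, not just empty strings, so a legitimate value like 0 gets replaced too. When 0 or '' are valid values, test against None explicitly:

```python
count = 0                     # 0 is a meaningful value here
n = count or 10
print(n)                      # 10: the falsy-but-valid 0 was replaced

# An explicit test against None preserves falsy-but-valid values
n = count if count is not None else 10
print(n)                      # 0
```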

Chained Comparisons

Comparison operators can be chained together to arbitrary length. For example, the following expressions are nearly equivalent:

x < y <= z
x < y and y <= z

They will both evaluate to the same Boolean value. The subtle difference between the two is that in the chained comparison x < y <= z, y is evaluated only once. The longer expression x < y and y <= z will cause y to be evaluated twice.

Note: In cases where y is a static value, this will not be a significant distinction. But consider these expressions:

x < f() <= z
x < f() and f() <= z

If f() is a function that causes program data to be modified, the difference between its being called once in the first case and twice in the second case may be important.

More generally, if op1, op2, …, opn are comparison operators, then the following have the same Boolean value:

x1 op1 x2 op2 x3 … xn-1 opn xn

x1 op1 x2 and x2 op2 x3 and … xn-1 opn xn

In the former case, each xi is only evaluated once. In the latter case, each will be evaluated twice except the first and last, unless short-circuit evaluation causes premature termination.
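The single-evaluation behavior can be observed directly with a hypothetical function that counts its own calls:

```python
calls = 0

def f():
    """Hypothetical helper: count how many times it is called, then return 5."""
    global calls
    calls += 1
    return 5

x, z = 1, 10

x < f() <= z             # chained comparison: f() runs once
print(calls)             # 1

x < f() and f() <= z     # equivalent 'and' form: f() runs twice more
print(calls)             # 3
```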

Bitwise Operators

Bitwise operators treat operands as sequences of binary digits and operate on them bit by bit. The following operators are supported:

Operator Example Meaning Result
& a & b bitwise AND Each bit position in the result is the logical AND of the bits in the corresponding position of the operands. (1 if both are 1, otherwise 0.)
| a | b bitwise OR Each bit position in the result is the logical OR of the bits in the corresponding position of the operands. (1 if either is 1, otherwise 0.)
~ ~a bitwise negation Each bit position in the result is the logical negation of the bit in the corresponding position of the operand. (1 if 0, 0 if 1.)
^ a ^ b bitwise XOR (exclusive OR) Each bit position in the result is the logical XOR of the bits in the corresponding position of the operands. (1 if the bits in the operands are different, 0 if they are the same.)
>> a >> n Shift right n places Each bit is shifted right n places.
<< a << n Shift left n places Each bit is shifted left n places.

Here are some examples:

>>> '0b{:04b}'.format(0b1100 & 0b1010)
'0b1000'
>>> '0b{:04b}'.format(0b1100 | 0b1010)
'0b1110'
>>> '0b{:04b}'.format(0b1100 ^ 0b1010)
'0b0110'
>>> '0b{:04b}'.format(0b1100 >> 2)
'0b0011'
>>> '0b{:04b}'.format(0b0011 << 2)
'0b1100'

Note: The purpose of the '0b{:04b}'.format() is to format the numeric output of the bitwise operations, to make them easier to read. You will see the format() method in much more detail later. For now, just pay attention to the operands of the bitwise operations, and the results.
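A common practical use of these operators is packing several Boolean flags into a single integer. A sketch (the flag names here are made up for illustration):

```python
READ, WRITE, EXEC = 0b100, 0b010, 0b001   # hypothetical permission flags

perms = READ | WRITE          # set two flags
print(bin(perms))             # 0b110

print(bool(perms & EXEC))     # False: test whether a flag is set

perms &= ~WRITE               # clear a flag in place
print(bin(perms))             # 0b100
```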

Identity Operators

Python provides two operators, is and is not, that determine whether the given operands have the same identity—that is, refer to the same object. This is not the same thing as equality, which means the two operands refer to objects that contain the same data but are not necessarily the same object.

Here is an example of two objects that are equal but not identical:

>>> x = 1001
>>> y = 1000 + 1
>>> print(x, y)
1001 1001

>>> x == y
True
>>> x is y
False

Here, x and y both refer to objects whose value is 1001. They are equal. But they do not reference the same object, as you can verify:

>>> id(x)
60307920
>>> id(y)
60307936

x and y do not have the same identity, and x is y returns False.
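Be careful when experimenting with is on small numbers: as an implementation detail (not a language guarantee), CPython caches integers in roughly the range -5 to 256, so equal small integers are often the very same object. That is why the example above uses 1001:

```python
a = 256
b = int('25' + '6')    # built at run time, so not a shared compile-time constant
print(a is b)          # True in CPython: 256 is in the small-integer cache

c = 1001
d = int('100' + '1')
print(c is d)          # False: 1001 is outside the cache
```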

You saw previously that when you make an assignment like x = y, Python merely creates a second reference to the same object, and that you could confirm that fact with the id() function. You can also confirm it using the is operator:

>>> a = 'I am a string'
>>> b = a
>>> id(a)
55993992
>>> id(b)
55993992

>>> a is b
True
>>> a == b
True

In this case, since a and b reference the same object, it stands to reason that a and b would be equal as well.

Unsurprisingly, the opposite of is is is not:

>>> x = 10
>>> y = 20
>>> x is not y
True

Operator Precedence

Consider this expression:

>>> 20 + 4 * 10
60

There is ambiguity here. Should Python perform the addition 20 + 4 first and then multiply the sum by 10? Or should the multiplication 4 * 10 be performed first, and the addition of 20 second?

Clearly, since the result is 60, Python has chosen the latter; if it had chosen the former, the result would be 240. This is standard algebraic procedure, found in virtually all programming languages.

All operators that the language supports are assigned a precedence. In an expression, all operators of highest precedence are performed first. Once those results are obtained, operators of the next highest precedence are performed. So it continues, until the expression is fully evaluated. Any operators of equal precedence are performed in left-to-right order.

Here is the order of precedence of the Python operators you have seen so far, from lowest to highest:

  Operator Description
lowest precedence or Boolean OR
and Boolean AND
not Boolean NOT
==, !=, <, <=, >, >=, is, is not comparisons, identity
| bitwise OR
^ bitwise XOR
& bitwise AND
<<, >> bit shifts
+, - addition, subtraction
*, /, //, % multiplication, division, floor division, modulo
+x, -x, ~x unary positive, unary negation, bitwise negation
highest precedence ** exponentiation

Operators at the top of the table have the lowest precedence, and those at the bottom of the table have the highest. Any operators in the same row of the table have equal precedence.

It is clear why multiplication is performed first in the example above: multiplication has a higher precedence than addition.

Similarly, in the example below, 3 is raised to the power of 4 first, which equals 81, and then the multiplications are carried out in order from left to right (2 * 81 * 5 = 810):

>>> 2 * 3 ** 4 * 5
810

Operator precedence can be overridden using parentheses. Expressions in parentheses are always performed first, before expressions that are not parenthesized. Thus, the following happens:

>>> 20 + 4 * 10
60
>>> (20 + 4) * 10
240

>>> 2 * 3 ** 4 * 5
810
>>> 2 * 3 ** (4 * 5)
6973568802

In the first example, 20 + 4 is computed first, then the result is multiplied by 10. In the second example, 4 * 5 is calculated first, then 3 is raised to that power, then the result is multiplied by 2.

There is nothing wrong with making liberal use of parentheses, even when they aren’t necessary to change the order of evaluation. In fact, it is considered good practice, because it can make the code more readable, and it relieves the reader of having to recall operator precedence from memory. Consider the following:

(a < 10) and (b > 30)

Here the parentheses are fully unnecessary, as the comparison operators have higher precedence than and does and would have been performed first anyhow. But some might consider the intent of the parenthesized version more immediately obvious than this version without parentheses:

a < 10 and b > 30

On the other hand, there are probably those who would prefer the latter; it’s a matter of personal preference. The point is, you can always use parentheses if you feel it makes the code more readable, even if they aren’t necessary to change the order of evaluation.

Augmented Assignment Operators

You have seen that a single equal sign (=) is used to assign a value to a variable. It is, of course, perfectly valid for the value to the right of the assignment to be an expression containing other variables:

>>> a = 10
>>> b = 20
>>> c = a * 5 + b
>>> c
70

In fact, the expression to the right of the assignment can include references to the variable that is being assigned to:

>>> a = 10
>>> a = a + 5
>>> a
15

>>> b = 20
>>> b = b * 3
>>> b
60

The first example is interpreted as “a is assigned the current value of a plus 5,” effectively increasing the value of a by 5. The second reads “b is assigned the current value of b times 3,” effectively increasing the value of b threefold.

Of course, this sort of assignment only makes sense if the variable in question has already previously been assigned a value:

>>> z = z / 12
Traceback (most recent call last):
  File "<pyshell#11>", line 1, in <module>
    z = z / 12
NameError: name 'z' is not defined

Python supports a shorthand augmented assignment notation for these arithmetic and bitwise operators:

Arithmetic:  +   -   *   /   %   //   **
Bitwise:     &   |   ^   >>   <<

For these operators, the following are equivalent:

x <op>= y
x = x <op> y

Take a look at these examples of augmented assignments alongside their standard equivalents:

a += 5 is equivalent to a = a + 5
a /= 10 is equivalent to a = a / 10
a ^= b is equivalent to a = a ^ b
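These shorthand forms behave exactly like their expanded counterparts, which is easy to verify with a quick check:

```python
a = 12
a += 5        # a = a + 5
assert a == 17
a //= 3       # a = a // 3  (floor division)
assert a == 5
a **= 2       # a = a ** 2  (exponentiation)
assert a == 25
a ^= 0b1111   # a = a ^ 0b1111  (bitwise XOR)
assert a == 22
print(a)
```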

Conclusion

In this tutorial, you learned about the diverse operators Python supports to combine objects into expressions.

Most of the examples you have seen so far have involved only simple atomic data, but you saw a brief introduction to the string data type. The next tutorial will explore string objects in much more detail.

Don't miss the follow-up tutorial: Click here to join the Real Python Newsletter and you'll know when the next installment comes out.


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

June 20, 2018 02:00 PM


Peter Bengtsson

A good Django view function cache decorator for http.JsonResponse

I use this a lot. It has served me very well. The code:

import hashlib
import functools

import markus  # optional
from django.core.cache import cache
from django import http
from django.utils.encoding import force_bytes, iri_to_uri

metrics = markus.get_metrics(__name__)  # optional


def json_response_cache_page_decorator(seconds):
    """Cache only when there's a healthy http.JsonResponse response."""

    def decorator(func):

        @functools.wraps(func)
        def inner(request, *args, **kwargs):
            cache_key = 'json_response_cache:{}:{}'.format(
                func.__name__,
                hashlib.md5(force_bytes(iri_to_uri(
                    request.build_absolute_uri()
                ))).hexdigest()
            )
            content = cache.get(cache_key)
            if content is not None:

                # metrics is optional
                metrics.incr(
                    'json_response_cache_hit',
                    tags=['view:{}'.format(func.__name__)]
                )

                return http.HttpResponse(
                    content,
                    content_type='application/json'
                )
            response = func(request, *args, **kwargs)
            if (
                isinstance(response, http.JsonResponse) and
                response.status_code in (200, 304)
            ):
                cache.set(cache_key, response.content, seconds)
            return response

        return inner

    return decorator

To use it simply add to Django view functions that might return a http.JsonResponse. For example, something like this:

@json_response_cache_page_decorator(60)
def search(request):
    q = request.GET.get('q')
    if not q:
        return http.HttpResponseBadRequest('no q')
    results = search_database(q)
    return http.JsonResponse({
        'results': results,
    })

I use this instead of django.views.decorators.cache.cache_page() for a couple of reasons.

Disclaimer: This snippet of code comes from a side-project that has a very specific set of requirements. They're rather unique to that project and I have a full picture of the needs. E.g. I know what specific headers matter and don't matter. Your project might be different. For example, perhaps you don't have markus to handle your metrics. Or perhaps you need to re-write the query string for something to normalize the cache key differently. Point being, take the snippet of code as inspiration when you too find that django.views.decorators.cache.cache_page() isn't good enough for your Django view functions.
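As an example of the kind of project-specific tweak mentioned above, here is a hypothetical sketch (not part of the original decorator) of deriving the cache key from a sorted copy of the query parameters, so that two URLs that differ only in parameter order share one cache entry:

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit


def normalized_cache_key(prefix, url):
    """Hypothetical helper: hash the URL with its query parameters
    sorted, so ?a=1&b=2 and ?b=2&a=1 map to the same cache key."""
    parts = urlsplit(url)
    query = urlencode(sorted(parse_qsl(parts.query)))
    normalized = '{}?{}'.format(parts.path, query)
    digest = hashlib.md5(normalized.encode('utf-8')).hexdigest()
    return '{}:{}'.format(prefix, digest)


# Both parameter orderings hash to the same key:
k1 = normalized_cache_key('json_response_cache:search', '/search?q=py&page=2')
k2 = normalized_cache_key('json_response_cache:search', '/search?page=2&q=py')
assert k1 == k2
```

Whether that normalization is safe depends on the view; it is exactly the sort of decision you can only make with the full picture of your own project's requirements.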

June 20, 2018 01:55 PM


PyPy Development

Repeating a Matrix Multiplication Benchmark

I watched the Hennessy & Patterson's Turing award lecture recently:

In it, there's a slide comparing the performance of various matrix multiplication implementations, using Python (presumably CPython) as a baseline and comparing that against various C implementations (I couldn't find the linked paper yet):

I expected the baseline speedup of switching from CPython to C to be higher and I also wanted to know what performance PyPy gets, so I did my own benchmarks. This is a problem that Python is completely unsuited for, so it should give very exaggerated results.

The usual disclaimers apply: All benchmarks are lies, benchmarking of synthetic workloads even more so. My implementation is really naive (though I did optimize it a little bit to help CPython), don't use any of this code for anything real. The benchmarks ran on my rather old Intel i5-3230M laptop under Ubuntu 17.10.
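The post doesn't include its source, but a naive triple-loop multiplication of the kind benchmarked here (my sketch with a small hoisting optimization, not the author's actual code) looks roughly like this:

```python
import random


def matmul(a, b):
    """Naive matrix multiplication over lists of lists."""
    n, m, p = len(a), len(b), len(b[0])
    result = [[0.0] * p for _ in range(n)]
    for i in range(n):
        row = a[i]
        out = result[i]
        for k in range(m):       # iterating k before j walks b row-wise,
            aik = row[k]         # which helps CPython a little
            brow = b[k]
            for j in range(p):
                out[j] += aik * brow[j]
    return result


a = [[random.random() for _ in range(50)] for _ in range(50)]
b = [[random.random() for _ in range(50)] for _ in range(50)]
c = matmul(a, b)
```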

With that said, my results were as follows:

Implementation time speedup over CPython speedup over PyPy
CPython 512.588 ± 2.362 s 1 ×
PyPy 8.167 ± 0.007 s 62.761 ± 0.295 × 1 ×
'naive' C 2.164 ± 0.025 s 236.817 ± 2.918 × 3.773 ± 0.044 ×
NumPy 0.171 ± 0.002 s 2992.286 ± 42.308 × 47.678 ± 0.634 ×

This is running 1500x1500 matrix multiplications with (the same) random matrices. Every implementation is run 50 times in a fresh process. The results are averaged, the errors are bootstrapped 99% confidence intervals.

So indeed the speedup that I got of switching from CPython to C is quite a bit higher than 47x! PyPy is much better than CPython, but of course can't really compete against GCC. And then the real professionals (numpy/OpenBLAS) are in a whole 'nother league. The speedup of the AVX numbers in the slide above is even higher than my NumPy numbers, which I assume is the result of my old CPU with two cores, vs. the 18 core CPU with AVX support. Lesson confirmed: leave matrix multiplication to people who actually know what they are doing.
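For reference, the NumPy variant of the benchmark presumably boils down to a single call that dispatches to whatever BLAS library NumPy was built against (a sketch with a reduced matrix size, not the exact benchmark script):

```python
import numpy as np

n = 200  # the benchmark used 1500x1500; smaller here so it runs in a blink
a = np.random.rand(n, n)
b = np.random.rand(n, n)

c = a @ b  # equivalent to np.dot(a, b); backed by BLAS (e.g. OpenBLAS)
assert c.shape == (n, n)

# Spot-check one entry against an explicit dot product:
assert abs(c[0, 0] - sum(a[0, k] * b[k, 0] for k in range(n))) < 1e-9
```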

June 20, 2018 01:45 PM


Yasoob Khalid

An Intro to Web Scraping With lxml and Python

Hello everyone! I hope you are doing well. In this article, I’ll teach you the basics of web scraping using lxml and Python. I also recorded this tutorial in a screencast so if you prefer to watch me do this step by step in a video please go ahead and watch it below. However, if for some reason you decide that you prefer text, just scroll a bit more and you will find the text of that same screencast.

First of all, why should you even bother learning how to web scrape? If your job doesn’t require you to learn it, then let me give you some motivation. What if you want to create a website which curates the cheapest products from Amazon, Walmart, and a couple of other online stores? A lot of these online stores don’t provide you with an easy way to access their information through an API. In the absence of an API, your only choice is to create a web scraper which can extract information from these websites automatically and provide you with that information in an easy-to-use way.

Here is an example of a typical API response in JSON. This is the response from Reddit:

Typical API Response in JSON

There are a lot of Python libraries out there which can help you with web scraping. There is lxml, BeautifulSoup and a full-fledged framework called Scrapy. Most of the tutorials discuss BeautifulSoup and Scrapy, so I decided to go with lxml in this post. I will teach you the basics of XPaths and how you can use them to extract data from an HTML document. I will take you through a couple of different examples so that you can quickly get up-to-speed with lxml and XPaths.

If you are a gamer, you will already know of (and likely love) this website. We will be trying to extract data from Steam. More specifically, we will be extracting information from the “Popular New Releases” section. I am converting this into a two-part series. In this part, we will be creating a Python script which can extract the names of the games, the prices of the games, the different tags associated with each game, and the target platforms. In the second part, we will turn this script into a Flask based API and then host it on Heroku.

Steam Popular New Releases

Step 1: Exploring Steam

First of all, open up the “popular new releases” page on Steam and scroll down until you see the Popular New Releases tab. At this point, I usually open up Chrome developer tools and see which HTML tags contain the required data. I extensively use the element inspector tool (The button in the top left of the developer tools). It allows you to see the HTML markup behind a specific element on the page with just one click. As a high-level overview, everything on a web page is encapsulated in an HTML tag and tags are usually nested. You need to figure out which tags you need to extract the data from and you are good to go. In our case, if we take a look, we can see that every separate list item is encapsulated in an anchor (a) tag.

The anchor tags themselves are encapsulated in the div with an id of tab_newreleases_content. I am mentioning the id because there are two tabs on this page. The second tab is the standard “New Releases” tab, and we don’t want to extract information from that tab. Hence, we will first extract the “Popular New Releases” tab, and then we will extract the required information from this tag.

Step 2: Start writing a Python script

This is a perfect time to create a new Python file and start writing down our script. I am going to create a scrape.py file. Now let’s go ahead and import the required libraries. The first one is the requests library and the second one is the lxml.html library.

import requests
import lxml.html

If you don’t have requests installed, you can easily install it by running this command in the terminal:

$ pip install requests

The requests library is going to help us open the web page in Python. We could have used lxml to open the HTML page as well, but it doesn’t work well with all web pages, so to be on the safe side I am going to use requests.

Now let’s open up the web page using requests and pass that response to lxml.html.fromstring.

html = requests.get('https://store.steampowered.com/explore/new/')
doc = lxml.html.fromstring(html.content)

This provides us with an object of HtmlElement type. This object has the xpath method which we can use to query the HTML document. This provides us with a structured way to extract information from an HTML document.

Step 3: Fire up the Python Interpreter

Now save this file and open up a terminal. Copy the code from the scrape.py file and paste it in a Python interpreter session.

Python Terminal

We are doing this so that we can quickly test our XPaths without continuously editing, saving and executing our scrape.py file.

Let’s try writing an XPath for extracting the div which contains the ‘Popular New Releases’ tab. I will explain the code as we go along:

new_releases = doc.xpath('//div[@id="tab_newreleases_content"]')[0]

This statement will return a list of all the divs in the HTML page which have an id of tab_newreleases_content. Now because we know that only one div on the page has this id, we can take out the first element from the list ([0]) and that would be our required div. Let’s break down the XPath and try to understand it:

  • // tells the parser to search the whole document, starting from the root
  • div selects only div elements
  • [@id="tab_newreleases_content"] is a predicate that keeps only those divs whose id attribute is tab_newreleases_content

Cool! We have got the required div. Now let’s go back to chrome and check which tag contains the titles of the releases.

Step 4: Extract the titles & prices

Extract title from steam releases

The title is contained in a div with a class of tab_item_name. Now that we have the “Popular New Releases” tab extracted we can run further XPath queries on that tab. Write down the following code in the same Python console which we previously ran our code in:

titles = new_releases.xpath('.//div[@class="tab_item_name"]/text()')

This gives us the titles of all of the games in the “Popular New Releases” tab. Here is the expected output:

title from steam releases in terminal

Let’s break down this XPath a little bit because it is a bit different from the last one:

  • The leading . makes the query relative to new_releases instead of searching the whole document again
  • /text() returns the text inside each matching div rather than the div elements themselves

Now we need to extract the prices for the games. We can easily do that by running the following code:

prices = new_releases.xpath('.//div[@class="discount_final_price"]/text()')

I don’t think I need to explain this code as it is pretty similar to the title extraction code. The only change we made is the change in the class name.

Extracting prices from steam

Step 5: Extracting tags

Now we need to extract the tags associated with the titles. Here is the HTML markup:

HTML markup

Write down the following code in the Python terminal to extract the tags:

tags = new_releases.xpath('.//div[@class="tab_item_top_tags"]')
total_tags = []
for tag in tags:
    total_tags.append(tag.text_content())

So what we are doing here is that we are extracting the divs containing the tags for the games. Then we loop over that list of divs and extract the text from each one using the text_content() method. text_content() returns the text contained within an HTML tag without the HTML markup.

Note: We could have also made use of a list comprehension to make that code shorter. I wrote it down this way so that even those who don’t know about list comprehensions can understand the code. Either way, this is the alternative code:

tags = [tag.text_content() for tag in new_releases.xpath('.//div[@class="tab_item_top_tags"]')]

Let’s also split each tag string into a list so that each tag is a separate element:

tags = [tag.split(', ') for tag in tags]

Step 6: Extracting the platforms

Now the only thing remaining is to extract the platforms associated with each title. Here is the HTML markup:

HTML markup

The major difference here is that the platforms are not contained as texts within a specific tag. They are listed as the class name. Some titles only have one platform associated with them like this:

<span class="platform_img win"></span>

While some titles have 5 platforms associated with them like this:

<span class="platform_img win"></span>
<span class="platform_img mac"></span>
<span class="platform_img linux"></span>
<span class="platform_img hmd_separator"></span>
<span title="HTC Vive" class="platform_img htcvive"></span>
<span title="Oculus Rift" class="platform_img oculusrift"></span>

As we can see these spans contain the platform type as the class name. The only common thing between these spans is that all of them contain the platform_img class. First of all, we will extract the divs with the tab_item_details class, then we will extract the spans containing the platform_img class and finally we will extract the second class name from those spans. Here is the code:

platforms_div = new_releases.xpath('.//div[@class="tab_item_details"]')
total_platforms = []

for game in platforms_div:
    temp = game.xpath('.//span[contains(@class, "platform_img")]')
    platforms = [t.get('class').split(' ')[-1] for t in temp]
    if 'hmd_separator' in platforms:
        platforms.remove('hmd_separator')
    total_platforms.append(platforms)

In line 1 we start with extracting the tab_item_details div. The XPath in line 5 is a bit different. Here we have [contains(@class, "platform_img")] instead of simply having [@class="platform_img"]. The reason is that [@class="platform_img"] returns those spans which only have the platform_img class associated with them. If the spans have an additional class, they won’t be returned. Whereas [contains(@class, "platform_img")] filters all the spans which have the platform_img class. It doesn’t matter whether it is the only class or if there are more classes associated with that tag.

In line 6 we are making use of a list comprehension to reduce the code size. The .get() method allows us to extract an attribute of a tag. Here we are using it to extract the class attribute of a span. We get a string back from the .get() method. In case of the first game, the string being returned is platform_img win, so we split that string on the space character and store the last part (which is the actual platform name) of the split string in the list.

In lines 7-8 we are removing the hmd_separator from the list if it exists. This is because hmd_separator is not a platform. It is just a vertical separator bar used to separate actual platforms from VR/AR hardware.

Step 7: Conclusion

This is the code we have so far:

import requests
import lxml.html

html = requests.get('https://store.steampowered.com/explore/new/')
doc = lxml.html.fromstring(html.content)

new_releases = doc.xpath('//div[@id="tab_newreleases_content"]')[0]

titles = new_releases.xpath('.//div[@class="tab_item_name"]/text()')
prices = new_releases.xpath('.//div[@class="discount_final_price"]/text()')

tags = [tag.text_content() for tag in new_releases.xpath('.//div[@class="tab_item_top_tags"]')]
tags = [tag.split(', ') for tag in tags]

platforms_div = new_releases.xpath('.//div[@class="tab_item_details"]')
total_platforms = []

for game in platforms_div:
    temp = game.xpath('.//span[contains(@class, "platform_img")]')
    platforms = [t.get('class').split(' ')[-1] for t in temp]
    if 'hmd_separator' in platforms:
        platforms.remove('hmd_separator')
    total_platforms.append(platforms)

Now we just need this to return a JSON response so that we can easily turn this into a Flask based API. Here is the code:

output = []
for info in zip(titles,prices, tags, total_platforms):
    resp = {}
    resp['title'] = info[0]
    resp['price'] = info[1]
    resp['tags'] = info[2]
    resp['platforms'] = info[3]
    output.append(resp)

This code is self-explanatory. We are using the zip function to loop over all of those lists in parallel. Then we create a dictionary for each game and assign the title, price, tags, and platforms as a separate key in that dictionary. Lastly, we append that dictionary to the output list.
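To sanity-check what the loop produces, the output list can be serialized straight to JSON with the standard json module. Here is a sketch with sample data standing in for a live scrape (the game shown is illustrative, not taken from the post):

```python
import json

# Sample data standing in for the lists produced by a live scrape
titles = ['Example Game']
prices = ['$19.99']
tags = [['Action', 'Indie']]
total_platforms = [['win', 'mac']]

output = []
for title, price, game_tags, platforms in zip(titles, prices, tags, total_platforms):
    output.append({
        'title': title,
        'price': price,
        'tags': game_tags,
        'platforms': platforms,
    })

print(json.dumps(output, indent=4))
```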

In a future post, we will take a look at how we can convert this into a Flask based API and host it on Heroku.

Have a great day!

Note: This article first appeared on Timber.io

June 20, 2018 07:46 AM

June 19, 2018


Artem Golubin

How many objects does Python allocate during its interpreter lifetime?

It can be very surprising to see how many objects the Python interpreter temporarily allocates while executing simple scripts. In fact, Python provides a way to check it.

To do so, we need to compile a standard CPython interpreter with additional debug flags:

./configure CFLAGS='-DCOUNT_ALLOCS' --with-pydebug 
make -s -j2

Let's open an empty interactive REPL and check allocation statistics:

>>> import sys
>>> sys.getcounts()
[('iterator', 7, 7, 4), ('functools._lru_cache_wrapper', 1, 0, 1), ('re.Match', 2, 2 ...
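Each tuple in that list is of the form (type_name, allocations, frees, max_alive). Since sys.getcounts() only exists in a build compiled with -DCOUNT_ALLOCS, here is a sketch that aggregates a sample modeled on the output above (the final field of the re.Match entry is assumed):

```python
# Sample modeled on sys.getcounts() output; the real call is only
# available in a CPython interpreter built with -DCOUNT_ALLOCS.
counts = [
    ('iterator', 7, 7, 4),
    ('functools._lru_cache_wrapper', 1, 0, 1),
    ('re.Match', 2, 2, 1),
]

# Each entry is (type_name, allocated, freed, max_alive).
total_allocated = sum(allocated for _, allocated, _, _ in counts)
still_alive = sum(allocated - freed for _, allocated, freed, _ in counts)

print(total_allocated, 'objects allocated,', still_alive, 'still alive')
```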

June 19, 2018 04:25 AM


Vladimir Iakolev

Filmstrip from subtitles and stock images

It’s possible to find subtitles for almost every movie or TV series. And there are also stock images of almost anything imaginable. Wouldn’t it be fun to connect these two things and make a sort of filmstrip, with a stock image for every caption from the subtitles?

TLDR: the result is silly:

As the subtitles to play with, I chose the subtitles for Bob’s Burgers – The Deepening. First, we need to parse them with pycaption:

from pycaption.srt import SRTReader

lang = 'en-US'
path = 'burgers.srt'

def read_subtitles(path, lang):
    with open(path) as f:
        data = f.read()
        return SRTReader().read(data, lang=lang)
        
        
subtitles = read_subtitles(path, lang)
captions = subtitles.get_captions(lang)
>>> captions
['00:00:04.745 --> 00:00:06.746\nShh.', '00:00:10.166 --> 00:00:20.484\n...

As a lot of subtitles contain HTML, it’s important to remove tags before further processing; that’s very easy to do with lxml:

import lxml.html

def to_text(raw_text):
    return lxml.html.document_fromstring(raw_text).text_content()
to_text('<i>That shark is ruining</i>')
'That shark is ruining'

To find the most significant words in the text, we need to tokenize it, lemmatize it (replace every inflected form of a word with a common base form), and remove stop words. That’s easy to do with NLTK:

from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from nltk.stem import WordNetLemmatizer

def tokenize_lemmatize(text):
    tokens = word_tokenize(text)
    lemmatizer = WordNetLemmatizer()
    lemmatized = [lemmatizer.lemmatize(token.lower())
                  for token in tokens if token.isalpha()]
    stop_words = set(stopwords.words("english"))
    return [lemma for lemma in lemmatized if lemma not in stop_words]
>>> tokenize_lemmatize('That shark is ruining')
['shark', 'ruining']

And after that we can just combine the previous two functions and find most frequently used words:

from collections import Counter

def get_most_popular(captions):
    full_text = '\n'.join(to_text(caption.get_text()) for caption in captions)
    tokens = tokenize_lemmatize(full_text)
    return Counter(tokens)
    
  
most_popular = get_most_popular(captions)
most_popular
Counter({'shark': 68, 'oh': 32, 'bob': 29, 'yeah': 25, 'right': 20,...

It’s not the best way to find the most important words, but it kind of works.

After that it’s straightforward to extract keywords from a single caption:

def get_keywords(most_popular, text, n=2):
    tokens = sorted(tokenize_lemmatize(text), key=lambda x: -most_popular[x])
    return tokens[:n]
>>> captions[127].get_text()
'Teddy, what is wrong with you?'
>>> get_keywords(most_popular, to_text(captions[127].get_text()))
['teddy', 'wrong']

The next step is to find a stock image for those keywords. There aren’t that many properly working and well-documented stock image APIs, so I chose to use the Shutterstock API. It’s limited to 250 requests per hour, but that’s enough to play with.

From their API we only need to use /images/search. We will search for the most popular photo:

import requests

# Key and secret of your app
stock_key = ''
stock_secret = ''

def get_stock_image_url(query):
    response = requests.get(
        "https://api.shutterstock.com/v2/images/search",
        params={
            'query': query,
            'sort': 'popular',
            'view': 'minimal',
            'safe': 'false',
            'per_page': '1',
            'image_type': 'photo',
        },
        auth=(stock_key, stock_secret),
    )
    data = response.json()
    try:
        return data['data'][0]['assets']['preview']['url']
    except (IndexError, KeyError):
        return None
>>> get_stock_image_url('teddy wrong')
'https://image.shutterstock.com/display_pic_with_logo/2780032/635833889/stock-photo-guilty-boyfriend-asking-for-forgiveness-presenting-offended-girlfriend-a-teddy-bear-toy-lady-635833889.jpg'

The image looks relevant:

teddy wrong

Now we can create a proper card from a caption:

def make_slide(most_popular, caption):
    text = to_text(caption.get_text())
    if not text:
        return None

    keywords = get_keywords(most_popular, text)
    query = ' '.join(keywords)
    if not query:
        return None

    stock_image = get_stock_image_url(query)
    if not stock_image:
        return None

    return text, stock_image
make_slide(most_popular, captions[132])
('He really chewed it...\nwith his shark teeth.', 'https://image.shutterstock.com/display_pic_with_logo/181702384/710357305/stock-photo-scuba-diver-has-shark-swim-really-close-just-above-head-as-she-faces-camera-below-710357305.jpg')

The image is kind of relevant:

He really chewed it...with his shark teeth.

After that we can select captions that we want to put in our filmstrip and generate html like the one in the TLDR section:

output_path = 'burgers.html'
start_slide = 98
end_slide = 200


def make_html_output(slides):
    html = '<html><head><link rel="stylesheet" href="./style.css"></head><body>'
    for (text, stock_image) in slides:
        html += f'''<div class="box">
            <img src="{stock_image}" />
            <span>{text}</span>
        </div>'''
    html += '</body></html>'
    return html


interesting_slides = [make_slide(most_popular, caption)
                      for caption in captions[start_slide:end_slide]]
interesting_slides = [slide for slide in interesting_slides if slide]

with open(output_path, 'w') as f:
    output = make_html_output(interesting_slides)
    f.write(output)

And the result - burgers.html.

Another example, even worse and a bit NSFW, It’s Always Sunny in Philadelphia – Charlie Catches a Leprechaun.

Gist with the sources.

June 19, 2018 12:23 AM

June 18, 2018


Django Weblog

Django 2.1 beta 1 released

Django 2.1 beta 1 is now available. It represents the second stage in the 2.1 release cycle and is an opportunity for you to try out the changes coming in Django 2.1.

Django 2.1 has a smorgasbord of new features which you can read about in the in-development 2.1 release notes.

Only bugs in new features and regressions from earlier versions of Django will be fixed between now and 2.1 final (also, translations will be updated following the "string freeze" when the release candidate is issued). The current release schedule calls for a release candidate a month from now, with the final release to follow about two weeks after that, around August 1. Early and often testing from the community will help minimize the number of bugs in the release. Updates on the release schedule are available on the django-developers mailing list.

As with all alpha and beta packages, this is not for production use. But if you'd like to take some of the new features for a spin, or to help find and fix bugs (which should be reported to the issue tracker), you can grab a copy of the beta package from our downloads page or on PyPI.

The PGP key ID used for this release is Tim Graham: 1E8ABDC773EDE252.

June 18, 2018 11:58 PM


NumFOCUS

NumFOCUS 2018 Google Summer of Code Cohort, Part 2

The post NumFOCUS 2018 Google Summer of Code Cohort, Part 2 appeared first on NumFOCUS.

June 18, 2018 02:30 PM


Real Python

The Ultimate List of Python YouTube Channels

We couldn’t find a good, up-to-date list of Python developer or Python programming YouTube channels online.

Learning Python on YouTube is a viable option these days, and we’re excited about what this new medium can do for programming education.

There are some really good YouTube channels that focus on Python development out there, but we just couldn’t find a list that was both comprehensive and up-to-date. So we created our own with the best and most Pythonic YouTubers.

We initially wrote this list based on information we gathered by reading forum posts and searching YouTube for Python channels directly. We’ll continue to add to the list with your feedback. We plan to keep this list updated, so feel free to leave a comment at the end of the page or tweet at us if you think anything is missing or if you’d like to see your own YouTube Python tutorials added.

In order for a channel to be included in our list, it must:

  • Focus on Python tutorials
  • Not be brand-new (> 2,000 subscribers)
  • Be active (new videos are coming out) OR have an interesting archive with old content worth watching

Enjoy the Python goodness! 📹🐍

Al Sweigart

“Tons of sweet computer related tutorials and some other awesome videos too!”

Anaconda Inc.

“With over 4.5 million users, Anaconda is the world’s most popular Python data science platform. Anaconda, Inc. continues to lead open source projects like Anaconda, NumPy and SciPy that form the foundation of modern data science. Anaconda’s flagship product, Anaconda Enterprise, allows organizations to secure, govern, scale and extend Anaconda to deliver actionable insights that drive businesses and industries forward.”

In addition to their company developed videos, including a fun lego-mation and Pyception short, Anaconda’s YouTube channel contains all the videos from AnacondaCon, a gathering of the passionate community of data scientists, IT professionals, analysts, developers, and business leaders all using the Anaconda distribution.

Clever Programmer

“You can find awesome programming lessons here! Also, expect programming tips and tricks that will take your coding skills to the next level.”

CodingEntrepreneurs

“Coding for Entrepreneurs is a Programming Series for Non-Technical Founders. Learn Django, Python, APIs, Accepting Payments, Stripe, jQuery, Twitter Bootstrap, and much more.”

Corey Schafer

“This channel is focused on creating tutorials and walkthroughs for software developers, programmers, and engineers. We cover topics for all different skill levels, so whether you are a beginner or have many years of experience, this channel will have something for you.”

Chris Hawkes

“We’re going to learn about programming, web design, responsive web design, Reactjs, Django, Python, web scraping, games, forms applications and more!”

CS Dojo

“Hey everyone! My name is YK, and I make videos mostly about programming and computer science here.”

Data School (Kevin Markham)

“You’re trying to launch your career in data science, and I want to help you reach that goal! My in-depth tutorials will help you to master crucial data science topics using open source tools like Python and R.”

David Beazley

“An archive of David Beazley’s conference, user group, and training talks.”

Enthought

“For more than 15 years, Enthought has built AI solutions with science and engineering at the core. We accelerate digital transformation by enabling companies and their people to leverage the benefits of Artificial Intelligence and Machine Learning.”

Additionally, Enthought is best known for the early development, maintenance, and continued support of SciPy, as well as the primary sponsor for the SciPy US and EuroSciPy Conferences. In addition to the company developed content, this channel provides all the video recordings from the SciPy US and EuroScipy (before 2016) Conferences, talks and tutorials specifically focused on the advancement of scientific computing through open source Python software for mathematics, science, and engineering.

Michael Kennedy (Talk Python)

“Videos, demos, and lectures about programming - especially Python and web topics.”

Practical Python

“Expect unexpected.”

PrettyPrinted

“I’m Anthony. I make programming videos.”

PyCon Session Recordings

These are all PyCon talk and session recordings made available on YouTube. There’s no single channel that combines these. Instead, you’ll access each year’s videos on a separate “PyCon 20…” channel. Alternatively, you can use PyVideo.org to watch the session recordings.

PyData

“PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.”

Python Training by Dan Bader

“Python tutorials and training videos for Pythonistas that go beyond the basics. On this channel you’ll get new Python videos and screencasts every week. They’re bite-sized and to the point so you can fit them in with your day and pick up new Python skills on the side.”

Sentdex (Harrison Kinsley)

“Python Programming tutorials, going further than just the basics. Learn about machine learning, finance, data analysis, robotics, web development, game development and more.”

Siraj Raval

“I’m Siraj. I’m on a warpath to inspire and educate developers to build Artificial Intelligence. Games, music, chatbots, art, i’ll teach you how to make it all yourself.”

Socratica

“Socratica makes high-quality educational videos on math and science. New videos every week! We’re a couple of Caltech grads who believe you deserve better videos. You’ll learn more with us!”

TheNewBoston (Bucky Roberts)

“Tons of sweet computer related tutorials and some other awesome videos too!”

Smaller Python Conferences

The following channels provide the tutorials, talks, and session recordings from many of the smaller local Python conferences held throughout the world.

Though on their own, most of these channels do not meet the requirement for 2000 subscribers, we list them here as honorable mentions because they represent the diverse Python community throughout the world.

Note that some (maybe older) videos from these conferences are available (together with other non-Python content) on the Next Day Video and Engineers.SG channels. Alternatively, PyVideo.org can serve as a one-stop-shop where you can find most (but not all) of these session recordings.

If you think anything is missing from this list or if you’d like to see your own Python YouTube channel added, then please leave a comment below or tweet at us.



June 18, 2018 02:00 PM


Mike Driscoll

ReportLab: PDF Processing with Python is now Available!

My latest book, ReportLab: PDF Processing with Python, is now available for purchase.

ReportLab has been around since the year 2000 and has remained the primary package that Python developers use for creating reports in the PDF format. It is an extremely powerful package that works across all the major platforms. This book will also introduce the reader to other Python PDF packages.

You can get the book at the following online retailers:

June 18, 2018 12:30 PM


Andre Roberge

Approximate fun


Newest addition to https://github.com/aroberge/experimental


> python -m experimental                                                
experimental console version 0.9.6. [Python version: 3.6.1]             
                                                                        
~~> from __experimental__ import approx                                 
~~> 0.1 + 0.2                                                           
0.30000000000000004                                                     
~~> 0.1 + 0.2 == 0.3                                                    
False                                                                   
~~> # Attempt to use approximate comparison without defining tolerances
~~> 0.1 + 0.2 ~= 0.3                                                    
Traceback (most recent call last):                                      
  File "<console>", line 1, in <module>                                 
NameError: name 'rel_tol' is not defined                                
~~> rel_tol = abs_tol = 1e-8                                            
~~> 0.1 + 0.2 ~= 0.3                                                    
True                                                                    
~~> 2**0.5 ~= 1.414                                                     
False                                                                   
~~> abs_tol = 0.001                                                     
~~> 2**0.5 ~= 1.414                                                     
True                                                   
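The `~=` comparisons in the session above appear to behave like the standard library's `math.isclose()`, which takes the same `rel_tol` and `abs_tol` parameters. As a rough sketch of the equivalent plain-Python code (assuming that's indeed the underlying semantics):

```python
import math

# The same tolerances the console session defined.
rel_tol = abs_tol = 1e-8

# 0.1 + 0.2 is not exactly 0.3, but it is approximately equal.
print(math.isclose(0.1 + 0.2, 0.3, rel_tol=rel_tol, abs_tol=abs_tol))  # True

# sqrt(2) differs from 1.414 by about 2e-4, well outside a 1e-8 tolerance.
print(math.isclose(2 ** 0.5, 1.414, rel_tol=rel_tol, abs_tol=abs_tol))  # False

# A looser absolute tolerance accepts the rounded value.
print(math.isclose(2 ** 0.5, 1.414, rel_tol=rel_tol, abs_tol=0.001))  # True
```

The charm of the `__experimental__` version is that the tolerances are picked up from surrounding variables rather than passed explicitly on every comparison.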

June 18, 2018 08:36 AM


Mike Driscoll

PyDev of the Week: Qumisha Goss

This week we welcome Qumisha Goss (@QatalystGoss) as our PyDev of the Week. Q is a librarian from Detroit who gave one of the best keynotes I’ve ever seen at PyCon US this year. For some reason, the people who uploaded the keynotes from that morning didn’t separate the keynotes from each other or from the morning’s lightning talks, so you have to seek about two-thirds of the way through the official video to find Q’s keynote:

I personally think you should take a few moments and watch the video. But if you don’t have the time, you can still read this brief interview with this amazing person.

Can you tell us a little about yourself (hobbies, education, etc):

Qumisha Goss, I go by Q. I’m a Librarian at the Detroit Public Library. I studied History and Classical Studies at Calvin College. I was obsessed with mythology and then with the engineering of the Roman Empire. I wanted to be an Engineer, then an Archivist, and now I’m a Librarian.

Why did you start using Python?

I started using Python after I was encouraged to start a kids’ programming class at the Library. I started out using Hour of Code and Code.org resources, but as the kids got bored with that, I taught myself Python to be able to teach them something a little harder and more resilient.

What other programming languages do you know and which is your favorite?

Python is really my only language. I have played around with SQL a little bit, though, for regular library business.

What projects are you working on now?

Currently I’m working on a Parkman Coders summer program. This summer the program is called Code:Grow, where we will be encouraging the kids to go outside and plant a garden, and then use code to monitor their garden by making a time-lapse camera and programming soil moisture sensors.

Which Python libraries are your favorite (core or 3rd party)?

  • PyGame, because I work with kids.
  • SQLAlchemy

What top three things have you learned contributing to open source projects?

  • The Python community is very encouraging.
  • No one knows everything
  • It’s okay to ask for help and get it.

What is your motivation for working in open source?

I personally believe that learning should be free and open, and open source gives the opportunity to level the playing field for some who may not normally have the opportunity to access the resources they need.

Is there anything else you’d like to say?

We work a lot with physical computing at the Library because we found that having something besides a computer to put their hands on encourages the learning process. So we often use Raspberry Pis and micro:bits.

Thanks for doing the interview!

June 18, 2018 05:05 AM

June 17, 2018


Marcos Dione

identity-countries-languages-and-currencies

I started watching PyCon's videos. One of the first ones I saw is Amber Brown's "How we do identity wrong". I think she[1] is right in raising not only the notion of not assuming things related to names, addresses and ID numbers, but also that you shouldn't be collecting information that you don't need; at some point, it becomes a liability.

In the same vein about assuming, I have more examples. One of them is deciding what language to show your site in depending on what country the client connects from. I'm not a millennial (more like a transmillennial, if you push me to it), but I tend to go places. Every time I go to a new place, I get sites in new languages, but maps in US!

Today I wanted to book a hotel room. The hotel's site asked me where I live, so I chose France. Fact is, for them country and language are the same thing (I wonder what would happen if I answered Schweiz/Suisse/Svizzera/Svizra), so I couldn't say that I live in France but prefer English; I chose United Kingdom instead. Of course, this also meant that I got prices in GBP, not EUR, so I had to correct that one too. At least I could.

Later they asked me for country of residence and nationality; when I chose Italian, the country was set to Italia, even though I had chosen France first!

I leave you all with an anecdote. As I said, I like to go places, most of the time with friends. Imagine the puzzled expression of the police officer who stopped us to find a car licensed in France, driven by an Italian, with an Argentinian, a Spanish, and a Chilean passenger, crossing from Austria to Slovakia, listening to US music. I only forgot to put the GPS in Japanese or something.

So, don't assume; if you assume, let the user change settings to their preferences, and don't ask for data you don't actually need. And please use the user's Accept-Language header; they have it for a reason.
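Honoring Accept-Language is not much code. Here's a minimal sketch of picking a supported language from the header (a hypothetical helper; it ignores wildcards and other edge cases of the full HTTP content-negotiation spec):

```python
def preferred_language(accept_language, supported):
    """Pick the best supported language from an Accept-Language header."""
    choices = []
    for item in accept_language.split(","):
        parts = item.strip().split(";")
        lang = parts[0].strip().lower()
        quality = 1.0  # q defaults to 1.0 when absent
        for param in parts[1:]:
            param = param.strip()
            if param.startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0
        choices.append((quality, lang))
    # Try the client's languages from most to least preferred.
    for _, lang in sorted(choices, reverse=True):
        if lang in supported:
            return lang
        primary = lang.split("-")[0]  # fall back from "fr-FR" to "fr"
        if primary in supported:
            return primary
    return supported[0]  # site default as a last resort

print(preferred_language("fr-FR,fr;q=0.9,en;q=0.8", ["en", "fr", "de"]))  # fr
```

Most web frameworks ship something like this already (e.g. request language negotiation); the point is to use it instead of guessing from the IP address.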


[1] I think that's the pronoun she said she preferred. I'm sorry if I got that wrong.


python misc

June 17, 2018 04:06 PM