
Planet Python

Last update: October 27, 2016 09:47 PM

October 27, 2016

Peter Bengtsson

Django test optimization with no-op PIL engine

The Air Mozilla project is a regular Django webapp. It's reasonably big for a more or less one man project. It's ~200K lines of Python and ~100K lines of JavaScript. There are 816 "unit tests" at the time of writing. Most of them are kinda typical Django tests. Like:

def test_some_feature(self):
    thing = MyModel.objects.create(key='value')
    url = reverse('namespace:name', args=())
    response = self.client.get(url)
    self.assertEqual(response.status_code, 200)

Also, the site uses sorl.thumbnail to automatically generate thumbnails from uploaded images. It's a great library.
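
In Python code, generating a thumbnail goes through sorl.thumbnail's get_thumbnail function - a quick sketch; the thumbnail_url helper and its image_field argument are made up for illustration:

from sorl.thumbnail import get_thumbnail

def thumbnail_url(image_field):
    # image_field would be e.g. a model's ImageField value
    thumb = get_thumbnail(image_field, '300x200', crop='center', quality=85)
    return thumb.url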

However, when running tests, you almost never actually care about the image itself. Your eyes will never feast on them. All you care about is that there is an image, that it was resized and that nothing broke. You don't write tests that check the new image dimensions of a generated thumbnail. If you need tests that go into that kind of detail, they best belong somewhere else.

So, I thought, why not fake ALL the resizing and cropping operations that happen inside sorl.thumbnail?

Here's the changeset that does it. Note that the trick is to override the default THUMBNAIL_ENGINE that sorl.thumbnail loads. It usually defaults to sorl.thumbnail.engines.pil_engine.Engine, so I just wrote my own that does no-ops in almost every instance.

I admittedly threw it together quite quickly just to see if it was possible. Turns out, it was.

# Depends on setting something like:
#    THUMBNAIL_ENGINE = 'airmozilla.base.tests.testbase.FastSorlEngine'
# in your settings specifically for running tests.

from sorl.thumbnail.engines.base import EngineBase

class _Image(object):
    def __init__(self):
        self.size = (1000, 1000)
        self.mode = 'RGBA' = '\xa0'

class FastSorlEngine(EngineBase):

    def get_image(self, source):
        return _Image()

    def get_image_size(self, image):
        return image.size

    def _colorspace(self, image, colorspace):
        return image

    def _scale(self, image, width, height):
        image.size = (width, height)
        return image

    def _crop(self, image, width, height, x_offset, y_offset):
        image.size = (width, height)
        return image

    def _get_raw_data(self, image, *args, **kwargs):

    def is_valid_image(self, raw_data):
        return bool(raw_data)
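
To point the test run at the fake engine, one option is Django's override_settings decorator - a minimal sketch, reusing the dotted path from the comment above:

from django.test import TestCase, override_settings

@override_settings(
    THUMBNAIL_ENGINE='airmozilla.base.tests.testbase.FastSorlEngine'
)
class ThumbnailTests(TestCase):
    def test_thumbnails_dont_break(self):
        # exercise whatever code path generates thumbnails here;
        # sorl.thumbnail will now use the no-op engine
        pass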

So, was it much faster?

It's hard to measure because the time it takes to run the whole test suite depends on other stuff going on on my laptop during the long time it takes to run the tests. So I ran them 8 times with the old code and 8 times with this new hack.

Iteration    Before      After
1            82.789s     73.519s
2            82.869s     67.009s
3            77.100s     60.008s
4            74.642s     58.995s
5            109.063s    80.333s
6            100.452s    81.736s
7            85.992s     61.119s
8            82.014s     73.557s
Average      86.865s     69.535s
Median       82.869s     73.519s
Std Dev      11.826s     9.0757s

So, roughly 11% faster going by the medians (closer to 20% by the averages). Not a lot, but it adds up when you're doing test-driven development or debugging, where you run a suite or a test over and over as you're saving the files/tests you're working on.

Room for improvement

In my case, it just worked with this simple solution. Your site might do fancier things with the thumbnails. Perhaps we can combine forces on this and finalize a working solution into a standalone package.

October 27, 2016 12:34 PM

Glyph Lefkowitz

What Am Container

Perhaps you are a software developer.

Perhaps, as a developer, you have recently become familiar with the term "containers".

Perhaps you have heard containers described as something like "LXC, but better", "an application-level interface to cgroups" or "like virtual machines, but lightweight", or perhaps (even less usefully), a function call. You've probably heard of "docker"; do you wonder whether a container is the same as, different from, or part of Docker?

Are you bewildered by the blisteringly fast-paced world of "containers"? Maybe you have no trouble understanding what they are - in fact you might be familiar with half a dozen orchestration systems and container runtimes already - but are frustrated because this seems like a whole lot of work and you just don't see what the point of it all is?

If so, this article is for you.

I'd like to lay out what exactly the point of "containers" is, why people are so excited about them, and what makes the ecosystem around them so confusing. Unlike my previous writing on the topic, I'm not going to assume you know anything about the ecosystem in general; just that you have a basic understanding of how UNIX-like operating systems separate processes, files, and networks.[1]

At the dawn of time, a computer was a single-tasking machine. Somehow, you'd load your program into main memory, and then you'd turn it on; it would run the program, and (if you're lucky) spit out some output onto paper tape.

When a program running on such a computer looked around itself, it could "see" the core memory of the computer it was running on, any attached devices, including consoles, printers, teletypes, or (later) networking equipment. This was of course very powerful - the program had full control of everything attached to the computer - but also somewhat limiting.

This mode of addressing hardware was limiting because it meant that programs would break the instant you moved them to a new computer. They had to be re-written to accommodate new amounts and types of memory, new sizes and brands of storage, and new types of networks. If a program had to contain within itself the full knowledge of every piece of hardware that it might ever interact with, it would be very expensive indeed.

Also, if all the resources of a computer were dedicated to one program, then you couldn't run a second program without stomping all over the first one - crashing it by mangling its structures in memory, or deleting its data by overwriting it on disk.

So, programmers cleverly devised a way of indirecting, or "virtualizing", access to hardware resources. Instead of a program simply addressing all the memory in the whole computer, it got its own little space where it could address its own memory - an address space, if you will. If a program wanted more memory, it would ask a supervising program - what we today call a "kernel" - to give it some more memory. This made programs much simpler: instead of memorizing the address offsets where a particular machine kept its memory, a program would simply begin by saying "hey operating system, give me some memory", and then it would access the memory in its own little virtual area.

In other words: memory allocation is just virtual RAM.

Virtualizing memory - i.e. ephemeral storage - wasn't enough; in order to save and transfer data, programs also had to virtualize disk - i.e. persistent storage. Whereas a whole-computer program would just seek to position 0 on the disk and start writing data to it however it pleased, a program writing to a virtualized disk - or, as we might call it today, a "file" - first needed to request a file from the operating system.

In other words: file systems are just virtual disks.

Networking was treated in a similar way. Rather than addressing the entire network connection at once, each program could allocate a little slice of the network - a "port". That way a program could, instead of consuming all network traffic destined for the entire machine, ask the operating system to just deliver it all the traffic for, say, port number seven.

In other words: listening ports are just virtual network cards.
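
To make the analogy concrete in Python terms (a tiny sketch; port 7 is the classic "echo" port):

import socket

# ask the OS to deliver only the traffic destined for port 7
# (binding a port below 1024 typically requires elevated privileges)
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(('', 7))
server.listen(1)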

Getting bored by all this obvious stuff yet? Good. One of the things that frustrates me the most about containers is that they are an incredibly obvious idea that is just a logical continuation of a trend that all programmers are intimately familiar with.

All of these different virtual resources exist for the same reason: as I said earlier, if two programs need the same resource to function properly, and they both try to use it without coordinating, they'll both break horribly.[2]

UNIX-like operating systems more or less virtualize RAM correctly. When one program grabs some RAM, nobody else - modulo super-powered administrative debugging tools - gets to use it without talking to that program. It's extremely clear which memory belongs to which process. If programs want to use shared memory, there is a very specific, opt-in protocol for doing so; it is basically impossible for it to happen by accident.

However, the abstractions we use for disks (filesystems) and network cards (listening ports and addresses) are significantly more limited. Every program on the computer sees the same file-system. The program itself, and the data the program stores, both live on the same file-system. Every program on the computer can see the same network information, can query everything about it, and can receive arbitrary connections. Permissions can remove certain parts of the filesystem from view (i.e. programs can opt-out) but it is far less clear which program "owns" certain parts of the filesystem; access must be carefully controlled, and sometimes mediated by administrators.

In particular, the way that UNIX manages filesystems creates an environment where "installing" a program requires manipulating state in the same place (the filesystem) where other programs might require different state. Popular package managers on UNIX-like systems (APT, RPM, and so on) rarely have a way to separate program installation even by convention, let alone by strict enforcement. If you want to do that, you have to re-compile the software with ./configure --prefix to hard-code a new location. And, fundamentally, this is why the package managers don't support installing to a different place: if the program can tell the difference between different installation locations, then it will, because its developers thought it should go in one place on the file system, and why not hard code it? It works on their machine.

In order to address this shortcoming of the UNIX process model, the concept of "virtualization" became popular. The idea of virtualization is simple: you write a program which emulates an entire computer - with its own storage media, network devices, and so on - and then you install an operating system on it. This completely resolves the over-sharing of resources: a process inside a virtual machine is in a very real sense running on a different computer than programs running on a different virtual machine on the same physical device.

However, virtualization is also an extremely heavy-weight blunt instrument. Since virtual machines are running operating systems designed for physical machines, they have tons of redundant hardware-management code, and enormous amounts of operating system data which could be shared with the host; but since it's in the form of a disk image totally managed by the virtual machine's operating system, the host can't really peek inside to optimize anything. It also makes other kinds of intentional resource sharing very hard: any software to manage the host needs to be installed on the host, since if it is installed on the guest it won't have full access to the host's hardware.

I hate using the term "heavy-weight" when I'm talking about software - it's often bandied about as a content-free criticism - but the difference in overhead between running a virtual machine and a process is the difference between gigabytes and kilobytes; somewhere between 4-6 orders of magnitude. That's a huge difference.

This means that you need to treat virtual machines as multi-purpose, since one VM is too big to run just a single small program. Which means you often have to manage them almost as if they were physical hardware.

When we run a program on a UNIX-like operating system, and by so running it, grant it its very own address space, we call the entity that we just created a "process".

This is how to understand a "container".

A "container" is what we get when we run a program and give it not just its own memory, but its own whole virtual filesystem and its own whole virtual network card.

The metaphor to processes isn't perfect, because a container can contain multiple processes with different memory spaces that share a single filesystem. But this is also where some of the "container ecosystem" fervor begins to creep in - this is why people interested in containers will religiously exhort you to treat a container as a single application, not to run multiple things inside it, not to SSH into it, and so on. This is because the whole point of containers is that they are lightweight - far closer in overhead to the size of a process than that of a virtual machine.

A process inside a container, if it queries the operating system, will see a computer where only it is running, where it owns the entire filesystem, and where any mounted disks were explicitly put there by the administrator who ran the container. In other words, if it wants to share data with another application, it has to be given the shared data; opt-in, not opt-out, the same way that memory-sharing is opt-in in a UNIX-like system.

So why is this so exciting?

In a sense, it really is just a lower-overhead way to run a virtual machine, as long as it shares the same kernel. That's not super exciting, by itself.

The reason that containers are more exciting than processes is the same reason that using a filesystem is more exciting than having to use a whole disk: sharing state always, inevitably, leads to brokenness. Opt-in is better than opt-out.

When you give a program a whole filesystem to itself, sharing any data explicitly, you eliminate even the possibility that some other program scribbling on a shared area of the filesystem might break it. You don't need package managers any more, only package installers; by removing the other functions of package managers (inventory, removal) they can be radically simplified, and less complexity means less brokenness.

When you give a program an entire network address to itself, exposing any ports explicitly, you eliminate even the possibility that some rogue program will expose a security hole by listening on a port you weren't expecting. You eliminate the possibility that it might clash with other programs on the same host, hard-coding the same port numbers or auto-discovering the same addresses.

In addition to the exciting things on the run-time side, containers - or rather, the things you run to get containers, "images"[3] - present some compelling improvements to the build-time side.

On Linux and Windows, building a software artifact for distribution to end-users can be quite challenging. It's challenging because it's not clear how to specify that you depend on certain other software being installed; it's not clear what to do if you have conflicting versions of that software that may not be the same as the versions already available on the user's computer. It's not clear where to put things on the filesystem. On Linux, this often just means getting all of your software from your operating system distributor.

You'll notice I said "Linux and Windows"; not the usual (linux, windows, mac) big-3 desktop platforms, and I didn't say anything about mobile OSes. That's because on macOS, Android, iOS, and Windows Metro, applications already run in their own containers. The rules of macOS containers are a bit weird, and very different from Docker containers, but if you have a Mac you can check out ~/Library/Containers to see the view of the world that the applications you're running can see. iOS looks much the same.

This is something that doesn't get discussed a lot in the container ecosystem, partially because everyone is developing technology at such a breakneck pace, but in many ways Linux server-side containerization is just a continuation of a trend that started on mainframe operating systems in the 1970s and has already been picked up in full force by mobile operating systems.

When one builds an image, one is building a picture of the entire filesystem that the container will see, so an image is a complete artifact. By contrast, a package for a Linux package manager is just a fragment of a program, leaving out all of its dependencies, to be integrated later. If an image runs on your machine, it will (except in some extremely unusual circumstances) run on the target machine, because everything it needs to run is fully included.

Because you build all the software an image requires into the image itself, there are some implications for server management. You no longer need to apply security updates to a machine - they get applied to one application at a time, and they get applied as a normal process of deploying new code. Since there's only one update process, which is "delete the old container, run a new one with a new image", updates can roll out much faster, because you can build an image, run tests for the image with the security updates applied, and be confident that it won't break anything. No more scheduling maintenance windows, or managing reboots (at least for security updates to applications and libraries; kernel updates are a different kettle of fish).

That's why it's exciting. So why's it all so confusing?[5]

Fundamentally the confusion is caused by there just being way too many tools. Why so many tools? Once you've accepted that your software should live in images, none of the old tools work any more. Almost every administrative, monitoring, or management tool for UNIX-like OSes depends intimately upon the ability to promiscuously share the entire filesystem with every other program running on it. Containers break these assumptions, and so new tools need to be built. Nobody really agrees on how those tools should work, and a wide variety of forces ranging from competitive pressure to personality conflicts make it difficult for the panoply of container vendors to collaborate perfectly.[4]

Many companies whose core business has nothing to do with infrastructure have gone through this reasoning process:

  1. Containers are so much better than processes, we need to start using them right away, even if there's some tooling pain in adopting them.
  2. The old tools don't work.
  3. The new tools from the tool vendors aren't ready.
  4. The new tools from the community don't work for our use-case.
  5. Time to write our own tool, just for our use-case and nobody else's! (Which causes problem #3 for somebody else, of course...)

A less fundamental reason is too much focus on scale. If you're running a small-scale web application which has a stable user-base that you don't expect a lot of growth in, there are many great reasons to adopt containers as opposed to automating your operations; and in fact, if you keep things simple, the very fact that your software runs in a container might obviate the need for a system-management solution like Chef, Ansible, Puppet, or Salt. You should totally adopt them and try to ignore the more complex and involved parts of running an orchestration system.

However, containers are even more useful at significant scale, which means that companies which have significant scaling problems invest in containers heavily and write about them prolifically. Many guides and tutorials on containers assume that you expect to be running a multi-million-node cluster with fully automated continuous deployment, blue-green zero-downtime deploys, and a 1000-person operations team. It's great if you've got all that stuff, but building each of those components is a non-trivial investment.

So, where does that leave you, my dear reader?

You should absolutely be adopting "container technology", which is to say, you should probably at least be using Docker to build your software. But there are other, radically different container systems - like Sandstorm - which might make sense for you, depending on what kind of services you create. And of course there's a huge ecosystem of other tools you might want to use; too many to mention, although I will shout out to my own employer's docker-as-a-service Carina, which delivered this blog post, among other things, to you.

You shouldn't feel as though you need to do containers absolutely "the right way", or that the value of containerization is derived from adopting every single tool that you can all at once. The value of containers comes from four very simple things:

  1. It reduces the overhead and increases the performance of co-locating multiple applications on the same hardware,
  2. It forces you to explicitly call out any shared state or required resources,
  3. It creates a complete build pipeline that results in a software artifact that can be run without special installation or set-up instructions (at least, on the "software installation" side; you still might require configuration, of course), and
  4. It gives you a way to test exactly what you're deploying.

These benefits can combine and interact in surprising and interesting ways, and can be enhanced with a wide and growing variety of tools. But underneath all the hype and the buzz, the very real benefit of containerization is basically just that it is fixing a very old design flaw in UNIX.

Containers let you share less state, and shared mutable state is the root of all evil.

  1. If you have a more sophisticated understanding of memory, disks, and networks, you'll notice that everything I'm saying here is patently false, and betrays an overly simplistic understanding of the development of UNIX and the complexities of physical hardware and driver software. Please believe that I know this; this is an alternate history of the version of UNIX that was developed on platonically ideal hardware. The messy co-evolution of UNIX, preemptive multitasking, hardware offload for networks, magnetic secondary storage, and so on, is far too large to fit into the margins of this post. 

  2. When programs break horribly like this, it's called "multithreading". I have written some software to help you avoid it. 

  3. One runs an "executable" to get a process; one runs an "image" to get a container. 

  4. Although the container ecosystem is famously acrimonious, companies in it do actually collaborate better than the tech press sometimes give them credit for; the Open Container Project is a significant extraction of common technology from multiple vendors, many of whom are also competitors, to facilitate a technical substrate that is best for the community. 

  5. If it doesn't seem confusing to you, consider this absolute gem from the hilarious folks over at CircleCI. 

October 27, 2016 09:23 AM

Talk Python to Me

#82 Grokking Algorithms in Python

Algorithms underpin almost everything we do in programming and in problem solving in general. Yet many of us have partial or incomplete knowledge of the most important and common ones. In this episode, you'll meet Adit Bhargava, the author of the light and playful Grokking Algorithms: An Illustrated Guide. If you have struggled to understand and learn the key algorithms, this episode is for you.

Links from the show:

Adit on the web
Book: Grokking Algorithms: An Illustrated Guide
Grokking Algorithms GitHub
Adit on Twitter: @_egonschiele
High perf search of Talk Python

October 27, 2016 08:00 AM

Gocept Weblog

Towards RestrictedPython 3

The biggest blocker to porting Zope to Python 3 is RestrictedPython.

What is RestrictedPython?

It is a library used by Zope to restrict Python code at the instruction level to a bare minimum of trusted functionality. It parses and filters the code for disallowed constructs (such as open()) and adds wrappers around each access to attributes or items. These wrappers can be used by Zope to enforce access control on objects in the ZODB without requiring manual checks in the code.
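
For example, compiling and running a snippet of untrusted code with the current (Python 2) RestrictedPython looks roughly like this - a sketch, with the guard setup reduced to the bare minimum:

from RestrictedPython import compile_restricted
from RestrictedPython.Guards import safe_builtins

source = "result = 2 + 2"

# some disallowed constructs are rejected at compile time
code = compile_restricted(source, filename='<untrusted>', mode='exec')

# whatever survives compilation runs against a minimal set of builtins
# rather than the real ones, so e.g. open() is simply unavailable
restricted_globals = {'__builtins__': safe_builtins}
exec(code, restricted_globals)
print(restricted_globals['result'])  # 4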

Why is RestrictedPython needed?

Zope allows writing Python code in the Zope management interface (ZMI) using a web browser (“through the web”, aka TTW). This code is stored in the ZODB and executed on the server. It would be dangerous to allow a user to execute arbitrary code with the rights of the web server process. That’s why the code is filtered through RestrictedPython, to make sure this approach is not a complete security hole.

RestrictedPython is used in many places in Zope as part of its security model. An experiment at the Zope Resurrection Sprint showed that it would be really hard to create a Zope version which does not need RestrictedPython, i.e. one that drops the TTW approach.

What is the problem porting RestrictedPython to Python 3?

RestrictedPython relies on the compiler package of the Python standard library. This package no longer exists in Python 3 because it was poorly documented, unmaintained and out of sync with the compiler Python uses itself. (There are whisperings that it was only kept because of Zope.)

Since Python 2.6 there has been a new ast module in the Python standard library, but it is not a direct replacement for compiler, and there is no documentation on how to replace compiler with ast.
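
The general direction is clear, though: ast exposes parse trees together with NodeVisitor/NodeTransformer classes for inspecting and rewriting them. A toy sketch of filtering for a disallowed construct such as open() on top of ast - not the actual RestrictedPython implementation - might look like this:

import ast

FORBIDDEN = {'open', 'eval', 'exec'}

class ForbiddenCallChecker(ast.NodeVisitor):
    # reject calls to disallowed builtins while walking the tree
    def visit_Call(self, node):
        if isinstance(node.func, ast.Name) and in FORBIDDEN:
            raise SyntaxError('%s() is not allowed' %
        self.generic_visit(node)

tree = ast.parse("open('/etc/passwd').read()")
ForbiddenCallChecker().visit(tree)  # raises SyntaxError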

What is the current status?

Several people have already worked at various Plone and Zope sprints, and mostly in their spare time, on a Python 3 branch of RestrictedPython to find out how this package works and to start porting some of its functionality as a proof of concept. It seems to be possible to use ast as the new base for RestrictedPython. The external API of RestrictedPython can probably be kept stable, but packages using or extending some of its internals might need to be updated as well.

What are the next steps?

Many Zope and Plone packages depend on RestrictedPython directly (like AccessControl or Products.ZCatalog) or indirectly (like Products.PythonScripts, or even

When RestrictedPython has successfully been tested against these packages, porting them can start. There is a nice list of all Plone 5.1 dependencies and their status regarding Python 3.

Our goal is to complete porting RestrictedPython by the end of March 2017. That opens up the possibility of guiding Zope into the Python 3 wonderland by the end of 2017. This is ambitious, especially as the work is done in spare time besides the daily customer work. You can help us by contributing pull requests via GitHub or reviewing them.

We are planning two Zope sprints in spring and autumn 2017. Furthermore we are grateful for each and every kind of support.

October 27, 2016 06:26 AM

Full Stack Python

Dialing Outbound Phone Calls with a Bottle Web App

Python web apps built with the Bottle web framework can send and receive SMS text messages. In this tutorial we will go beyond texting and learn how to dial outbound phone calls. The calls will read a snippet of text then play an MP3 file, but they can easily be modified to create conference lines and many other voice features in your Python web apps.

Tools We Need

You should have either Python 2 or 3 installed to create your Bottle app, although Python 3 is recommended for new applications. We also need:

Take a look at this guide on setting up Python 3, Bottle and Gunicorn on Ubuntu 16.04 LTS if you need help getting your development environment configured before continuing on through the remainder of this tutorial.

You can snag all the open source code for this tutorial in the python-bottle-phone GitHub repository under the outbound directory. Use and copy the code however you want - it's all open source under the MIT license.

Installing Our Application Dependencies

Our Bottle app needs a helper code library to make it easy to dial outbound phone calls. Bottle and the Twilio helper library are installable from PyPI into a virtualenv. Open your terminal and use the virtualenv command to create a new virtualenv:

virtualenv bottlephone

Use the activate script within the virtualenv, which makes this virtualenv the active Python installation. Note that you need to do this in every terminal window where you want this virtualenv to be used.

source bottlephone/bin/activate

The command prompt will change after activating the virtualenv to something like (bottlephone) $. Here is a screenshot of what my environment looked like when I used the activate script.

Next use the pip command to install the Bottle and Twilio Python packages into your virtualenv.

pip install bottle twilio

After the installation script finishes, we will have the required dependencies to build our app. Time to write some Python code to dial outbound phone calls.

Bottle and Twilio

Our simple Bottle web app will have three routes:

We can build the structure of our Bottle app and the first route right now. Create a new file named with the following contents to start our app.

import os
import bottle
from bottle import route, run, post, Response
from twilio import twiml
from import TwilioRestClient

app = bottle.default_app()
# plug in account SID and auth token here if they are not already exposed as
# environment variables
twilio_client = TwilioRestClient()

TWILIO_NUMBER = os.environ.get('TWILIO_NUMBER', '+12025551234')
NGROK_BASE_URL = os.environ.get('NGROK_BASE_URL', '')

@route('/')
def index():
    """Returns a standard text response to show the app is up and running."""
    return Response("Bottle app running!")

if __name__ == '__main__':
    run(host='localhost', port=8000, debug=False, reloader=True)

Make sure you are in the directory where you created the above file. Run the app via the Bottle development server with the following command. Make sure your virtualenv is still activated so our code can rely on the Bottle code library.

We should see a successful development server start up like this:

(bottlephone) matt@ubuntu:~/bottlephone$ python 
Bottle v0.12.9 server starting up (using WSGIRefServer())...
Listening on http://localhost:8000/
Hit Ctrl-C to quit.

Here is what the development server message looks like in my environment on Ubuntu:

Successfully starting the Bottle development server from the command line.

Let's test out the app by going to localhost:8000 in the web browser. We should get a simple success message that the app is running and responding to requests.

Simple success message in the web browser that the Bottle app is running.

Next we need to obtain a phone number that our Bottle app can use to call other phone numbers.

Obtain a Phone Number

Our basic Bottle web app is running but what we really want to do is dial outbound calls - which will be handled by Twilio.

In your web browser go to the Twilio website and sign up for a free account. You can also sign into your existing Twilio account if you already have one.

Twilio sign up screen.

The Twilio trial account allows you to dial and receive phone calls to your own validated phone number. To dial and receive calls from any phone number, you need to upgrade your account (hit the upgrade button on the top navigation bar to do that). Trial accounts are great for initial development before your application goes live, but upgraded accounts are where the real power comes in.

Once you are signed into your Twilio account, go to the manage phone numbers screen. On this screen you can buy one or more phone numbers or click on an existing phone number in your account to configure it.

Manage phone numbers screen.

There is nothing for us to configure right now on the phone number configuration page because we are making outbound phone calls for this tutorial. Now that we have a phone number in hand, let's add the final bit of code to our Bottle app to get this app working.

Making Phone Calls

We need to add two new routes to our Bottle app so it can dial outbound phone calls. Modify your existing file with the two new functions below, twiml_response and outbound_call. None of the other code in this file needs to change other than adding those two new functions to what we wrote in the previous section.

import os
import bottle
from bottle import route, run, post, Response
from twilio import twiml
from import TwilioRestClient

app = bottle.default_app()
# plug in account SID and auth token here if they are not already exposed as
# environment variables
twilio_client = TwilioRestClient()

# add your Twilio phone number here
TWILIO_NUMBER = os.environ.get('TWILIO_NUMBER', '+16093002984')
# plug in your Ngrok Forwarding URL - we'll set it up in a minute
NGROK_BASE_URL = os.environ.get('NGROK_BASE_URL', '')

@route('/')
def index():
    """Returns a standard text response to show the app is up and running."""
    return Response("Bottle app running!")

@post('/twiml')
def twiml_response():
    """Provides TwiML instructions in response to a Twilio POST webhook
    event so that Twilio knows how to handle the outbound phone call
    when someone picks up the phone.
    """
    response = twiml.Response()
    response.say("Sweet, this phone call is answered by your Bottle app!")
    # the MP3 URL was elided from this post; point play at any publicly
    # accessible MP3 file"", loop=10)
    return Response(str(response))

@route('/dial-phone/<outbound_phone_number>')
def outbound_call(outbound_phone_number):
    """Uses the Twilio Python helper library to send a POST request to
    Twilio telling it to dial an outbound phone call from our specific
    Twilio phone number (that phone number must be owned by our
    Twilio account).
    """
    # the url must match the Ngrok Forwarding URL plus the route defined in
    # the previous function that responds with TwiML instructions
    twilio_client.calls.create(to=outbound_phone_number, from_=TWILIO_NUMBER,
                               url=NGROK_BASE_URL + '/twiml')
    return Response('phone call placed to ' + outbound_phone_number + '!')

if __name__ == '__main__':
    run(host='localhost', port=8000, debug=False, reloader=True)

There is just one problem with our current setup if you're developing in a local environment: Twilio won't be able to reach that /twiml route. We need to deploy our app to a reachable server, or just use a localhost tunneling tool like Ngrok. Ngrok provides an external URL that connects to a port running on your machine. Download and install the Ngrok application that is appropriate for your operating system.

We run Ngrok locally and expose our Bottle app that is running on port 8000. Run this command within the directory where the Ngrok executable is located.

./ngrok http 8000

Ngrok will start up and provide us with a Forwarding URL, with both HTTP and HTTPS versions.

Ngrok started and running to serve as a localhost tunnel.

We can use the Forwarding URL to instruct Twilio how to handle the outbound phone call when someone picks up. Insert the Ngrok Forwarding URL into the file where NGROK_BASE_URL is specified.

Paste the ngrok Forwarding URL into the Twilio webhook configuration text box.
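
Since the code reads NGROK_BASE_URL with os.environ.get, you can also export it as an environment variable instead of editing the file; the URL below is a hypothetical placeholder for whatever Forwarding URL Ngrok printed:

export NGROK_BASE_URL=''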

If Ngrok is useful to you, make sure to read the "6 awesome reasons to use Ngrok when testing webhooks" post to learn even more about the tool.

Time to test out our app - let's give it a quick spin.

Make sure your Bottle development server is still running, or re-run it with the python command in a shell where your virtualenv is still activated.

Bring up the application in your browser, and this time test out the phone calling capabilities. Go to "localhost:8000/dial-phone/my-phone-number", where "my-phone-number" is a number in the "+12025551234" format. For example, here is what happens when I dialed +12023351278:

Dialing an outbound phone call with Bottle.

And here is the inbound phone call!

Receiving an incoming phone call on the iPhone.

When we pick up the phone call we also see the /twiml route get called via Ngrok.

/twiml route being called via Ngrok.

With just two routes in our Bottle app and Twilio we were able to make outbound phone calls. Not bad!

What's next?

Sweet, we can now dial outbound phone calls to any phone number from our Bottle web application. Next you may want to try one of these tutorials to add even more features to your app:

Questions? Contact me via Twitter @fullstackpython or @mattmakai. I'm also on GitHub as mattmakai.

Something wrong with this post? Fork this page's source on GitHub.

October 27, 2016 04:00 AM

How to Build Your First Slack Bot with Python

Bots are a useful way to interact with chat services such as Slack. If you have never built a bot before, this post provides an easy starter tutorial for combining the Slack API with Python to create your first bot.

We will walk through setting up your development environment, obtaining a Slack API bot token and coding our simple bot in Python.

Tools We Need

Our bot, which we will name "StarterBot", requires Python and the Slack API. To run our Python code we need:

It is also useful to have the Slack API docs handy while you're working through this tutorial.

All the code for this tutorial is available open source under the MIT license in the slack-starterbot public repository.

Establishing Our Environment

We now know what tools we need for our project so let's get our development environment set up. Go to the terminal (or Command Prompt on Windows) and change into the directory where you want to store this project. Within that directory, create a new virtualenv to isolate our application dependencies from other Python projects.

virtualenv starterbot

Activate the virtualenv:

source starterbot/bin/activate

Your prompt should now look like the one in this screenshot.

Command prompt with starterbot's virtualenv activated.

The official slackclient API helper library built by Slack can send and receive messages from a Slack channel. Install the slackclient library with the pip command:

pip install slackclient

When pip is finished you should see output like this and you'll be back at the prompt.

Output from using the pip install slackclient command with a virtualenv activated.

We also need to obtain an access token for our Slack team so our bot can use it to connect to the Slack API.

Slack Real Time Messaging (RTM) API

Slack grants programmatic access to their messaging channels via a web API. Go to the Slack web API page and sign up to create your own Slack team. You can also sign into an existing account where you have administrative privileges.

Use the sign in button on the top right corner of the Slack API page.

After you have signed in go to the Bot Users page.

Custom bot users webpage.

Name your bot "starterbot" then click the “Add bot integration” button.

Add a bot integration named starterbot.

The page will reload and you will see a newly-generated access token. You can also change the logo to a custom design. For example, I gave this bot the Full Stack Python logo.

Copy and paste the access token for your new Slack bot.

Click the "Save Integration" button at the bottom of the page. Your bot is now ready to connect to Slack's API.

A common practice for Python developers is to export secret tokens as environment variables. Export the Slack token with the name SLACK_BOT_TOKEN:

export SLACK_BOT_TOKEN='your slack token pasted here'

Nice, now we are authorized to use the Slack API as a bot.

There is one more piece of information we need to build our bot: our bot's ID. Next we will write a short script to obtain that ID from the Slack API.

Obtaining Our Bot’s ID

It is finally time to write some Python code! We'll get warmed up by coding a short Python script to obtain StarterBot's ID. The ID varies based on the Slack team.

We need the ID because it allows our application to determine if messages parsed from the Slack RTM are directed at StarterBot. Our script also tests that our SLACK_BOT_TOKEN environment variable is properly set.

Create a new file named and fill it with the following code.

import os
from slackclient import SlackClient

BOT_NAME = 'starterbot'

slack_client = SlackClient(os.environ.get('SLACK_BOT_TOKEN'))

if __name__ == "__main__":
    api_call = slack_client.api_call("users.list")
    if api_call.get('ok'):
        # retrieve all users so we can find our bot
        users = api_call.get('members')
        for user in users:
            if 'name' in user and user.get('name') == BOT_NAME:
                print("Bot ID for '" + user['name'] + "' is " + user.get('id'))
    else:
        print("could not find bot user with the name " + BOT_NAME)

Our code imports the SlackClient and instantiates it with our SLACK_BOT_TOKEN, which we set as an environment variable. When the script is executed by the python command, we call the Slack API to list all Slack users and get the ID for the one that matches the name "starterbot".

We only need to run this script once to obtain our bot’s ID.

The script prints a single line of output when it is run that provides us with our bot's ID.

Use the Python script to print the Slack bot's ID in your Slack team.

Copy the unique ID that your script prints out. Export the ID as an environment variable named BOT_ID.

(starterbot)$ export BOT_ID='bot id returned by script'

The script only needs to be run once to get the bot ID. We can now use that ID in our Python application that will run StarterBot.

Coding Our StarterBot

We've got everything we need to write the StarterBot code. Create a new file named and include the following code in it.

import os
import time
from slackclient import SlackClient

The os and SlackClient imports will look familiar because we used them in the program.

With our dependencies imported we can use them to obtain the environment variable values and then instantiate the Slack client.

# starterbot's ID as an environment variable
BOT_ID = os.environ.get("BOT_ID")

# constants
AT_BOT = "<@" + BOT_ID + ">"
EXAMPLE_COMMAND = "do"

# instantiate Slack & Twilio clients
slack_client = SlackClient(os.environ.get('SLACK_BOT_TOKEN'))

The code instantiates the SlackClient client with our SLACK_BOT_TOKEN exported as an environment variable.

if __name__ == "__main__":
    READ_WEBSOCKET_DELAY = 1  # 1 second delay between reading from firehose
    if slack_client.rtm_connect():
        print("StarterBot connected and running!")
        while True:
            command, channel = parse_slack_output(slack_client.rtm_read())
            if command and channel:
                handle_command(command, channel)
            time.sleep(READ_WEBSOCKET_DELAY)
        print("Connection failed. Invalid Slack token or bot ID?")

The Slack client connects to the Slack RTM API WebSocket then constantly loops while parsing messages from the firehose. If any of those messages are directed at StarterBot, a function named handle_command determines what to do.

Next add two new functions to parse Slack output and handle commands.

def handle_command(command, channel):
    """Receives commands directed at the bot and determines if they
    are valid commands. If so, then acts on the commands. If not,
    returns back what it needs for clarification.
    """
    response = "Not sure what you mean. Use the *" + EXAMPLE_COMMAND + \
               "* command with numbers, delimited by spaces."
    if command.startswith(EXAMPLE_COMMAND):
        response = "Sure...write some more code then I can do that!"
    slack_client.api_call("chat.postMessage", channel=channel,
                          text=response, as_user=True)

def parse_slack_output(slack_rtm_output):
    """The Slack Real Time Messaging API is an events firehose.
    This parsing function returns None unless a message is
    directed at the Bot, based on its ID.
    """
    output_list = slack_rtm_output
    if output_list and len(output_list) > 0:
        for output in output_list:
            if output and 'text' in output and AT_BOT in output['text']:
                # return text after the @ mention, whitespace removed
                return output['text'].split(AT_BOT)[1].strip().lower(), \
    return None, None

The parse_slack_output function takes messages from Slack and determines if they are directed at our StarterBot. Messages that start with a direct command to our bot ID are then handled by our code - which currently just posts a message back in the Slack channel telling the user to write some more Python code!
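
To make that concrete, here is a rough sketch of the kind of event list slack_client.rtm_read() can return; every field value below is hypothetical, but it shows why splitting on AT_BOT yields the command text that follows the mention:

# a hypothetical events list returned by slack_client.rtm_read()
[
    {
        'type': 'message',
        'channel': 'C0ABC1DEF',
        'user': 'U0XYZ2345',
        'text': '<@U1BOTID99> do something',
        'ts': '1477000000.000001'
    }
]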

Here is how the entire program should look when it's all put together (you can also view the file in GitHub):

import os
import time
from slackclient import SlackClient

# starterbot's ID as an environment variable
BOT_ID = os.environ.get("BOT_ID")

# constants
AT_BOT = "<@" + BOT_ID + ">"
EXAMPLE_COMMAND = "do"

# instantiate Slack & Twilio clients
slack_client = SlackClient(os.environ.get('SLACK_BOT_TOKEN'))

def handle_command(command, channel):
    """Receives commands directed at the bot and determines if they
    are valid commands. If so, then acts on the commands. If not,
    returns back what it needs for clarification.
    """
    response = "Not sure what you mean. Use the *" + EXAMPLE_COMMAND + \
               "* command with numbers, delimited by spaces."
    if command.startswith(EXAMPLE_COMMAND):
        response = "Sure...write some more code then I can do that!"
    slack_client.api_call("chat.postMessage", channel=channel,
                          text=response, as_user=True)

def parse_slack_output(slack_rtm_output):
    """The Slack Real Time Messaging API is an events firehose.
    This parsing function returns None unless a message is
    directed at the Bot, based on its ID.
    """
    output_list = slack_rtm_output
    if output_list and len(output_list) > 0:
        for output in output_list:
            if output and 'text' in output and AT_BOT in output['text']:
                # return text after the @ mention, whitespace removed
                return output['text'].split(AT_BOT)[1].strip().lower(), \
    return None, None

if __name__ == "__main__":
    READ_WEBSOCKET_DELAY = 1  # 1 second delay between reading from firehose
    if slack_client.rtm_connect():
        print("StarterBot connected and running!")
        while True:
            command, channel = parse_slack_output(slack_client.rtm_read())
            if command and channel:
                handle_command(command, channel)
            time.sleep(READ_WEBSOCKET_DELAY)
        print("Connection failed. Invalid Slack token or bot ID?")

Now that all of our code is in place we can run our StarterBot on the command line with the python command.

Console output when the StarterBot is running and connected to the API.

In Slack, create a new channel and invite StarterBot, or invite it to an existing channel.

In the Slack user interface create a new channel and invite StarterBot.

Now start giving StarterBot commands in your channel.

Give StarterBot commands in your Slack channel.

As it is currently written above in this tutorial, the line AT_BOT = "<@" + BOT_ID + ">" does not require a colon after the "@starter" (or whatever you named your particular bot) mention. Previous versions of this tutorial did have a colon because Slack clients would auto-insert the ":" but that is no longer the case.

Wrapping Up

Alright, now you've got a simple StarterBot with a bunch of places in the code you can add whatever features you want to build.

There is a whole lot more that could be done using the Slack RTM API and Python. Check out these posts to learn what you could do:

Questions? Contact me via Twitter @fullstackpython or @mattmakai. I'm also on GitHub with the username mattmakai.

Something wrong with this post? Fork this page's source on GitHub.

October 27, 2016 04:00 AM

October 26, 2016

Zato Blog

Interesting real-world Single Sign-On with JWT, WebSockets and Zato


In a recent project, an interesting situation occurred that let JSON Web Tokens (JWT) and WebSockets, two newly added features of the Zato middleware server, be nicely employed in practice with great results.

The starting point was the architecture as below.


Pretty common stuff - users authenticate with a frontend web application that serves pages to browsers and at the same time communicates with Zato middleware, which provides a unified interface to further backend systems.


Now, the scenario started to look intriguing when a business requirement meant, in technical terms, that WebSocket connections should be employed so that browsers could be swiftly notified of events taking place in backend systems.

WebSockets are straightforward and come with all modern browsers, and recently Zato grew the means to mount services on WebSocket channels - this in addition to REST, SOAP, ZeroMQ, AMQP, WebSphere MQ, the scheduler, and all the other already existing channels.



The gotcha

However, when it came to implementation, it turned out that the frontend web application was incapable of acting as a client of Zato services exposed via WebSockets.

That is, it could offer WebSockets to browsers but would not itself be able to establish long-running WebSocket connections to Zato - it had simply been designed to work in a strict request-reply fashion, and WebSockets were out of its reach.

This meant that it was not possible for Zato to notify the frontend application of new events without the frontend constantly polling for them, which defeated the purpose of employing WebSockets in the first place.

Thus, seeing as browsers themselves support WebSockets very well, it was agreed that there was no choice but to have each user's browser connect to Zato directly; WebSocket channels in Zato would ensure that browsers receive notifications as they happen in backend systems.

Browser authentication

Deciding that browsers should connect directly to Zato posed a new challenge, however. Whereas previously users authenticated with the frontend, which had its own application-level credentials in Zato, now the browsers connecting directly to Zato would also have to authenticate.

Naturally, burdening users with a new username/password to enter anywhere was ruled out. At the same time, it was not desirable to embed the credentials in the HTML served to browsers, because that would have to be done in clear text.

Instead, JWT was used by the frontend application to securely establish a session in Zato and transfer its ownership to a browser.



How JWT works in Zato

At their core, JWT (JSON Web Tokens) are essentially key-value mappings that declare that certain information is true. In the context of Zato authentication, when selected services of a Zato server are secured with JWT, the following happens:

In other words, JWT declares that a username/password combination was valid at a certain time and that this particular token was generated for that user and that it will be valid until it expires or is further prolonged.

This is very convenient and easy to understand. Another really great property of JWT is that tokens are extremely simple to use in JavaScript code running in browsers or elsewhere - each one is just an opaque string that needs to be provided to Zato servers in a single header, which is a trivial task.
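
As a small sketch of how little client code this requires - the endpoint URL here is hypothetical, and the common "Authorization: Bearer" header scheme is an assumption rather than something spelled out in this post:

import requests

token = '...'  # the JWT previously obtained from Zato

response =
    '',  # hypothetical Zato endpoint
    headers={'Authorization': 'Bearer ' + token},
    json={'event': 'ping'})
print(response.status_code)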

Combining it all

Having all the pieces in one place meant the solution was simple - the frontend application would call a custom endpoint in Zato to create a JWT for use in browsers. Since the token is safely encrypted, it can be passed around anywhere and Zato can return it to the frontend without any worries.

Once the frontend returns the token to a browser, the browser can then go ahead and open a direct WebSocket connection secured with the newly generated token. Zato receives the token, decrypts it and confirms that it is valid and was in fact generated on the server side as expected.

The net result is that browsers now have secure direct WebSocket connections to Zato yet no user credentials are relayed anywhere in clear text.


At the same time, users started to receive notifications from backend systems, and everyone was excited even though the situation had initially looked bleak when it turned out that the frontend couldn't itself become a WebSockets client.

October 26, 2016 06:07 PM

Mike Driscoll

Creating Graphs with Python and GooPyCharts

Over the summer, I came across an interesting plotting library called GooPyCharts which is a Python wrapper for the Google Charts API. In this article, we will spend a few minutes learning how to use this interesting package. GooPyCharts follows syntax that is similar to MATLAB and is actually meant to be an alternative to matplotlib.

To install GooPyCharts, all you need to do is use pip like this:

pip install gpcharts

Now that we have it installed, we can give it a whirl!

Our First Graph

Using GooPyCharts to create a chart or graph is extremely easy. In fact, you can create a simple graph in 3 lines of code:

>>> from gpcharts import figure
>>> my_plot = figure(title='Demo')
>>> my_plot.plot([1, 2, 10, 15, 12, 23])

If you run this code, you should see your default browser pop open with the following image displayed:


You will note that you can download the figure as a PNG or save the data that made the chart as a CSV file. GooPyCharts also integrates with the Jupyter Notebook.

Creating a Bar Graph

The GooPyCharts package includes a nice script to help you learn how to use the package. Unfortunately it doesn’t actually demonstrate different types of charts, so I took one of the examples from there and modified it to create a bar chart:

from gpcharts import figure
fig3 = figure()
xVals = ['Temps','2016-03-20','2016-03-21','2016-03-25','2016-04-01']
yVals = [['Shakuras','Korhal','Aiur'],[10,30,40],[12,28,41],[15,34,38],[8,33,47]]
fig3.title = 'Weather over Days'
fig3.ylabel = 'Dates', yVals)

You will note that in this example we create our title using the figure instance’s title property. We also set the ylabel the same way. You can also see how to define dates for the chart, as well as how to set an automatic legend using nested lists. Finally, note that instead of calling plot we need to call bar to generate a bar chart. Here is the result:


Creating Other Types of Graphs

Let’s modify the code a bit more and see if we can create other types of graphs. We will start with a scatter plot:

from gpcharts import figure
my_fig = figure()
xVals = ['Dates','2016-03-20','2016-03-21','2016-03-25','2016-04-01']
yVals = [['Shakuras','Korhal','Aiur'],[10,30,40],[12,28,41],[15,34,38],[8,33,47]]
my_fig.title = 'Scatter Plot'
my_fig.ylabel = 'Temps'
my_fig.scatter(xVals, yVals)

Here we use most of the same data that we used in the last example. We just need to modify a few values to make the X and Y labels work correctly, and we need to give the graph a title that makes sense. When you run this code, you should see something like this:


That was pretty simple. Let’s try creating a quick and dirty histogram:

from gpcharts import figure
my_fig = figure()
my_fig.title = 'Random Histogram'
my_fig.xlabel = 'Random Values'
vals = [10, 40, 30, 50, 80, 100, 65]
my_fig.hist(vals)

The histogram is much simpler than the last two charts we created, as it only needs one list of values to create it successfully. This is what I got when I ran the code:


This is a pretty boring looking histogram, but it’s extremely easy to modify it and add a more realistic set of data.

Wrapping Up

While this was just a quick run-through of some of GooPyCharts' capabilities, I think we got a pretty good idea of what this charting package is capable of. It's really easy to use, but it only has a small set of chart types. PyGal, Bokeh and matplotlib have many other types of charts that they can create. However, if you are looking for something that's super easy to install and use, and you don't mind the small set of supported charts, then GooPyCharts may be just the right package for you!

Related Reading

October 26, 2016 05:15 PM

Zaki Akhmad

Fail Running Test Code

I was trying to recall how I ran the test code, until I found the snippet I had written for it. I copy-pasted it… and the test failed. Something was wrong, yet there had been no changes to the test code since my last commit.

I tried the other test code; it looked OK. I tried renaming the failing test file, still no good. So, what was the problem?

Then I found this Stack Overflow question.

So I tried importing the modules used in the test code from an interactive shell. Finally I knew which part of the code was failing to import the library.
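
In other words, a few lines in an interactive session surface the broken import immediately; the module names below are hypothetical:

$ python
>>> import module_under_test
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "", line 3, in <module>
    import missing_dependency
ImportError: No module named missing_dependency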

October 26, 2016 04:47 PM

Continuum Analytics News

Recursion Pharmaceuticals Wants to Cure Rare Genetic Diseases - and We’re Going to Help

Wednesday, October 26, 2016
Michele Chambers
EVP Anaconda Business Unit & CMO
Continuum Analytics

Today we are pleased to announce that Continuum Analytics and Recursion Pharmaceuticals are teaming up to use data science in the quest to find cures for rare genetic diseases. Using Bokeh on Anaconda, Recursion is building its drug discovery assay platform to analyze layered cell images and weigh the effectiveness of different remedies. As we always say, Anaconda arms data scientists with superpowers to change the world. This is especially valuable for Recursion, since success literally means saving lives and changing the world by bringing drug remedies for rare genetic diseases to market faster than ever before.

It’s estimated that there are over 6,000 genetic disorders, yet many of these diseases represent a small market. Pharmaceutical companies aren’t usually equipped to pursue the cure for each disease. Anaconda will help Recursion by blending biology, bioinformatics and machine learning, bringing cell data to life. By identifying patterns and assessing drug remedies quickly, Recursion is using data science to discover potential drug remedies for rare genetic diseases. In English - this company is trying to cure big, bad, killer diseases using Open Data Science.

The ODS community is important to us. Working with a company in the pharmaceutical industry, an industry that is poised to convert ideas into life-saving medications, is humbling. With so many challenges, not the least of which include regulatory roadblocks and lengthy and complex R&D processes, researchers must continually adapt and innovate to speed medical advances. Playing a part in that process? That’s why we do what we do. We’re excited to welcome Recursion to the family and observe as it uses its newfound superpowers to change the world, one remedy at a time.

Want to learn more about this news? Check out the press release, here

October 26, 2016 02:17 PM

Recursion Pharmaceuticals Selects Anaconda to Create Innovative Next Generation Drug Discovery Assay Platform to Eradicate Rare Genetic Diseases

Wednesday, October 26, 2016

Open Data Science Platform Accelerates Time-to-Market for Drug Remedies

AUSTIN, TX—October 26, 2016—Continuum Analytics, the creator and driving force behind Anaconda, the leading Open Data Science platform powered by Python, today announced that Recursion Pharmaceuticals, LLC, a drug discovery company focused on rare genetic diseases, has adopted Bokeh––a Continuum Analytics open source visualization framework that operates on the Anaconda platform. Bokeh on Anaconda makes it easy for biologists to identify genetic disease markers and assess drug efficacy when visualizing cell data, allowing for faster time-to-value for pharmaceutical companies. 

“Bokeh on Anaconda enables us to perform analyses and make informative, actionable decisions that are driving real change in the treatment of rare genetic diseases,” said Blake Borgeson, CTO & co-founder at Recursion Pharmaceuticals. “By layering information and viewing images interactively, we are obtaining insights that were not previously possible and enabling our biologists to more quickly assess the efficacy of drugs. With the power of Open Data Science, we are one step closer to a world where genetic diseases are more effectively managed and more frequently cured, changing patient lives forever.” 

By combining interactive, layered visualizations in Bokeh on Anaconda to show both healthy and diseased cells along with relevant data, biologists can experiment with thousands of potential drug remedies and immediately understand the effectiveness of the drug to remediate the genetic disease. Biologists realize faster insights, speeding up time-to-market for potential drug treatments. 

“Recursion Pharmaceuticals’ data scientists crunch huge amounts of data to lay the foundation for some of the most advanced genetic research in the marketplace. With Anaconda, the Recursion data science team has created a breakthrough solution that allows biologists to quickly and cost effectively identify therapeutic treatments for rare genetic diseases,” said Peter Wang, CTO & co-founder at Continuum Analytics. “We are enabling companies like Recursion to harness the power of data on their terms, building solutions for both customized and universal insights that drive new value in all areas of business and science. Anaconda gives superpowers to people who change the world––and Recursion is a great example of how our Open Data Science vision is being realized and bringing solid, everyday value to critical healthcare processes.”

Data scientists at Recursion evaluate hundreds of genetic diseases, ranging from one evaluation per month to thousands in the same time frame. Bokeh on Anaconda delivers insights derived from heat maps, charts, plots and other scientific visualizations interactively and intuitively, while providing holistic data to enrich the context and allow biologists to discover potential treatments quickly. These visualizations empower the team with new ways to re-evaluate shelved pharmaceutical treatments and identify new potential uses for them. Ultimately, this creates new markets for pharmaceutical investments and helps develop new treatments for people suffering from genetic diseases. 

Bokeh on Anaconda is a framework for creating versatile, interactive and browser-based visualizations of streaming data or Big Data from Python, R or Scala without writing any JavaScript. It allows for exploration, embedded visualization apps and interactive dashboards, so that users can create rich, contextual plots, graphs, charts and more to enable more comprehensive deductions from images. 

For additional information about Continuum Analytics and Anaconda please visit: For more information on Bokeh on Anaconda visit

About Recursion Pharmaceuticals, LLC

Founded in 2013, Salt Lake City, Utah-based Recursion Pharmaceuticals, LLC is a drug discovery company. Recursion uses a novel drug screening platform to efficiently repurpose and reposition drugs to treat rare genetic diseases. Recursion’s novel drug screening platform combines experimental biology and bioinformatics in a massively parallel system to quickly and efficiently identify treatments for multiple rare genetic diseases. The core of the approach revolves around high-throughput automated screening using images of human cells, which allows the near simultaneous modeling of hundreds of genetic diseases. Rich data from these assays is probed using advanced statistical and machine learning approaches, and the effects of thousands of known drugs and shelved drug candidates can be investigated efficiently to identify those holding the most promise for the treatment of any one rare genetic disease.

The company's lead candidate, a new treatment for Cerebral Cavernous Malformation, is approaching clinical trials, and the company has a rich pipeline of repurposed therapies in development for diverse genetic diseases.

About Anaconda Powered by Continuum Analytics

Continuum Analytics is the creator and driving force behind Anaconda, the leading Open Data Science platform powered by Python. We put superpowers into the hands of people who are changing the world. 

With more than 3M downloads and growing, Anaconda is trusted by the world’s leading businesses across industries––financial services, government, health & life sciences, technology, retail & CPG, oil & gas––to solve the world’s most challenging problems. Anaconda does this by helping everyone in the data science team discover, analyze and collaborate by connecting their curiosity and experience with data. With Anaconda, teams manage their Open Data Science environments without any hassles to harness the power of the latest open source analytic and technology innovations. 

Continuum Analytics' founders and developers have created and contributed to some of the most popular Open Data Science technologies, including NumPy, SciPy, Matplotlib, Pandas, Jupyter/IPython, Bokeh, Numba and many others. Continuum Analytics is venture-backed by General Catalyst and BuildGroup. 

To learn more, visit


Media Contact:
Jill Rosenthal

October 26, 2016 12:01 PM

A. Jesse Jiryu Davis

Announcing Motor 0.7

Motor logo by Musho Rodney Alan Greenblat

Three weeks after I released the beta, I’m proud to present Motor 0.7.

For asynchronous I/O Motor now uses a thread pool, which is faster and simpler than the prior implementation with greenlets. It no longer requires the greenlet package, and now requires the futures backport package on Python 2. Read the beta announcement to learn more about the switch from greenlets to threads.

Install with:

python -m pip install motor

This version updates the PyMongo dependency from 2.8.0 to 2.9.x, and wraps PyMongo 2.9’s new APIs.
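For anyone new to Motor, basic usage with Tornado looks roughly like this (a minimal sketch against a local MongoDB; see the documentation for authoritative examples):

from tornado import gen, ioloop
import motor

client = motor.MotorClient('localhost', 27017)
db = client.test_database

@gen.coroutine
def store_and_fetch():
    # insert a document, then read it back, yielding at each async step
    yield db.collection.insert({'key': 'value'})
    doc = yield db.collection.find_one({'key': 'value'})
    print(doc)

ioloop.IOLoop.current().run_sync(store_and_fetch)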

Since the beta release, I’ve fixed one fun bug, a manifestation in Motor of the same import deadlock I fixed in PyMongo, Tornado, and Gevent last year.

The next release will be Motor 1.0, which will be out in less than a month. Most of Motor 1.0’s API is now implemented in Motor 0.7, and APIs that will be removed in Motor 1.0 are now deprecated and raise warnings.

This is a large release; please read the documentation carefully.

If you encounter any issues, please file them in Jira.

—A. Jesse Jiryu Davis

October 26, 2016 07:16 AM

Kushal Das

Science Hack Day India 2016

A few months back, Praveen called to tell me about the new event he was organizing along with FOSSASIA: Science Hack Day India. I never even registered for the event, as Praveen told me that he had just added my name and Anwesha's. Sadly, as Py was sick for the last few weeks, Anwesha could not join us at the event. On the 20th, Hong Phuc came down to Pune, and in the evening we had the PyLadies meetup in the Red Hat office.

On the 21st, early in the morning, we started our journey. Sayan, Praveen Kumar, and Pooja joined us in my car. This was my longest drive to date (I bought the car around a year back). As everyone had suggested, the roads in Karnataka were smooth; I am now waiting for my next chance to drive on them. After reaching Belgaum we decided to follow Google Maps, which turned out to be a very bad decision, as the maps took us to a dead end with a blue gate. Later we found that many locals had also followed Google Maps and reached the same dead end.

The location of the event was Sankalp Bhumi, a very well maintained resort, full of greenery and nature. We stayed in the room just beside the lake. Later at night Saptak joined us. Siddesh and Nisha + Ira also arrived later in the evening.

Day 1

We had a quick inauguration event where all the mentors talked about the projects they would be working on, and then we moved to the area set aside for us. The main hall slowly filled with school kids, who had a build-your-own-solar-light workshop (led by Jithin). Pooja also joined the workshop to help the kids with soldering.

I managed to grab the largest table in the hack area. Around 9 people joined me; among them we had college students, college professors, and someone who came in saying she was from a background other than computers. I asked her to try this Python thing, and by the end of the day she too was totally hooked on learning. I later found out that her daughter was also participating in the kids' section. Before lunch we went through the basics of Python as a programming language. All of the participants had Windows laptops, so it was fun to learn various small things about Windows. But we managed to get going well.

Later we started working on MicroPython. We went ahead step by step: first turning on an LED, then moving to DHT11 sensors for temperature and humidity measurements. By late afternoon all of us had managed to write code to read the measurements from the sensors. I had some trouble with the old firmware I was using, but the latest nightly firmware fixed the issue related to MQTT. I kept one of the boards running for the whole night, and Sayan wrote the client to gather the information from the Mosquitto server.
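The code we ended up with was along these lines (a minimal sketch, assuming an ESP8266 running a recent MicroPython nightly, a DHT11 on GPIO4, and a Mosquitto broker at a made-up address):

import time
import dht
import machine
from umqtt.simple import MQTTClient

sensor = dht.DHT11(machine.Pin(4))
client = MQTTClient('esp8266-1', '192.168.0.10')  # hypothetical broker IP
client.connect()

while True:
    sensor.measure()  # trigger a reading
    t = sensor.temperature()
    h = sensor.humidity()
    client.publish(b'shd/dht11', ('%d,%d' % (t, h)).encode())
    time.sleep(60)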

In the evening we also had lightning talks; I gave one about the dgplug summer training. The last talk of the evening was from Prof. Pravu, and during it I saw someone start what looked like a powerful gas stove outside the hut. I was totally surprised to learn that the fire was not from gas: using water and some innovative design, his team had managed to make a small stove that achieves 35% efficiency with any biomass. The flame was blue, with no smoke. This was super cool.

After dinner, there was a special live show of laser lights and sound, work done by Praveen. Teachers are an important part of our lives. When we see someone like Praveen, who is taking the learning experience to another level while based in one of the small towns of India, it gives us a lot of pride. Btw, if you are wondering: he uses Python for most of his experiments :)

Day 2

I moved to the hack area early in the morning and kept the setup ready. My team joined me after breakfast. They decided to keep one of the boards under the sun beside the lake to see the difference in temperature between the two devices. I also met two high school teachers from a village near the Maharashtra/Karnataka border. They invited us to run more workshops at their school. They had also brought a few rockets, which we launched from the venue :)

During the afternoon Sayan and Saptak worked on the web frontend for our application; the following image shows the temperature and humidity values from the previous night. The humidity during the night was 70%, but during the day it was around 30%. The temperature stayed between 20-30C.

Beside our table Nisha was working on her Cookie project. Near the dining area, Arun and his group created an amazing map of the resort on the ground using various organic materials available at the location. That project won the best science hack of the event. You can find various other details in the etherpad.

The impact of the event

We saw school kids crying because they did not want to leave the event. Every participant was full of energy. We had people working on all kinds of ideas, and they came from all kinds of backgrounds. Siddhesh called this the best event he has ever been to in India. Belgaum as a city joined in to make the event a successful one: local businesses supported it with sponsorship, and local newspapers covered it. The owner of the venue also helped in various ways. By the end of day 2, every one of us was asking when we could come back for next year's event. I should specially thank the organising team (Hitesh, Rahul, and all of the volunteers) for making this event such a success. I also want to thank Hong Phuc Dang and FOSSASIA for all the help.

October 26, 2016 07:05 AM

Vasudev Ram

Read from CSV with D, write to PDF with Python

By Vasudev Ram


Here is another in my series of applications of xtopdf, my PDF creation toolkit for Python (xtopdf source here).

This xtopdf application is actually a pipeline (nothing Unix-specific though, will work on both *nix and Windows) - a D program reading CSV data and sending it to a Python program, which writes the data to PDF.

The D program, read_csv.d, reads CSV data from a .csv file, and writes it to standard output.

The Python program, StdinToPDF.py (which is part of the xtopdf toolkit), reads its standard input (which is redirected by the pipeline to come from the D program's standard output) and writes the data it reads to PDF.
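The heart of the Python side is only a few lines; here is a simplified sketch along the lines of what StdinToPDF.py does, using xtopdf's PDFWriter (method names per xtopdf's examples; the real program also handles command-line arguments and edge cases):

import sys
from PDFWriter import PDFWriter

pw = PDFWriter('csv_output.pdf')
pw.setFont('Courier', 10)
pw.setHeader('CSV data piped from the D program')
pw.setFooter('Generated by xtopdf')

for line in sys.stdin:
    pw.writeLine(line.rstrip('\n'))

pw.savePage()
pw.close()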

Here is the D program, read_csv.d:

/*
File: read_csv.d
Purpose: A program to read CSV data from a file and
write it to standard output.
Author: Vasudev Ram
Date created: 2016-10-25
Copyright 2016 Vasudev Ram
Web site:
Product store:
*/

import std.algorithm;
import std.array;
import std.csv;
import std.stdio;
import std.file;
import std.typecons;

int main()
{
    try {
        stderr.writeln("Reading CSV data from file.");
        auto file = File("input.csv", "r");
        foreach (record;
            file.byLine.joiner("\n").csvReader!(Tuple!(string, string, int)))
        {
            writefln("%s works as a %s and earns $%d per year",
                record[0], record[1], record[2]);
        }
    } catch (CSVException csve) {
        stderr.writeln("Caught CSVException: msg = ", csve.msg,
            " at row, col = ", csve.row, ", ", csve.col);
    } catch (FileException fe) {
        stderr.writeln("Caught FileException: msg = ", fe.msg);
    } catch (Exception e) {
        stderr.writeln("Caught Exception: msg = ", e.msg);
    }
    return 0;
}
The D program is compiled as usual with:
dmd read_csv.d
I ran it first (only the D program) with an invalid CSV file (it has an extra comma at the start of line 3, which invalidates the data by making "Driver" land in the salary column position), and got the expected error message. The message includes the row and column number of the place in the CSV file where the program encountered the error, which is useful for fixing the input data:
$ type input.csv
$ read_csv
Reading CSV data from file.
Jack works as a Carpenter and earns $40000 per year
Tom works as a Blacksmith and earns $50000 per year
Caught CSVException: msg = Unexpected 'D' when converting from type string to type int
at row, col = 3, 3
Then I ran it again, in the regular way, this time with a valid CSV file, and as part of a pipeline, the other pipeline component being StdinToPDF:
$ read_csv | python StdinToPDF.py csv_output.pdf
Reading CSV data from file.
And here is a cropped view of the output as seen in Foxit PDF Reader:

- Enjoy.

- Vasudev Ram - Online Python training and consulting

Get updates on my software products / ebooks / courses.

Jump to posts: Python   DLang   xtopdf

Subscribe to my blog by email

My ActiveState recipes

FlyWheel - Managed WordPress Hosting

October 26, 2016 03:37 AM

Import Python

ImportPython Issue 95

Worthy Read

Excellent post from Armin Ronacher on tackling a CPython performance bottleneck with a custom Rust extension module.

core python
There are different ways to prevent setting attributes and to make attributes read-only on an object in Python. We can use any one of the following: 1) a property descriptor, 2) the descriptor methods __get__ and __set__, 3) __slots__ (which only restricts setting arbitrary attributes).
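For instance, option 1 looks like this (a minimal sketch):

class Point:
    def __init__(self, x):
        self._x = x

    @property
    def x(self):
        # read-only: no setter is defined, so assignment raises AttributeError
        return self._x

p = Point(3)
print(p.x)  # 3
p.x = 5     # AttributeError: can't set attribute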

The API for file uploads. Integrate Filestack in 2 lines of code. Python library for Filestack

In this tutorial we will be deploying an empty Django project I created to illustrate the deployment process.

Scraping is often an example of code that is embarrassingly parallel. With some slight changes, our tasks can be done asynchronously, allowing us to process more than one URL at a time. In version 3.2, Python introduced the concurrent.futures module, which is a joy to use for parallelizing tasks like scraping. The rest of this post will show how we can use the module to make our previously synchronous code asynchronous.
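The core pattern looks like this (a minimal sketch with made-up URLs):

from concurrent.futures import ThreadPoolExecutor, as_completed
from urllib.request import urlopen

URLS = ['http://example.com', 'http://example.org']

def fetch(url):
    return url, urlopen(url).read()

# threads let the slow network waits overlap instead of running serially
with ThreadPoolExecutor(max_workers=5) as executor:
    futures = [executor.submit(fetch, url) for url in URLS]
    for future in as_completed(futures):
        url, body = future.result()
        print(url, len(body))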

Most Django programmers use function-based views, but some use class-based views. Why? Special guest Buddy Lindsey will be joining us this week to talk about how class-based views are different.

I'm excited to introduce you to Markus Siemens and TinyDb. This is a 100% pure python, embeddable, pip-installable document DB for Python.

finite state machine
Whether you're building up a CMS or a bespoke application, chances are that you will have to handle some states / statuses. Let's discuss your options in Django.

IT Help Desk & Ticketing. Start a free trial of JIRA Service Desk and get your free Konami Code shirt.

General Guidelines when upgrading Django.

Benoit writes about debugging his software using gdb, python-debuginfo.

Check the tweet :)

opensource project
lptrace is strace for Python programs. It lets you see in real-time what functions a Python program is running. It's particularly useful to debug weird issues on production.

In this post, I’ll explain how mypy works, the benefits and pain points we’ve seen in using mypy, and share a detailed guide for adopting mypy in a large production codebase (including how to find and fix dozens of issues in a large project in the first few days of using mypy!).

web server
Python 3.5+ web server that's written to go fast

Try Hired and get in front of 4,000+ companies with one application. No more pushy recruiters, no more dead end applications and mismatched companies, Hired puts the power in your hands.


Hoxton, City of London, London, United Kingdom
Patch are hiring a CTO / Lead Developer. We are expanding our tech team as part of scaling the company. This is an opportunity to make a big impact on our E-commerce platform and help shape the new services we're creating.

Upcoming Conference / User Group Meet

Projects - 17 Stars, 6 Fork
Feed-forward neural network for real-time artistic style transfer. Curator's Note - This is a pretty cool project.

TextSum - 8 Stars, 1 Fork
Preparing a dataset for TensorFlow text summarization (TextSum) model.

countrynames - 5 Stars, 0 Fork
Utility library to turn country names into ISO two-letter codes.

celery-redundant-scheduler - 4 Stars, 0 Fork
Celery beat scheduler that provides the ability to run multiple celerybeat instances.

SlackUptimeMonitor - 3 Stars, 3 Fork
Receive notifications in Slack when your websites/api/services are down

confluence-dumper - 3 Stars, 0 Fork
Tool to export Confluence spaces and pages recursively via its API

asyncio-nats-streaming - 3 Stars, 0 Fork
An asyncio library for NATS Streaming.

October 26, 2016 01:39 AM

ImportPython Issue 94

Worthy Read

In this review I’ll explain how Djaneiro can make your Django development workflow more productive and I’ll go over the pros and cons of the plugin as I experienced them. After that I’ll take a look at alternatives to Djaneiro in the Sublime Text plugin landscape. At the end I’ll share my final verdict and ratings.

Talk by Steven F. Lott.

Write Python code and see how its AST looks in the browser right now. No installation needed.

This project tries to provide a lot of pieces of Python code that make life easier.

pandasql, a Python package we (Yhat) wrote that emulates the R package sqldf. It's a small but mighty library comprised of just 358 lines of code. The idea of pandasql is to make Python speak SQL. For those of you who come from a SQL-first background or still "think in SQL", pandasql is a nice way to take advantage of the strengths of both languages.
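Basic usage looks like this (a minimal sketch with a made-up DataFrame):

import pandas as pd
from pandasql import sqldf

df = pd.DataFrame({'name': ['a', 'b', 'c'], 'value': [1, 2, 3]})
# run SQL against any DataFrame visible in the given namespace
result = sqldf("SELECT name, value FROM df WHERE value > 1", locals())
print(result)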

Try Hired and get in front of 4,000+ companies with one application. No more pushy recruiters, no more dead end applications and mismatched companies, Hired puts the power in your hands.

BinaryTree is a minimal Python library which provides you with a simple API to generate, visualize and inspect binary trees so you can skip the tedious work of mocking up test trees, and dive right into practising your algorithms! Heaps and BSTs (binary search trees) are also supported.

patat (Presentations And The ANSI Terminal) is a small tool that allows you to show presentations using only an ANSI terminal. It does not require ncurses.

PyCon 2017 (US) site is live. Note - registration starts on Oct 17th. If you are looking to speak, check out the dates for talk/tutorial/paper submissions, aka the Call For Proposals (CFP).

In this post we shall explore the different ways we can achieve concurrency, and their benefits and drawbacks. With the advent of Python 3 we're hearing a lot of buzz about "async" and "concurrency", so one might simply assume that Python recently introduced these concepts/capabilities. But that would be quite far from the truth. We have had async and concurrent operations for quite some time now. Also, many beginners may think that asyncio is the only/best way to do async/concurrent operations.

Functional programming is a discipline, not a language feature. It is supported by a wide variety of languages, although those languages can make it more or less difficult to practice the discipline. Python has a number of features that support functional programming, including map/reduce functions, partial application, and decorators.

Jake VanderPlas explains Python’s essential syntax and semantics, built-in data types and structures, function definitions, control flow statements, and more, using Python 3 syntax.

The Python Imaging Library or PIL allowed you to do image processing in Python. Here is a tutorial.

image processing
For the past couple of years, I’ve been writing automated tests for my employer. One of the many types of tests that I do is comparing how an application draws. Does it draw the same way every single time? If not, then we have a serious problem. An easy way to check that it draws the same each time is to take a screenshot and then compare it to future versions of the same drawing when the application gets updated.

This recipe shows how to get the names and types of all the attributes of a Python module. This can be useful when exploring new modules (either built-in or third-party), because attributes are mostly a) data elements or b) functions or methods. For either of those, you would like to know the type of the attribute so that, if it is a data element, you can print it, and if it is a function or method, you can print its docstring to get brief help on its arguments, processing and outputs or return values, as a way of learning how to use it.

Upcoming Conference / User Group Meet


RocketFuelPython - 8 Stars, 0 Fork
A Python implementation of the RocketFuel topology mapping engine.

- 7 Stars, 0 Fork
Make TLS/SSL security mass scans and import the results into ElasticSearch. A script collection for generating command lines that can be executed sequentially or in parallel with tools like GNU Parallel, and for importing the results into a structured document in ElasticSearch for further analysis.

RacketCallGraph - 7 Stars, 2 Fork
A simple Python script that generates call graphs of simple Racket programs by generating dot language scripts. It uses a naive approach that basically traverses the program and maintains a state machine regardless of context. Currently it only maintains an FSM, so advanced features of Racket, like lambda functions, are not supported; this will improve if needed in the future.

Gender Classification Challenge for 'Learn Python for Data Science #1'. This is the code for the gender classification challenge for 'Learn Python for Data Science #1' by @Sirajology on YouTube. The code uses the scikit-learn machine learning library to train a decision tree on a small dataset of body metrics (height, width, and shoe size) labeled male or female. Then we can predict the gender of someone given a novel set of body metrics.

Tweets from the second presidential debate. This repo contains data on roughly 150,000 debate tweets. However, to make the data compliant with Twitter's terms of service, the public data only contains tweet IDs. A short Python script to convert that list of tweet IDs into the full Twitter data is coming soon.

TrickleDownML - 4 Stars, 0 Fork
Start a conversation with Ronald Reagan! I made a chatbot that mimics Ronald Reagan.

flask_church - 3 Stars, 0 Fork
An extension for Flask that helps you generate fake data. Flask-Church is a small wrapper around the Church library.

October 26, 2016 01:39 AM

ImportPython Issue 93

Worthy Read

We have been sharing Daniel's articles and videos from this youtube channel for a while now. Daniel Bader just published his book on Sublime Text for Python Developers. Have a look at his book if you are a sublime text user. Here is a 30% discount for all ImportPython Subscribers.

You can find lots of reasons to never delete records from your database. The Soft Delete pattern is one of the available options to implement deletions without actually deleting the data. It does it by adding an extra column to your database table(s) that keeps track of the deleted state of each of its rows. This sounds straightforward to implement, and strictly speaking it is, but the complications that derive from the use of soft deletes are far from trivial. In this article I will discuss some of these issues and how I avoid them in Flask and SQLAlchemy based applications.
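Independent of the article's specific helpers, the basic shape of the pattern is something like this (a minimal SQLAlchemy sketch):

from sqlalchemy import Boolean, Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Post(Base):
    __tablename__ = 'posts'
    id = Column(Integer, primary_key=True)
    title = Column(String(80))
    # the extra column that tracks the deleted state of each row
    deleted = Column(Boolean, default=False, nullable=False)

# "deleting" flips the flag instead of issuing a DELETE:
#     post.deleted = True
# and every query must remember to filter deleted rows out:
#     session.query(Post).filter(Post.deleted == False)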

data visualization
Comprehensive listing of all the data visualization packages, with small code snippets.

Guilherme Caminha explores the utility of the on_commit hook, available from Django 1.9 onwards, for sequencing part of a time-consuming task in a Django view while the rest is offloaded to an async process.
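The hook itself is tiny to use; something like this (a sketch; Order and send_confirmation_email are hypothetical):

from django.db import transaction

def place_order(request):
    with transaction.atomic():
        order = Order.objects.create(owner=request.user)
        # runs only after the transaction commits, so the async worker
        # never sees a row that might still be rolled back
        transaction.on_commit(
            lambda: send_confirmation_email.delay(order.pk))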

Try Hired and get in front of 4,000+ companies with one application. No more pushy recruiters, no more dead end applications and mismatched companies, Hired puts the power in your hands.

Lukasz Langa uses asyncio source code to explain the event loop, blocking calls, coroutines, tasks, futures, thread pool executors, and process pool executors.

opensource project
Click is my go-to Python package for creating command line applications. click-man will generate one man page per command of your click CLI application, as specified in console_scripts in your setup.py.
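For reference, a click command is declared like this (a minimal sketch); click-man then turns each such command into a man page:

import click

@click.command()
@click.option('--count', default=1, help='Number of greetings.')
@click.argument('name')
def hello(count, name):
    """Greet NAME a given number of times."""
    for _ in range(count):
        click.echo('Hello, %s!' % name)

if __name__ == '__main__':
    hello()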

Bryan is a core developer of the Bokeh project, which is a visualization package for Python. He has also helped with the development of Anaconda.

Flashlight enables you to easily solve for minimum snap trajectories that go through a sequence of waypoints, compute the required control forces along trajectories, execute the trajectories in a physics simulator, and visualize the simulation results.

opensource project
Church is a library to generate fake data. It's very useful when you need to bootstrap your database.

Raspberry and Python projects/scripts.

A simple, fast, extensible Python library for data validation.

Upcoming Conference / User Group Meet


tf-agent - 27 Stars, 1 Fork
tensorflow reinforcement learning agents for OpenAI gym environments

become - 5 Stars, 0 Fork
Make one object become another.

python-line-api - 4 Stars, 0 Fork
SDK of the LINE Messaging API for Python.

football-stats - 2 Stars, 0 Fork
Football Stats is a system whose purpose is to help with football match analysis. The final goal of the project is the capability to analyze ball and player positions, creating heatmaps and statistics of different actions or situations.

pytocli - 2 Stars, 0 Fork
A Python lib to generate CLI commands

xfce4-system-monitor - 1 Stars, 0 Fork
An xfce panel plugin to display the necessary information of the system.

October 26, 2016 01:39 AM

ImportPython Issue 92 - django-perf-rec track django performance, Mock testing, python alias more

Worthy Read

Over the years, I’ve come up with my own Python aliases that play nice with virtual environments. For this post, I tried to stay as generic as possible such that any alias here can be used by every Pythonista.

"Keep detailed records of the performance of your Django code.". django-perf-rec is like Django's assertNumQueries on steroids. It lets you track the individual queries and cache operations that occur in your code. This blog post explains the workings of this project .

machine learning
Last weekend I had the pleasure of introducing Machine Learning for Engineers (a practical walk-through, no maths) at PyConUK 2016 ( Video link on page ). My talk covered a practical guide to a 2 class classification challenge (Kaggle’s Titanic) with scikit-learn, backed by a longer Jupyter Notebook (github) and further backed by Ezzeri’s 2 hour tutorial from PyConUK 2014.

This tutorial will help you understand why mocking is important, and show you how to mock in Python with Mock and Pytest monkeypatch.
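Stripped to its core, the idea looks like this (a minimal sketch using unittest.mock):

from unittest import mock

import requests

def get_status(url):
    return requests.get(url).status_code

with mock.patch('requests.get') as fake_get:
    # the real network call is replaced for the duration of the block
    fake_get.return_value.status_code = 200
    assert get_status('http://example.com') == 200
    fake_get.assert_called_once_with('http://example.com')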

Yet another introduction to Django Channels. This one is a lot more clear and step by step tutorial. If you still don't know what Django channels is / how to get started, read this.

Try Hired and get in front of 4,000+ companies with one application. No more pushy recruiters, no more dead end applications and mismatched companies, Hired puts the power in your hands.

In this series of posts I am going to review the Python mock library and exemplify its use. I will not cover everything you may do with mock, obviously, but hopefully I'll give you the information you need to start using this powerful library. Note it's a two-part series as of now; here is the second part's URL.

Decorators are one of those features in Python that people like to talk about. Why? Because they're different. Because they're a little weird. Because they're a little mind-bending. Let's talk about decorators: how do you make them and when should you use them?
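As a quick reminder of the shape (a minimal sketch):

import functools

def log_calls(func):
    @functools.wraps(func)  # preserve the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        print('calling %s' % func.__name__)
        return func(*args, **kwargs)
    return wrapper

@log_calls
def add(a, b):
    return a + b

print(add(2, 3))  # prints "calling add", then 5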

The Plotly V2 API suite is a simple alternative to the Google Charts API. Make a request to a Plotly URL and get a link to a dataset or D3.js chart. Python code snippet are included on the page.

code review
Daniel is doing a series of code review sessions with Python developers. Have a look at the accompanied video where he gives his opinion on a open source project by Milton.

c binding
CPython, the primary implementation of Python used by millions, is written in C. Python core developers embraced and exposed Python's strong C roots, taking a traditional tack on portability, contrasting with the "write once, debug everywhere" approach popularized elsewhere. The community followed suit with the core developers, developing several methods for linking to C. This has given us a lot of choices for interfacing with C; let us look at them.
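One of those options, ctypes, needs no compilation step at all (a minimal sketch; it assumes the C standard library can be located, as on typical Linux systems):

import ctypes
import ctypes.util

# load the C standard library and call one of its functions directly
libc = ctypes.CDLL(ctypes.util.find_library('c'))
print(libc.abs(-5))  # 5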

General rules to use mixins to compose your own view classes with code examples.

In this short article Mike shows us how to set up auto-completion for manage.py arguments. Especially helpful if you have tons of management commands.

core python
That’s the opening paragraph from the Python Insider blog post discussing the 2016 Python core sprint that recently took place. In the case of Microsoft’s participation in the sprint, both Steve Dower and I (Brett Cannon) were invited to participate (which meant Microsoft had one of the largest company representations at the sprint). Between the two of us we spent the week completing work on four of our own PEPs for Python 3.6: Adding a file system path protocol (PEP 519), Adding a frame evaluation API to CPython (PEP 523), Change Windows console encoding to UTF-8 (PEP 528), Change Windows filesystem encoding to UTF-8 (PEP 529).

This is an unofficial fork of Django, which focuses entirely on backporting official, publicly-announced security fixes to Django 1.6.11. It does not contain any other bug fixes or features, and any branches other than security-backports/1.6.x are unlikely to be up-to-date.

Upcoming Conference / User Group Meet


fmap - 6 Stars, 0 Fork - a single-dispatch version of fmap for Python 3. While there are multiple Haskellesque "let's put monads in Python!" style libraries out there, most don't seem to focus on taking the nice bits of Haskell's functional approach and giving them a nice Pythonic interface. This is a very simple take on fmap that lets you remove some unnecessary boilerplate when you are applying a function to each element of a collection. I hope you like it!

fbtftp - 5 Stars, 0 Fork
fbtftp is Facebook's implementation of a dynamic TFTP server framework. It lets you create custom TFTP servers and wrap your own logic into it in a very simple manner. Facebook currently uses it in production, and it's deployed at global scale across all of our data centers.

unfurl - 4 Stars, 0 Fork
Python utility to move items in a directory tree to the topmost level possible

chalk - 2 Stars, 1 Fork
Simple, easy to learn interpreted programming language.

human-to-geojson - 2 Stars, 1 Fork
Convert raw Human exports to geoJSON

October 26, 2016 01:39 AM

ImportPython Issue 91 - asynq from quora, python packaging ecosystem and more

Worthy Read

Hey guys, this is Ankur, the curator behind ImportPython. I will be attending PyCon India and am happy to meet you all, discuss all things Python, and get your opinion on the newsletter and how to make it better. Ping me at ankur at outlook dot com or just reply to this email; I will respond. See you there.

asynq is a library for asynchronous programming in Python with a focus on batching requests to external services. It also provides seamless interoperability with synchronous code, support for asynchronous context managers, and tools to make writing and testing asynchronous code easier. asynq was developed at Quora and is a core component of Quora's architecture. See the original blog post here.
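Here is a rough sketch of the batching style asynq encourages, based on its documentation (details may differ; check the project's README before relying on this):

from asynq import asynq

@asynq()
def fetch_score(uid):
    # imagine an external-service call that benefits from batching
    return uid * 2

@asynq()
def fetch_scores(uids):
    # yielding a list of futures lets asynq schedule the calls together
    scores = yield [fetch_score.asynq(uid) for uid in uids]
    return scores

print(fetch_scores([1, 2, 3]))  # calling normally runs it synchronously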

Try Hired and get in front of 4,000+ companies with one application. No more pushy recruiters, no more dead end applications and mismatched companies, Hired puts the power in your hands.

There have been a few recent articles reflecting on the current status of the Python packaging ecosystem from an end user perspective, so it seems worthwhile for me to write-up my perspective as one of the lead architects for that ecosystem on how I characterise the overall problem space of software publication and distribution, where I think we are at the moment, and where I'd like to see us go in the future.

This team is responsible for supplying a variety of web apps built on a modern stack (mostly Celery, Django, nginx and Redis), but have almost no control over the infrastructure on which it runs, and boy, is some of that infrastructure old and stinky. We have no root access to these servers, most software configuration requires a ticket with a lead time of 48 hours plus, and the watchful eyes of a crusty old administrator and obtuse change management process. The machines are so old that many are still running on real hardware, and those that are VMs still run some ancient variety of Red Hat Linux, with, if we’re lucky, Python 2.4 installed.

The notebook functionality of Python provides a really amazing way of analyzing data and writing reports in one place. However, in the standard configuration, the PDF export of the Python notebook is somewhat ugly and impractical. In the following I will present my choices for creating almost publication-ready reports from within IPython/Jupyter notebook.

Paul Bailey, "A Guide to Bad Programming", at PyBay2016 was my fav talk amongst all. Check out the youtube channel.

image processing
I wrote a program to clean up scans of handwritten notes while simultaneously reducing file size. Some of my classes don't have an assigned textbook. For these, I like to appoint weekly "student scribes" to share their lecture notes with the rest of the class, so that there's some kind of written resource for students to double-check their understanding of the material. The notes get posted to a course website as PDFs.

image processing
This tutorial will show you how to transform an image with different filters and techniques to deliver different outputs. These methods are still in use and part of a process known as Computer-To-Plate (CTP), used to create a direct output from an image file to a photographic film or plate (depending on the process). Note - It's a pretty good article that makes uses of Python 3, Pillow and is well written.

This is one of the Weekly Python Chat live video chat events, hosted by Trey Hunner. This week Melanie Crutchfield and he are going to chat about things you'll wish you knew earlier when making your first website with Django. A must-watch for newbies building websites in Django.

If you are looking to implement 2 Factor Authentication as part of your product and don't know where to start read this.

Upcoming Conference / User Group Meet


streamlink - 59 Stars, 10 Fork
CLI for extracting streams from various websites to video player of your choosing - 40 Stars, 6 Fork
Generate new lyrics in the style of any artist using LSTMs and TensorFlow

October 26, 2016 01:39 AM

ImportPython Issue 90 - Real-time streaming data pipeline, generators, channels, and more

Worthy Read

Motorway is a real-time data pipeline, much like Apache Storm - but made in Python :-) We use it over at Plecto and we're really happy with it - but we're continuously developing it. The reason we started this project was that we wanted something similar to Storm, but without Zookeeper and without the need to take the pipeline down to update the topology.

This tutorial tries to teach event driven programming by making use of streaming API offered by twitter.

Try Hired and get in front of 4,000+ companies with one application. No more pushy recruiters, no more dead end applications and mismatched companies, Hired puts the power in your hands.

In this guide we'll cover generators in depth. We'll talk about how and why to use them, the difference between generator functions and regular functions, and the yield keyword, and we'll provide plenty of examples. This guide assumes you have a basic knowledge of Python (especially regular functions). Throughout this guide we are going to work towards solving a problem.
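The starting point of any such guide looks like this (a minimal sketch):

def countdown(n):
    # a generator function: calling it returns an iterator; each yield
    # hands back one value and pauses until the next one is requested
    while n > 0:
        yield n
        n -= 1

for value in countdown(3):
    print(value)  # 3, then 2, then 1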

Python 3.6.0b1 is the first of four planned beta releases of Python 3.6, the next major release of Python, and marks the end of the feature development phase for 3.6. There are quite a few new features; have a look.

The Django team is pleased to announce that the Channels project is now officially part of the Django project, under our new Official Projects program. Channels is the effort to bring WebSockets, long-poll HTTP, and other non-request-response protocol and business logic handling to Django, as part of our ongoing effort to establish what makes a useful web framework in 2016.

Django makes unit & functional testing easy (especially with WebTest). Tests on routing, permissions, database updates and emails are all straightforward to implement but how do you test dates & time? You might for example want to test regular email notifications.

The idea behind Channels is quite simple. To understand the concept, let’s first walk through an example scenario, let’s see how Channels would process a request.

Open source has proven its value in many ways over the years. In many companies that value is purely in terms of consuming available projects and platforms. In this episode Zalando describes their recent move to creating and releasing a number of their internal projects as open source and how that has benefited their business. We also discussed how they are leveraging Python and a couple of the libraries that they have published.

To win your copy of this book, all you need to do is come up with a comment below highlighting the reason “why you would like to win this book”. Try your luck guys :)

machine learning
The idea that only people with master's degrees or Ph.D.s work with machine learning professionally isn't true. The truth is you don't need much maths to get started with machine learning, and you don't need a degree to use it professionally. Here is Per Harald Borgen's journey. Yes, he is using Python.

I recently had to write nearly the same code in Go and Python on the same day, and I realized I had written it many times before in different languages. But it does point up some interesting language differences. This article explores many different ways to write the same code.


Nürnberg, Germany
Our technical-debt monster is getting ready for its next episode with us, and this time we won't defeat it without a Fullstack Magician. Together with a well-balanced team, you will descend into deep abysses and come back out with new experiences and skills.

Upcoming Conference / User Group Meet


NakedTensor - 53 Stars, 3 Fork
Bare bottom simplest example of machine learning in TensorFlow.

tensorflow_image_classifier - 15 Stars, 4 Fork
TensorFlow Image Classifier Demo by @Sirajology on Youtube

packyou - 10 Stars, 0 Fork
Import any python project from github easily

lambdazen - 7 Stars, 2 Fork
A better python lambda syntax based on runtime in-memory source rewriting

python-twitter-toolbox - 6 Stars, 1 Fork
Twitter Toolbox for Python.

pymail - 3 Stars, 1 Fork
:mailbox_with_mail: Command-line email client

export-kobo - 3 Stars, 0 Fork
A Python tool to export annotations and highlights from a Kobo SQLite file.

October 26, 2016 01:39 AM

ImportPython Issue 89

Worthy Read

Imagine in your company slack team there's this person (we'll call him Jeff). Everything that Jeff says is patently Jeff. Maybe you've even coined a term amongst your group: a Jeffism. What if you could program a Slack bot that randomly generates messages that were undeniably Jeff?

core python
Learn how to use Python’s ternary operator to create powerful “one-liners” and enhance logical constructions of your arguments.
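For reference, the ternary operator packs an if/else into a single expression (a minimal sketch):

age = 20
# <value if true> if <condition> else <value if false>
label = 'adult' if age >= 18 else 'minor'
print(label)  # adult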

Byterun is a Python interpreter implemented in Python. Through my work on Byterun, I was surprised and delighted to discover that the fundamental structure of the Python interpreter fits easily into the 500-line size restriction. This chapter will walk through the structure of the interpreter and give you enough context to explore it further. The goal is not to explain everything there is to know about interpreters—like so many interesting areas of programming and computer science, you could devote years to developing a deep understanding of the topic.

Try Hired and get in front of 4,000+ companies with one application. No more pushy recruiters, no more dead end applications and mismatched companies, Hired puts the power in your hands.

Note from curator - I met Alex at Pycon Singapore / Py APAC as it was called then, I found him inspirational. We sat down and talked about Java developer's obsession with design patterns. It was a blast. I wonder if he would remember. Here is a podcast where he is interviewed. Alex Martelli has dedicated a large part of his career to teaching others how to work with software. He has the highest number of Python questions answered on Stack Overflow, he has written and co-written a number of books on Python, and presented innumerable times at conferences in multiple countries. We spoke to him about how he got started in software, his work with Google, and the trends in development and design patterns that are shaping modern software engineering.

A Django site that integrates with Tesseract to provide an OCR service.

Tutorial on how to use the messages framework.

Wow, 3.x isn't far behind; a couple of years, maybe. I see more and more companies using the 3.x series for newer projects.

GeoViews is a new Python library that makes it easy to explore and visualize geographical, meteorological, oceanographic, weather, climate, and other real-world data. GeoViews was developed by Continuum Analytics, in collaboration with the Met Office. GeoViews is completely open source, available under a BSD license freely for both commercial and non-commercial use, and can be obtained as described at the Github site.

This week we welcome Reinout van Rees (@reinoutvanrees) as our PyDev of the Week! Reinout is the creator / maintainer of zest.releaser. He has a nice website that includes a Python blog you might want to check out. I would also recommend checking his Github page to see what projects he's a part of. Note - We have been including Reinout van Rees' blog posts in ImportPython for a long time now. Here you can learn more about the person behind the blog.

Whenever I am doing analysis with pandas, my first goal is to get data into a pandas DataFrame using one of the many available options. For the vast majority of instances, I use read_excel, read_csv, or read_sql. There are multiple methods you can use to take a standard Python data structure and create a pandas DataFrame. For the purposes of these examples, I'm going to create a DataFrame with 3 months of sales information for 3 fictitious companies.
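The simplest of those methods takes a list of dicts (a minimal sketch; the companies and numbers here are made up):

import pandas as pd

sales = [
    {'account': 'Jones LLC', 'Jan': 150, 'Feb': 200, 'Mar': 140},
    {'account': 'Alpha Co', 'Jan': 200, 'Feb': 210, 'Mar': 215},
    {'account': 'Blue Inc', 'Jan': 50, 'Feb': 90, 'Mar': 95},
]
df = pd.DataFrame(sales)
print(df)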

Go find how many you can answer

book review
Mike Driscoll's second book Python 201: Intermediate Python is out.

At PayPal, we write and deploy our fair share of Python, and we wanted to devote a couple minutes to our story and give credit where credit is due. For conclusion seekers, without doubt or further ado: Continuum Analytics’ Anaconda Python distribution has made our lives so much easier. For small- and medium-sized teams, no matter the deployment scale, Anaconda has big implications. But let’s talk about how we got here.

cssdbpy is a simple SSDB client written in Cython. Faster than the standard SSDB client.

Get an understanding of how to dockerize your Django application, using the Gunicorn web server, capable of serving thousands of requests in a minute.

code snippet
While using Python's os.path module in a project, I got the idea of using it to do a quick-and-dirty check for what drives exist on a Windows system. Actually, not really the physical drives, but the drive letters, which may in reality be mapped to any of the following: physical hard disk drives or logical partitions of them, CD or DVD drives, USB drives, or network-mapped drives.
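The heart of the recipe fits in a few lines (a minimal sketch for Windows):

import os
import string

# a drive letter "exists" if its root path does
drives = [letter + ':\\' for letter in string.ascii_uppercase
          if os.path.exists(letter + ':\\')]
print(drives)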

Note: I haven't personally gone through the video series, but the number of upvotes and views looks pretty decent. Please make your own judgement.

Upcoming Conference / User Group Meet


keras_snli - 77 Stars, 9 Fork
Keras model that tackles the Stanford Natural Language Inference (SNLI) corpus using summation and/or recurrent neural networks

commandlinefu_slackbot - 9 Stars, 0 Fork
This is a simple Slack bot that fetches search results from commandlinefu.com and displays them in Slack. It is based on the instructions given here.

word2vec-slim - 8 Stars, 0 Fork
word2vec Google News model slimmed down to 260k English words

pyh2o - 5 Stars, 0 Fork
The pyh2o module provides Python binding for the H2O HTTP server. Currently this is a toy project, PRs are welcome to make it useful. Think of high performance, interaction with asyncio, etc.

October 26, 2016 01:39 AM

ImportPython Issue 88

Worthy Read

doctest tests source code by running examples embedded in the documentation and verifying that they produce the expected results. It works by parsing the help text to find examples, running them, then comparing the output text against the expected value. Many developers find doctest easier to use than unittest because, in its simplest form, there is no API to learn before using it.
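In its simplest form it looks like this (a minimal sketch):

def square(x):
    """Return x squared.

    >>> square(3)
    9
    """
    return x * x

if __name__ == '__main__':
    import doctest
    doctest.testmod()  # silent on success; run with -v for a full report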

If you like this newsletter and you are on twitter you want to follow getpy. Daily get selected ( 4 - 5 ) tweets super relevant to Python.

We deploy all Django applications with Gunicorn and Supervisor. I personally prefer Gunicorn to uWSGI because it has better configuration options and more predictable performance. In this article we will be deploying a typical Django application. We won't be using async workers because we're just serving HTML and there are no heavy-lifting tasks in the background.
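For reference, the basic Gunicorn invocation for a Django project is a single line (a sketch; myproject is a made-up project name):

gunicorn myproject.wsgi:application --workers 4 --bind 127.0.0.1:8000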

What you see here is an early version of the book.

Today, services built on Python 3.5 using asyncio are widely used at Facebook. But as recently as May of 2014 it was actually impossible to use Python 3 at Facebook. Come learn how we cut the Gordian Knot of dependencies and social aversion to the point where new services are now being written in Python 3 and existing codebases have plans to move to Python 3.5.

Covered in this episode: Test Fixtures, Subcutaneous Testing, End to End Testing (System Testing). Curator's note - Of all the podcasts out there, pythontesting is my fav podcast.

Try Hired and get in front of 4,000+ companies with one application. No more pushy recruiters, no more dead end applications and mismatched companies, Hired puts the power in your hands.

core python
Simple tutorial with code snippets on zip.
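The gist (a minimal sketch):

names = ['ann', 'bob']
ages = [34, 29]
# zip pairs elements positionally and stops at the shortest iterable
for name, age in zip(names, ages):
    print(name, age)  # ann 34, then bob 29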

Welcome back! In this series of blog posts we are wrapping the awesome OnionScan tool and then analyzing the data that falls out of it. If you haven’t read parts one and two in this series then you should go do that first. In this post we are going to analyze our data in a new light by visualizing how hidden services are linked together as well as how hidden services are linked to clearnet sites. One of the awesome things that OnionScan does is look for links between hidden services and clearnet sites and makes these links available to us in the JSON output. Additionally it looks for IP address leaks or references to IP addresses that could be used for deanonymization.

How to deploy Django app on Google Cloud

In the four years since its initial release, many words have been spilt introducing conda and espousing its merits, but one thing I have consistently noticed is the number of misconceptions that seem to remain in the (often fervent) discussions surrounding this tool. I hope in this post to do a small part in putting these myths and misconceptions to rest.

It's a website to help you choose what movie you and your family/friends should watch together. Here is the code for the software

September is back, and it's time for the Montreal Python community to gather again and share exciting new technologies and projects. This month, our friends from Ubisoft are welcoming us into their offices and are going to present how they are using Python and scaling it at large to power some of their games.

core python
Although type comments work well enough, the fact that they're expressed through comments has some downsides. The majority of these issues can be alleviated by making the syntax a core part of the language. Read the PEP to know more. I think it is a very exciting PEP.
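The proposed syntax moves the annotation out of the comment and into the language itself (examples adapted from the PEP; this is Python 3.6 syntax):

primes: list = []    # annotated and initialized
captain: str         # annotation only; no value is assigned

class Starship:
    stats: dict = {} # class-level annotation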

Learn when and why you'd glue strings together using concatenation, interpolation, or other methods.


Mumbai, Maharashtra, India
Django development for both client and internal projects

Upcoming Conference / User Group Meet


srez - 133 Stars, 4 Fork
Image super-resolution through deep learning. This project uses deep learning to upscale 16x16 images by a 4x factor. The resulting 64x64 images display sharp features that are plausible based on the dataset that was used to train the neural net.

Young - 73 Stars, 6 Fork
A full-featured forum software built with love in python

httpstat - 48 Stars, 1 Fork
curl statistics made simple

NSC - 36 Stars, 5 Fork
Neural Sentiment Classification

yapi - 10 Stars, 1 Fork
Python Youtube Data API v3

imapclient - 10 Stars, 0 Fork
An easy-to-use, Pythonic and complete IMAP client library

google - 4 Stars, 0 Fork
A Python module for easily accessing Google data that sits behind a login.

json-algorithm - 4 Stars, 0 Fork
Now even your pet rock can parse JSON.

django-explain - 2 Stars, 0 Fork
A helper to get EXPLAIN or EXPLAIN ANALYZE output for Django querysets.

interview-with-python - 2 Stars, 0 Fork
The ultimate in python interview preparation and coding practice.

October 26, 2016 01:39 AM

ImportPython Issue 87

Worthy Read

Useful Youtube channel with short screencast/videos for Python developers to subscribe to. I learned on couple of sublime + Python tricks from here.

This Dockerfile shows you how to build a Docker container with a fairly standard and speedy setup for Django with uWSGI and Nginx.

curated list
I have read some interesting Python tutorials lately. I would love to share them with you.

Try Hired and get in front of 4,000+ companies with one application. No more pushy recruiters, no more dead end applications and mismatched companies, Hired puts the power in your hands.

web framework
Kyōkai is a fast asynchronous Python server-side web framework. It is built upon asyncio and the Asphalt framework for an extremely fast web server.

We recently upgraded our 160,000 lines of backend Python code from Python 2 to Python 3. We did it with zero downtime and no major errors! Here's how we did it; hopefully it will help anyone else still stuck on Python 2!

Bangalore user group meet with Python Automation as the theme

Kickstarter Campaign for wxPython Cookbook.

What happens when you take a tech-driven online fashion company that is experiencing explosive growth and infuse it with a deep open-source mission? You'll find out on this episode of Talk Python To Me. We'll meet Lauri Apple and Rafael Caricio from Zalando where developers there have published almost 200 open source projects on Github.

There are many ways to handle permissions in a project. For instance, we may have model-level permissions, object-level permissions, fine-grained user permissions, or role-based permissions. Either way, we don't need to write any of those from scratch; the Django ecosystem has a vast number of permission-handling apps that will help us with the task. In this post we will compare how some popular permission apps work so you know which one suits your project's needs.

image processing
Do you know what they are? If you are thinking of irrigation circles, you are wrong. Do not believe the lies of the conspirators. Those are, undoubtedly, proofs of extraterrestrial visitors on earth. As I want to be ready for the first contact I need to know where these guys are working. It should be easy with so many satellite images at hand. So I asked the machine learning experts around here to lend me a hand. Surprisingly, they refused. Mumbling I don’t know what about irrigation circles. Very suspicious. But something else they mentioned is that a better initial approach would be to use some computer-vision detection technique. Note - Code is here

Hopefully this post gave you some insight into why you should consider giving Python a go. This post comes from someone who feels "guilty" for having talked not so kindly about Python in the past and is now all over the hype train. In my defense, it was just a "personal preference thing"; when people asked me which language they should learn first, for instance, I usually suggested Python.

Upcoming Conference / User Group Meet


fuzzer - 82 Stars, 7 Fork
A Python interface to AFL, allowing for easy injection of testcases and other functionality.

MEAnalyzer - 31 Stars, 6 Fork
Intel Engine Firmware Analysis Tool

pybble - 24 Stars, 1 Fork
Python on Pebble

tensorflow_demo - 6 Stars, 2 Fork
Tensorflow Demo for my TF in 5 Min Video on Youtube

washer - 5 Stars, 0 Fork
A whoosh-based CLI indexer and searcher for your files.

October 26, 2016 01:39 AM

ImportPython Issue 86

Worthy Read

Python packaging is not bad any more. If you’re a developer, and you’re trying to create or consume Python libraries, it can be a tractable, even pleasant experience. A historical perspective of how it's evolved and where it stands today.

new release
Python 3.6.0a4 has been released. 3.6.0a4 is the last of four planned alpha pre-releases of Python 3.6, the next major release of Python. During the alpha phase, Python 3.6 remains under heavy development: additional features will be added and existing features may be modified or deleted. Please keep in mind that this is a preview release and its use is not recommended for production environments. Python 3.6.0 is planned to be released by the end of 2016. The first beta pre-release, 3.6.0b1, is planned for 2016-09-12.

Get in front of 4,000+ companies with one application. No more pushy recruiters, no more dead end applications and mismatched companies.

core python
Do you write programs in Python? You should be using attrs.
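
For anyone who hasn't tried it yet, here is a minimal sketch of what attrs buys you (the Point class and its fields are invented for illustration):

import attr

@attr.s
class Point(object):
    x = attr.ib(default=0)
    y = attr.ib(default=0)

# attrs writes __init__, __repr__, __eq__ and friends for you:
p = Point(x=1, y=2)
print(p)                  # Point(x=1, y=2)
print(p == Point(1, 2))   # True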

In a Django model, the Manager is the interface that interacts with the database. By default the manager is available through the Model.objects property. The default manager every Django model gets out of the box is the django.db.models.Manager. It is very straightforward to extend it and change the default manager.
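
As a rough sketch of what that extension looks like (the Article model and its published flag are made up for illustration):

from django.db import models

class PublishedManager(models.Manager):
    def get_queryset(self):
        # Narrow the default queryset down to published rows only
        return super(PublishedManager, self).get_queryset().filter(published=True)

class Article(models.Model):
    title = models.CharField(max_length=200)
    published = models.BooleanField(default=False)

    objects = models.Manager()   # the default manager
    live = PublishedManager()    # Article.live.all() returns published articles only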

core python
The dis module includes functions for working with Python bytecode by “disassembling” it into a more human-readable form. Reviewing the bytecodes being executed by the interpreter is a good way to hand-tune tight loops and perform other kinds of optimizations. It is also useful for finding race conditions in multi-threaded applications, since it can be used to estimate the point in the code where thread control may switch.
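
A quick taste of the module on a trivial function:

import dis

def add(a, b):
    return a + b

# Prints one line per bytecode instruction, e.g.
# LOAD_FAST, LOAD_FAST, BINARY_ADD, RETURN_VALUE
dis.dis(add)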

Are you the one?

machine learning
One of the pivotal moments in my professional development this year came when I discovered Coursera. I'd heard of the "MOOC" phenomenon but had not had the time to dive in and take a class. Earlier this year I finally pulled the trigger and signed up for Andrew Ng's Machine Learning class. I completed the whole thing from start to finish, including all of the programming exercises. The experience opened my eyes to the power of this type of education platform, and I've been hooked ever since.

core python
Explains Context Manager using "Making a sandwich" as an example.
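
In that spirit, a sandwich-flavoured sketch of my own (not necessarily the article's code): the setup and teardown are the bread, and whatever happens inside the with block is the filling.

from contextlib import contextmanager

@contextmanager
def sandwich():
    print('bottom slice of bread')    # setup runs on entering the block
    try:
        yield 'filling'
    finally:
        print('top slice of bread')   # teardown runs even if the block raises

with sandwich() as filling:
    print('eating the', filling)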

embedded systems
The Wemos D1 mini OLED shield is a 64x48 OLED screen that can be mounted on the D1 mini really easily. The screen has an I2C interface and is driven by an SSD1306 chip, which is thankfully supported by MicroPython. Full details, code snippets, and schematics can be found in this article.
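
A minimal sketch of driving the shield with MicroPython's bundled ssd1306 driver (the I2C pins below are the usual ESP8266 defaults; check the article's schematics for your actual wiring):

from machine import I2C, Pin
import ssd1306

i2c = I2C(scl=Pin(5), sda=Pin(4))          # default ESP8266 I2C pins
oled = ssd1306.SSD1306_I2C(64, 48, i2c)    # the shield's 64x48 resolution
oled.fill(0)
oled.text('Hello', 0, 0)
oled.show()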

Brandon Rhodes on Twitter: “Welcome to the new !” Thanks to the original maintainers, the new, & the PSF for this site!


White Plains, NY, United States
We are currently inviting applications from Python programmers with an analytics background for a contract with our client, a global IT consulting firm, at their office in White Plains, NY. This is a part-time position.

Bangalore, Karnataka, India
At CallHub, we help political parties and advocacy groups with their campaigns and causes, using our award-winning cloud-based telephony platform. Customers around the globe use CallHub to reach people quickly via phone calls and text messages. Our customers include Uber, Accenture, and political parties in the UK, France, Australia and the US.

Delft, Netherlands
PLAXIS applications are used to create and manipulate models of soil and structures (for example foundations of high rise office towers, tunnels, dikes). They offer diverse functionalities, such as: highly interactive CAD-like graphical user input (both 2D and 3D), geometric calculations, intuitive visualization of calculation results and interfacing with finite element kernels.

Upcoming Conference / User Group Meet


PokemonGo-TSP - 65 Stars, 11 Fork
Solving TSP with Simulated Annealing

curlify - 15 Stars, 0 Fork
A library to convert a Python requests request object into a curl command.

buildreport - 13 Stars, 1 Fork
Github pull request summary report for android builds

- 8 Stars, 0 Fork
A simple retry library

atom-tracer - 8 Stars, 0 Fork
A language agnostic Atom package for tracing variables inline!

pokemon-csv-to-map - 8 Stars, 0 Fork
A tool useful for Pokemon Go maps, such as pogom or PokemonGo-Map.

geotagger - 7 Stars, 0 Fork
Geotag your photos taken with a GPS-free camera using your smartphone location history data.

October 26, 2016 01:39 AM

ImportPython Issue 85

Worthy Read

Get in front of 4,000+ companies with one application. No more pushy recruiters, no more dead end applications and mismatched companies.

When you ask for editor recommendations as a Python developer one of the top choices you’ll hear about is Sublime Text. In this post I’ll review the status of Python development with Sublime Text as of 2016.

This week we interviewed Peter McCormick and Francis Deslauriers about their work organizing PyCon Canada to provide a venue for Canadians to talk about how they are using the language. If you happen to be near Toronto in November then you should get a ticket and help contribute to their success.

asyncpg is a new fully-featured open-source Python client library for PostgreSQL. It is built specifically for asyncio and Python 3.5 async / await. asyncpg is the fastest driver among common Python, NodeJS and Go implementations.
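
A minimal sketch of querying with it (the connection details and the users table are placeholders):

import asyncio
import asyncpg

async def main():
    conn = await asyncpg.connect(user='user', password='secret',
                                 database='mydb', host='localhost')
    # $1-style placeholders are asyncpg's native parameter syntax
    rows = await conn.fetch('SELECT id, name FROM users WHERE id > $1', 10)
    await conn.close()
    return rows

print(asyncio.get_event_loop().run_until_complete(main()))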

Flake8 is a Python library that wraps PyFlakes, pycodestyle and Ned Batchelder’s McCabe script. It is a great toolkit for checking your code base against coding style (PEP8), programming errors (like “library imported but unused” and “Undefined name”) and to check cyclomatic complexity.
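
For example, running flake8 over a toy file like this one reports the issues noted in the comments:

# lint_me.py -- run: flake8 lint_me.py
import os                        # F401 'os' imported but unused


def greet(name):
    message='Hello %s' % name    # E225 missing whitespace around operator
    return message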

(Hopefully) the future of network protocols in Python. I think it's important to promote this approach to implementing network protocols, to the point that I have created a page at to act as a reference of libraries that have followed the approach I've outlined here. Basically, this means that network protocol libraries will need to be rewritten so that they can be used with both synchronous and asynchronous I/O.
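
The core of the idea, in a toy sketch of my own: the protocol object only turns bytes into events and never touches a socket, so blocking and asynchronous servers can drive the exact same parsing code.

class LineProtocol(object):
    """A 'sans-io' line protocol: feed bytes in, get complete lines out."""

    def __init__(self):
        self._buffer = b''

    def receive_data(self, data):
        self._buffer += data
        lines = self._buffer.split(b'\n')
        self._buffer = lines.pop()   # keep any trailing partial line buffered
        return lines

# A blocking server would call proto.receive_data(sock.recv(4096));
# an asyncio.Protocol would call it from data_received() -- same parser.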

Security is something we often ignore until it is too late. However, there are some easy things you can do right now to increase your security, and using django-admin-honeypot is one of them. It is super easy to set up and gives you a way to track who is trying to access your site.
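
The setup is roughly this (a sketch based on the project's README; the 'secret/' prefix is a placeholder you would pick yourself, and 'admin_honeypot' also needs to go into INSTALLED_APPS):

# urls.py
from django.conf.urls import include, url
from django.contrib import admin

urlpatterns = [
    # Fake admin at the obvious URL, logging every login attempt
    url(r'^admin/', include('admin_honeypot.urls', namespace='admin_honeypot')),
    # Real admin hidden behind a non-obvious prefix
    url(r'^secret/', include(admin.site.urls)),
]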

django admin panel
The Django admin is a very powerful tool. We use it for day to day operations, browsing data and support. As we grew some of our projects from zero to 100K+ users we started experiencing some of Django's admin pain points: long response times and heavy load on the database.
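
Two common mitigations for a slow admin changelist, sketched here with an invented Event model (the article may well cover different fixes):

from django.contrib import admin
from myapp.models import Event   # hypothetical model with a 'venue' foreign key

@admin.register(Event)
class EventAdmin(admin.ModelAdmin):
    list_select_related = ('venue',)   # avoid one extra query per row for FK columns
    show_full_result_count = False     # skip the expensive COUNT(*) over the whole table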

In this talk, Jess Bowden introduces the area of NLP (Natural Language Processing) and a basic introduction of its principles. She uses Python and some of its fundamental NLP packages, such as NLTK, to illustrate examples and topics, demonstrating how to get started with processing and analysing Natural Languages. She also looks at what NLP can be used for, a broad overview of the sub-topics, and how to get yourself started with a demo project.

I set up a benchmark, which can be found here, to compare Python's datetime, Arrow, Pendulum, Delorean and udatetime at the performance level. I picked six typical performance-critical operations to measure the speed of those libraries: decoding a date-time string, encoding (serializing) a date-time string, and instantiating an object from the current time in UTC, the current time in the local timezone, a timestamp in UTC, and a timestamp in the local timezone.
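
One such measurement can be reproduced back-of-the-envelope with timeit (udatetime.utcnow per its README; absolute numbers will vary by machine):

import timeit

stdlib = timeit.timeit('datetime.utcnow()',
                       setup='from datetime import datetime', number=1000000)
fast = timeit.timeit('udatetime.utcnow()',
                     setup='import udatetime', number=1000000)
print('datetime: %.2fs   udatetime: %.2fs' % (stdlib, fast))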

Mozilla recently decided to award $200,000 to Baroque Software to work on PyPy as part of its Mozilla Open Source Support (MOSS) initiative. This money will be used to implement the Python 3.5 features in PyPy. Within the next year, we plan to use the money to pay four core PyPy developers half-time to work on the missing features and on some of the big performance and cpyext issues.

Katerina Kampardi is a Web Applications Developer from Greece who works as a freelancer. Like many aspiring developers, Katerina is self-taught and got her start with online tutorials. She later attended a Python Specialization. Today, she works on various Django projects as an independent developer.

Upcoming Conference / User Group Meet


colornet - 1885 Stars, 67 Fork
Neural Network to colorize grayscale images

tflearn - 1313 Stars, 50 Fork
Deep learning library featuring a higher-level API for TensorFlow.

DeepDreamVideo - 773 Stars, 94 Fork
implementing deep dream on video

dcgan-completion.tensorflow - 54 Stars, 9 Fork
Image Completion with Deep Learning in TensorFlow

fasttext - 44 Stars, 3 Fork
fasttext is a Python interface for Facebook fastText.

NBA-Player-Movements - 37 Stars, 3 Fork
Visualization of NBA games from raw SportVU logs

pic2text - 15 Stars, 12 Fork
A script to transform pictures into text

NaiveBayesClassifier - 3 Stars, 1 Fork
A Naive Bayes classifier implemented in Python 2.7

October 26, 2016 01:39 AM