
Planet Python

Last update: December 19, 2014 01:45 PM

December 19, 2014


Django Weblog

DjangoCon Europe 2015

2015's DjangoCon Europe will take place in Cardiff, Wales, from the 2nd to the 7th June, for six days of talks, tutorials and code. Here’s a snapshot of what's in store.

Six whole days

For the first time ever, we’re holding a six-day DjangoCon.

The event will begin with an open day. All sessions on the open day - talks and tutorials of various kinds - will be free, and open to the public. (This follows the example of Django Weekend Cardiff, where the open day proved hugely successful.)

The open day will:

  • help introduce Django to a wider audience
  • give newer members of the community a head start, to help them get the most from the following five days

There’ll be a DjangoGirls workshop on our open day, and lots more besides.

Two days of code

Following the three days of talks, we won’t just have two days of sprints: they will be two days of code: code sprints, code clinics and workshops. We want everyone to have a reason to stay on after the talks, and participate, whatever their level of experience.

All these sessions will be free of charge to conference attendees. Some of them will be worth the ticket price of the entire conference on their own.

Values

We aim to put on a first-class technical conference. We also want the event to embody three very important values:

Diversity

We aim to make DjangoCon Europe 2015 a milestone in the Django community's effort to improve diversity.

Accessibility

We want this DjangoCon to set the highest standards for accessibility, and to ensure that we do not inadvertently exclude anyone from participating fully in the event.

Social responsibility

DjangoCon Europe 2015 expresses the Django community's values of fairness, respect and consideration as undertakings of social responsibility.

Cardiff

We’re sure you’ll enjoy the city and our venues, and we're looking forward to welcoming you in June. Here's some comprehensive information on how to get to Cardiff.

Social events

We have a number of social events planned - more details will be published soon.

Registration, call for papers and other key milestones

Here's a list of key dates. Ticket prices will be published when registration opens.

And finally

We're seeking sponsorship, of course, and would love to hear from any organisations that can contribute financial support to the event.

There's contact information on the website. If there's anything you want to know, you need only ask - the organising committee are at your disposal.

December 19, 2014 09:34 AM


Python Anywhere

New PythonAnywhere update: Mobile, UI, packages, reliability, and the dreaded EU VAT change

We released a bunch of updates to PythonAnywhere today :-) Short version: we've made some improvements to the iPad and Android experience, applied fixes to our in-browser console, added a bunch of new pre-installed packages, done a big database upgrade that should make unplanned outages rarer and shorter, and made changes required by EU VAT legislation (EU customers will soon be charged their local VAT rate instead of UK VAT).

Here are the details:

iPad and Android

User interface

New packages

We've added loads of new packages to our "batteries included" list:

Additionally, for people who like alternative shells to the ubiquitous bash, we've added fish.

Reliability improvement

We've upgraded one of our underlying infrastructural databases to SSD storage. We've had a couple of outages recently caused by problems with this database, which were made much worse by the fact that it took a long time to start up after a failover. Moving it to SSD moved it to new hardware (which we think will make it less likely to fail) and will also mean that if it does fail, it should recover much faster.

EU VAT changes

For customers outside the EU, this won't change anything. But for non-business customers inside the EU, starting 1 January 2015, we'll be charging you VAT at the rate for your country, instead of using the UK VAT rate of 20%. This is the result of some (we think rather badly-thought-through) new EU legislation. We'll write an extended post about this sometime soon.

December 19, 2014 09:23 AM


Gaël Varoquaux

PRNI 2016: call for organization

PRNI (Pattern Recognition for NeuroImaging) is an IEEE conference about applying pattern recognition and machine learning to brain imaging. It is a mid-sized conference (about 150 attendees), and is a satellite of OHBM (the annual “Human Brain Mapping” meeting).

The steering committee is calling for bids to organize the conference in June 2016, in Europe, as a satellite of the OHBM meeting in Geneva.

December 19, 2014 01:38 AM

December 18, 2014


Invent with Python

Cover for "Automate the Boring Stuff with Python" is Released

The final cover design for my next book, Automate the Boring Stuff with Python, is available on the No Starch Press website.

Cover of Automate the Boring Stuff with Python

I'm super excited. :D

December 18, 2014 11:59 PM


Omaha Python Users Group

December Meeting Notes

We had a great meeting which included a raffle of the awesome Python book, Learning Python 4th Edition.  Thanks to O’Reilly Publishing.

Further topics included, in no particular order:

Merry Christmas to all!

December 18, 2014 11:51 PM


Graham Dumpleton

Apache/mod_wsgi for Windows on life support.

The critics will say that Apache/mod_wsgi as a whole is on life support and not worth saving. I have a history of creating Open Source projects that no one much wants to use, or get bypassed over time, but I am again having lots of fun with Apache/mod_wsgi so I don't particularly care what they may have to say right now. I do have to say though, that right now it looks like continued support for

December 18, 2014 10:41 PM


Chris Moffitt

Combining Data From Multiple Excel Files

Introduction

A common task for python and pandas is to automate the process of aggregating data from multiple files and spreadsheets.

This article will walk through the basic flow required to parse multiple Excel files, combine the data, clean it up and analyze it. The combination of python + pandas can be extremely powerful for these activities and can be a very useful alternative to the manual processes or painful VBA scripts frequently used in business settings today.

The Problem

Before I get into the examples, here is a simple diagram showing the challenges of the common process used in businesses all over the world to consolidate data from multiple Excel files, clean it up and perform some analysis.

Excel file processing

If you’re reading this article, I suspect you have experienced some of the problems shown above. Cutting and pasting data or writing painful VBA code will quickly get old. There has to be a better way!

Python + pandas can be a great alternative that is much more scalable and powerful.

Excel file processing with pandas

By using a python script, you can develop a more streamlined and repeatable solution to your data processing needs. The rest of this article will show a simple example of how this process works. I hope it will give you ideas of how to apply these tools to your unique situation.

Collecting the Data

If you are interested in following along, here are the Excel files and a link to the notebook:

The first step in the process is collecting all the data into one place.

First, import pandas and numpy

import pandas as pd
import numpy as np

Let’s take a look at the files in our input directory, using the convenient shell commands in IPython.

!ls ../in
address-state-example.xlsx  report.xlsx                sample-address-new.xlsx
customer-status.xlsx            sales-feb-2014.xlsx    sample-address-old.xlsx
excel-comp-data.xlsx            sales-jan-2014.xlsx    sample-diff-1.xlsx
my-diff-1.xlsx                  sales-mar-2014.xlsx    sample-diff-2.xlsx
my-diff-2.xlsx                  sample-address-1.xlsx  sample-salesv3.xlsx
my-diff.xlsx                    sample-address-2.xlsx
pricing.xlsx                    sample-address-3.xlsx

There are a lot of files, but we only want to look at the sales .xlsx files.

!ls ../in/sales*.xlsx
../in/sales-feb-2014.xlsx  ../in/sales-jan-2014.xlsx  ../in/sales-mar-2014.xlsx

Use the python glob module to easily list out the files we need.

import glob
glob.glob("../in/sales*.xlsx")
['../in/sales-jan-2014.xlsx',
 '../in/sales-mar-2014.xlsx',
 '../in/sales-feb-2014.xlsx']

This gives us what we need. Let’s import each of our files and combine them into one file. The pandas concat and append functions can do this for us. I’m going to use append in this example.

The code snippet below will initialize a blank DataFrame then append all of the individual files into the all_data DataFrame.

all_data = pd.DataFrame()
for f in glob.glob("../in/sales*.xlsx"):
    df = pd.read_excel(f)
    all_data = all_data.append(df,ignore_index=True)
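As an aside, the same result can be produced with concat instead of repeated appends. A minimal sketch (my own illustration, not from the original notebook):

frames = [pd.read_excel(f) for f in glob.glob("../in/sales*.xlsx")]
all_data = pd.concat(frames, ignore_index=True)  # one concatenation instead of repeated appends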

Now we have all the data in our all_data DataFrame. You can use describe to look at it and make sure your data looks good.

all_data.describe()
       account number     quantity   unit price    ext price
count     1742.000000  1742.000000  1742.000000  1742.000000
mean    485766.487945    24.319173    54.985454  1349.229392
std     223750.660792    14.502759    26.108490  1094.639319
min     141962.000000    -1.000000    10.030000   -97.160000
25%     257198.000000    12.000000    32.132500   468.592500
50%     527099.000000    25.000000    55.465000  1049.700000
75%     714466.000000    37.000000    77.607500  2074.972500
max     786968.000000    49.000000    99.850000  4824.540000

A lot of this data may not make much sense for this data set but I’m most interested in the count row to make sure the number of data elements makes sense. In this case, I see all the data rows I expect.

all_data.head()
account number name sku quantity unit price ext price date
0 740150 Barton LLC B1-20000 39 86.69 3380.91 2014-01-01 07:21:51
1 714466 Trantow-Barrows S2-77896 -1 63.16 -63.16 2014-01-01 10:00:47
2 218895 Kulas Inc B1-69924 23 90.70 2086.10 2014-01-01 13:24:58
3 307599 Kassulke, Ondricka and Metz S1-65481 41 21.05 863.05 2014-01-01 15:05:22
4 412290 Jerde-Hilpert S2-34077 6 83.21 499.26 2014-01-01 23:26:55

It is not critical in this example, but the best practice is to convert the date column to a datetime object.

all_data['date'] = pd.to_datetime(all_data['date'])

Combining Data

Now that we have all of the data into one DataFrame, we can do any manipulations the DataFrame supports. In this case, the next thing we want to do is read in another file that contains the customer status by account. You can think of this as a company’s customer segmentation strategy or some other mechanism for identifying their customers.

First, we read in the data.

status = pd.read_excel("../in/customer-status.xlsx")
status
account number name status
0 740150 Barton LLC gold
1 714466 Trantow-Barrows silver
2 218895 Kulas Inc bronze
3 307599 Kassulke, Ondricka and Metz bronze
4 412290 Jerde-Hilpert bronze
5 729833 Koepp Ltd silver
6 146832 Kiehn-Spinka silver
7 688981 Keeling LLC silver
8 786968 Frami, Hills and Schmidt silver
9 239344 Stokes LLC gold
10 672390 Kuhn-Gusikowski silver
11 141962 Herman LLC gold
12 424914 White-Trantow silver
13 527099 Sanford and Sons bronze
14 642753 Pollich LLC bronze
15 257198 Cronin, Oberbrunner and Spencer gold

We want to merge this data with our concatenated data set of sales. Use pandas’ merge function and tell it to do a left join, which is similar to Excel’s VLOOKUP function.

all_data_st = pd.merge(all_data, status, how='left')
all_data_st.head()
account number name sku quantity unit price ext price date status
0 740150 Barton LLC B1-20000 39 86.69 3380.91 2014-01-01 07:21:51 gold
1 714466 Trantow-Barrows S2-77896 -1 63.16 -63.16 2014-01-01 10:00:47 silver
2 218895 Kulas Inc B1-69924 23 90.70 2086.10 2014-01-01 13:24:58 bronze
3 307599 Kassulke, Ondricka and Metz S1-65481 41 21.05 863.05 2014-01-01 15:05:22 bronze
4 412290 Jerde-Hilpert S2-34077 6 83.21 499.26 2014-01-01 23:26:55 bronze

This looks pretty good but let’s look at a specific account.

all_data_st[all_data_st["account number"]==737550].head()
account number name sku quantity unit price ext price date status
9 737550 Fritsch, Russel and Anderson S2-82423 14 81.92 1146.88 2014-01-03 19:07:37 NaN
14 737550 Fritsch, Russel and Anderson B1-53102 23 71.56 1645.88 2014-01-04 08:57:48 NaN
26 737550 Fritsch, Russel and Anderson B1-53636 42 42.06 1766.52 2014-01-08 00:02:11 NaN
32 737550 Fritsch, Russel and Anderson S1-27722 20 29.54 590.80 2014-01-09 13:20:40 NaN
42 737550 Fritsch, Russel and Anderson S1-93683 22 71.68 1576.96 2014-01-11 23:47:36 NaN

This account number was not in our status file, so we have a bunch of NaNs. We can decide how we want to handle this situation. For this specific case, let’s label all missing accounts as bronze. Use the fillna function to easily accomplish this on the status column.

all_data_st['status'].fillna('bronze',inplace=True)
all_data_st.head()
account number name sku quantity unit price ext price date status
0 740150 Barton LLC B1-20000 39 86.69 3380.91 2014-01-01 07:21:51 gold
1 714466 Trantow-Barrows S2-77896 -1 63.16 -63.16 2014-01-01 10:00:47 silver
2 218895 Kulas Inc B1-69924 23 90.70 2086.10 2014-01-01 13:24:58 bronze
3 307599 Kassulke, Ondricka and Metz S1-65481 41 21.05 863.05 2014-01-01 15:05:22 bronze
4 412290 Jerde-Hilpert S2-34077 6 83.21 499.26 2014-01-01 23:26:55 bronze

Check the data just to make sure we’re all good.

all_data_st[all_data_st["account number"]==737550].head()
account number name sku quantity unit price ext price date status
9 737550 Fritsch, Russel and Anderson S2-82423 14 81.92 1146.88 2014-01-03 19:07:37 bronze
14 737550 Fritsch, Russel and Anderson B1-53102 23 71.56 1645.88 2014-01-04 08:57:48 bronze
26 737550 Fritsch, Russel and Anderson B1-53636 42 42.06 1766.52 2014-01-08 00:02:11 bronze
32 737550 Fritsch, Russel and Anderson S1-27722 20 29.54 590.80 2014-01-09 13:20:40 bronze
42 737550 Fritsch, Russel and Anderson S1-93683 22 71.68 1576.96 2014-01-11 23:47:36 bronze

Now we have all of the data along with the status column filled in. We can do our normal data manipulations using the full suite of pandas capability.

Using Categories

One of the relatively new functions in pandas is support for categorical data. From the pandas documentation:

Categoricals are a pandas data type, which correspond to categorical variables in statistics: a variable, which can take on only a limited, and usually fixed, number of possible values (categories; levels in R). Examples are gender, social class, blood types, country affiliations, observation time or ratings via Likert scales.

For our purposes, the status field is a good candidate for a category type.

Version Warning
You must make sure you have a recent version of pandas (0.15 or later) installed for this example to work.
pd.__version__
'0.15.2'

First, we cast the column to a category using astype.

all_data_st["status"] = all_data_st["status"].astype("category")

This doesn’t immediately appear to change anything yet.

all_data_st.head()
account number name sku quantity unit price ext price date status
0 740150 Barton LLC B1-20000 39 86.69 3380.91 2014-01-01 07:21:51 gold
1 714466 Trantow-Barrows S2-77896 -1 63.16 -63.16 2014-01-01 10:00:47 silver
2 218895 Kulas Inc B1-69924 23 90.70 2086.10 2014-01-01 13:24:58 bronze
3 307599 Kassulke, Ondricka and Metz S1-65481 41 21.05 863.05 2014-01-01 15:05:22 bronze
4 412290 Jerde-Hilpert S2-34077 6 83.21 499.26 2014-01-01 23:26:55 bronze

But you can see that it is a new data type.

all_data_st.dtypes
account number             int64
name                      object
sku                       object
quantity                   int64
unit price               float64
ext price                float64
date              datetime64[ns]
status                  category
dtype: object

Categories get more interesting when you assign order to the categories. Right now, if we call sort on the column, it will sort alphabetically.

all_data_st.sort(columns=["status"]).head()
account number name sku quantity unit price ext price date status
1741 642753 Pollich LLC B1-04202 8 95.86 766.88 2014-02-28 23:47:32 bronze
1232 218895 Kulas Inc S1-06532 29 42.75 1239.75 2014-09-21 11:27:55 bronze
579 527099 Sanford and Sons S1-27722 41 87.86 3602.26 2014-04-14 18:36:11 bronze
580 383080 Will LLC B1-20000 40 51.73 2069.20 2014-04-14 22:44:58 bronze
581 383080 Will LLC S2-10342 15 76.75 1151.25 2014-04-15 02:57:43 bronze

We use set_categories to tell it the order we want to use for this category object. In this case, we use the Olympic medal ordering.

all_data_st["status"].cat.set_categories([ "gold","silver","bronze"],inplace=True)

Now, we can sort it so that gold shows on top.

all_data_st.sort(columns=["status"]).head()
account number name sku quantity unit price ext price date status
0 740150 Barton LLC B1-20000 39 86.69 3380.91 2014-01-01 07:21:51 gold
1193 257198 Cronin, Oberbrunner and Spencer S2-82423 23 52.90 1216.70 2014-09-09 03:06:30 gold
1194 141962 Herman LLC B1-86481 45 52.78 2375.10 2014-09-09 11:49:45 gold
1195 257198 Cronin, Oberbrunner and Spencer B1-50809 30 51.96 1558.80 2014-09-09 21:14:31 gold
1197 239344 Stokes LLC B1-65551 43 15.24 655.32 2014-09-10 11:10:02 gold

Analyze Data

The final step in the process is to analyze the data. Now that it is consolidated and cleaned, we can see if there are any insights to be learned.

all_data_st["status"].describe()
count       1742
unique         3
top       bronze
freq         764
Name: status, dtype: object

For instance, if you want to take a quick look at how your top-tier customers are performing compared to the bottom, use groupby to get the average of the values.

all_data_st.groupby(["status"])["quantity","unit price","ext price"].mean()
         quantity  unit price    ext price
status
gold    24.680723   52.431205  1325.566867
silver  23.814241   55.724241  1339.477539
bronze  24.589005   55.470733  1367.757736

Of course, you can run multiple aggregation functions on the data to get really useful information.

all_data_st.groupby(["status"])["quantity","unit price","ext price"].agg([np.sum,np.mean, np.std])
       quantity                          unit price                           ext price
            sum       mean        std          sum       mean        std            sum         mean          std
status
gold       8194  24.680723  14.478670     17407.16  52.431205  26.244516      440088.20  1325.566867  1074.564373
silver    15384  23.814241  14.519044     35997.86  55.724241  26.053569      865302.49  1339.477539  1094.908529
bronze    18786  24.589005  14.506515     42379.64  55.470733  26.062149     1044966.91  1367.757736  1104.129089

So, what does this tell you? Well, the data is completely random but my first observation is that we sell more units to our bronze customers than gold. Even when you look at the total dollar value associated with bronze vs. gold, it looks odd that we sell more to bronze customers than gold.

Maybe we should look at how many bronze customers we have and see what is going on?

What I plan to do is filter out the unique accounts and see how many gold, silver and bronze customers there are.

I’m purposely stringing a lot of commands together, which is not necessarily best practice, but it does show how powerful pandas can be. Feel free to review my previous articles here and here to understand it better. Play with this command yourself to understand how the commands interact.

all_data_st.drop_duplicates(subset=["account number","name"]).ix[:,[0,1,7]].groupby(["status"])["name"].count()
status
gold      4
silver    7
bronze    9
Name: name, dtype: int64

Ok. This makes a little more sense. We see that we have 9 bronze customers and only 4 gold customers. That is probably why the volumes are so skewed towards our bronze customers. This result makes sense given the fact that we defaulted to bronze for many of our customers. Maybe we should reclassify some of them? Obviously this data is fake, but hopefully this shows how you can use these tools to quickly analyze your own data.

Conclusion

This example only covered the aggregation of 4 simple Excel files containing random data. However, the principles can be applied to much larger data sets while keeping the code base very manageable. Additionally, you have the full power of python at your fingertips, so you can do much more than simply manipulate the data.

I encourage you to try some of these concepts out on your scenarios and see if you can find a way to automate that painful Excel task that hangs over your head every day, week or month.

Good luck!

December 18, 2014 03:20 AM


Invent with Python

Programming a Bot to Play the "Sushi Go Round" Flash Game

This tutorial teaches how to write a bot that can automatically play the Flash game Sushi Go Round. The concepts in this tutorial can be applied to make bots that play other games as well. It's inspired by the tutorial How to Build a Python Bot That Can Play Web Games by Chris Kiehl. The primary improvement of this tutorial is that it uses the cross-platform PyAutoGUI module to control the mouse and take screenshots. PyAutoGUI is documented on its ReadTheDocs page.

Sushi Go Round is a resource management game similar to "Dine N Dash". You fill customer orders for different types of sushi and place them on the conveyor belt. Incoming customers may take other customers' sushi orders, forcing you to remake orders. Customers who wait too long to get their orders end up leaving, costing you reputation. Angry customers can be placated with sake, but this bot does not make use of that feature. Ingredients will have to be ordered as they get low.

I've refined this bot so that it can successfully play all the way through the game, ending up with a score of about 38,000. The top scores are a little over 100,000, so there is still room for improvement with this bot. You can watch a YouTube video of the bot playing.

The source code and images for the bot can be downloaded here or viewed on GitHub.

Step 1 - Install PyAutoGUI and Download Images

First, install PyAutoGUI by downloading it from PyPI or installing it through the pip program that comes with Python 3. If you've installed PyAutoGUI before, use the -U option to update it to the latest version. This tutorial uses Python 3, though PyAutoGUI and the bot code work with either Python 2 or 3.

You can check if PyAutoGUI has been installed correctly by running import pyautogui in the interactive shell. Full documentation for PyAutoGUI is available at http://pyautogui.readthedocs.org.
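For example, a quick check in the interactive shell might look like this (pyautogui.size() is a standard PyAutoGUI function that reports the screen resolution; this snippet is illustrative, not part of the bot):

>>> import pyautogui   # no ImportError means PyAutoGUI is installed
>>> pyautogui.size()   # returns the screen resolution as a (width, height) tuple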

PyAutoGUI provides basic image recognition by finding the screen coordinates of a provided image. To save yourself the time of taking screenshots and making these images yourself, download them from the zip file here. All of the images should be stored in an /images folder:

Step 2 - Basic Setup

Create a sushigoround.py file and enter the following code:

#! python3
"""Sushi Go Round Bot
Al Sweigart al@inventwithpython.com @AlSweigart

A bot program to automatically play the Sushi Go Round flash game at http://miniclip.com/games/sushi-go-round/en/
"""

import pyautogui, time, os, logging, sys, random, copy

logging.basicConfig(level=logging.DEBUG, format='%(asctime)s.%(msecs)03d: %(message)s', datefmt='%H:%M:%S')
#logging.disable(logging.DEBUG) # uncomment to block debug log messages

This generic code sets up a shebang line, some comments to describe the program, imports for several modules, and logging configuration so that the logging.debug() function will output debug messages. (I explain why logging is preferable to print() calls in this blog post.)

Step 3 - Constants Setup

Next, there are several variable constants set up for this program. I use constants as a way to detect typos; mistyping a string value as 'califrnia_roll' will cause bugs but Python won't directly point out the typo. Mistyping a variable name as CALIFRNIA_ROLL will cause Python to raise a NameError exception, letting you quickly fix it.
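As a hypothetical illustration (not part of the bot's code), here's the difference in how the two kinds of typo fail:

CALIFORNIA_ROLL = 'california_roll'
orders = {}
orders['califrnia_roll'] = 1       # typo inside a string literal: Python accepts it silently, creating a subtle bug
try:
    orders[CALIFRNIA_ROLL] = 1     # typo in a constant name: raises NameError immediately
except NameError as err:
    print(err)                     # name 'CALIFRNIA_ROLL' is not defined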

First, set up constants for each type of order in the game, as well as a tuple of all the orders:

# Food order constants (don't change these: the image filenames depend on these specific values)
ONIGIRI = 'onigiri'
GUNKAN_MAKI = 'gunkan_maki'
CALIFORNIA_ROLL = 'california_roll'
SALMON_ROLL = 'salmon_roll'
SHRIMP_SUSHI = 'shrimp_sushi'
UNAGI_ROLL = 'unagi_roll'
DRAGON_ROLL = 'dragon_roll'
COMBO = 'combo'
ALL_ORDER_TYPES = (ONIGIRI, GUNKAN_MAKI, CALIFORNIA_ROLL, SALMON_ROLL, SHRIMP_SUSHI, UNAGI_ROLL, DRAGON_ROLL, COMBO)

Next, set up constants for each of the ingredients and the recipes for each order. (This tutorial saves you from having to look up the recipe information in the game yourself.)

# Ingredient constants (don't change these: the image filenames depend on these specific values)
SHRIMP = 'shrimp'
RICE = 'rice'
NORI = 'nori'
ROE = 'roe'
SALMON = 'salmon'
UNAGI = 'unagi'
RECIPE = {ONIGIRI:         {RICE: 2, NORI: 1},
          CALIFORNIA_ROLL: {RICE: 1, NORI: 1, ROE: 1},
          GUNKAN_MAKI:     {RICE: 1, NORI: 1, ROE: 2},
          SALMON_ROLL:     {RICE: 1, NORI: 1, SALMON: 2},
          SHRIMP_SUSHI:    {RICE: 1, NORI: 1, SHRIMP: 2},
          UNAGI_ROLL:      {RICE: 1, NORI: 1, UNAGI: 2},
          DRAGON_ROLL:     {RICE: 2, NORI: 1, ROE: 1, UNAGI: 2},
          COMBO:           {RICE: 2, NORI: 1, ROE: 1, SALMON: 1, UNAGI: 1, SHRIMP: 1},}

Next are several constants and global variables used in the program. Although using global variables is generally a bad idea for making maintainable programs, this Python script is just a one-off and I didn't want to make it complicated for newer programmers who don't yet know object-oriented programming concepts. The variables are described in the comments:

LEVEL_WIN_MESSAGE = 'win' # checkForGameOver() returns this value if the level has been won

# Settings
MIN_INGREDIENTS = 4 # if an ingredient gets below this value, order more
PLATE_CLEARING_FREQ = 8 # plates are cleared every this number of seconds, roughly
NORMAL_RESTOCK_TIME = 7 # the number of seconds it takes to restock inventory after ordering it (at normal speed, not express)
TIME_TO_REMAKE = 30 # if an order goes unfilled for this number of seconds, remake it

# Global variables
LEVEL = 1 # current level being played
INVENTORY = {SHRIMP: 5, RICE: 10,
             NORI: 10,  ROE: 10,
             SALMON: 5, UNAGI: 5}
GAME_REGION = () # (left, top, width, height) values coordinates of the game window
ORDERING_COMPLETE = {SHRIMP: None, RICE: None, NORI: None, ROE: None, SALMON: None, UNAGI: None} # unix timestamp when an ordered ingredient will have arrived
ROLLING_COMPLETE = 0 # unix timestamp of when the rolling of the mat will have completed
LAST_PLATE_CLEARING = 0 # unix timestamp of the last time the plates were cleared
LAST_GAME_OVER_CHECK = 0 # unix timestamp when we last checked for the Game Over or You Win messages

Note: a "unix timestamp" is the number of seconds since midnight, January 1, 1970. It is returned by the time.time() function. For example, if time.time() is called at 10:36:54.392 PM on December 16, 2014, it will return the value 1418798214.392. Ten seconds later it will return the value 1418798224.392. Subtracting these values is how the bot can measure how much time has passed.

And finally set up some global variables which will store coordinates for various things in the game window. The GAME_REGION variable stores the position of the Sushi Go Round game itself. Once that value has been acquired, the setupCoordinates() function (described later) will populate the values in these variables:

# various coordinates of objects in the game
GAME_REGION = () # (left, top, width, height) values coordinates of the entire game window
INGRED_COORDS = None
PHONE_COORDS = None
TOPPING_COORDS = None
ORDER_BUTTON_COORDS = None
RICE1_COORDS = None
RICE2_COORDS = None
NORMAL_DELIVERY_BUTTON_COORDS = None
MAT_COORDS = None

Step 4 - The main() Function

The main() function ties together the code that runs from the very start of the bot program. It assumes that the Sushi Go Round game is visible on the screen and at the opening start screen (which has the flashing "PLAY" button).

Because PyAutoGUI takes control of the mouse, it can be hard to get the program's terminal window back in focus to shut it down. PyAutoGUI has a fail-safe feature; if you move the mouse cursor to the top-left corner of the screen PyAutoGUI will raise an exception and terminate the program.
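The fail-safe is controlled by the pyautogui.FAILSAFE attribute; a minimal sketch of it (illustrative, not part of the bot's code):

import pyautogui

pyautogui.FAILSAFE = True   # the default; with this enabled, moving the mouse to the screen's
                            # top-left corner makes PyAutoGUI calls raise pyautogui.FailSafeException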

def main():
    """Runs the entire program. The Sushi Go Round game must be visible on the screen and the PLAY button visible."""
    logging.debug('Program Started. Press Ctrl-C to abort at any time.')
    logging.debug('To interrupt mouse movement, move mouse to upper left corner.')
    getGameRegion()
    navigateStartGameMenu()
    setupCoordinates()
    startServing()

Step 5 - Image Recognition and Finding the Game Screen

Computer vision and optical character recognition (OCR) are major topics of computer science. The good thing about Sushi Go Round is that you don't need to know any of that because the game uses static 2D sprites. A california roll in the game will always look the same down to the pixel. So we can use PyAutoGUI's pixel-recognition functions to "see" where on the screen different objects are.

The pyautogui.locateOnScreen() function takes a string filename argument and returns a (left, top, width, height) tuple of integer coordinates for where the image is found on the screen. (The "left" value is the X-coordinate of the left edge, the "top" value is the Y-coordinate of the top edge. With the XY coordinates of the top-left corner and the width and height, you can completely describe the region.)

If the image isn't found on the screen, the function returns None. An optional region keyword argument can specify a smaller region of the screen to search (a smaller region means faster detection). Without a region specified, the entire screen is searched.

The location of the game window itself will be stored later in the GAME_REGION variable, which holds a (left, top, width, height) tuple of integers. (The (left, top, width, height) tuples are used throughout PyAutoGUI.) These values are visualized in the following graphic:

To find the game window, we search for the top-right corner image, which is stored in top_right_corner.png. (These image files are included in the zip file download.) Once this image is located on the screen, the GAME_REGION variable can be populated.

def imPath(filename):
    """A shortcut for joining the 'images/'' file path, since it is used so often. Returns the filename with 'images/' prepended."""
    return os.path.join('images', filename)


def getGameRegion():
    """Obtains the region that the Sushi Go Round game is on the screen and assigns it to GAME_REGION. The game must be at the start screen (where the PLAY button is visible)."""
    global GAME_REGION

    # identify the top-right corner of the game window
    logging.debug('Finding game region...')
    region = pyautogui.locateOnScreen(imPath('top_right_corner.png'))
    if region is None:
        raise Exception('Could not find game on screen. Is the game visible?')

    # calculate the region of the entire game
    topRightX = region[0] + region[2] # left + width
    topRightY = region[1] # top
    GAME_REGION = (topRightX - 640, topRightY, 640, 480) # the game screen is always 640 x 480
    logging.debug('Game region found: %s' % (GAME_REGION,))

Note that running PyAutoGUI's image recognition takes a relatively long time to execute (several hundred milliseconds). This can be mitigated by searching a smaller region instead of the entire screen. And since some objects will always appear in the same place in the game window, you can rely on static integer coordinates instead of image recognition to find them.

Step 6 - Dealing with Coordinates

Many of the buttons in the game's user interface will always be in the same location. The coordinates for these buttons can be pre-programmed ahead of time. You could take a screenshot, paste the image into a graphics program such as Photoshop or MS Paint, and find all of the XY coordinates yourself. But this tutorial has done this step for you. The other coordinate global variables are populated in the setupCoordinates() function. Note that they are all relative to the position of the game window (whose coordinates are in GAME_REGION).

For example, the coordinates for the Salmon ordering button are set to (GAME_REGION[0] + 496, GAME_REGION[1] + 329). This is visualized in the following graphic:

def setupCoordinates():
    """Sets several of the coordinate-related global variables, after acquiring the value for GAME_REGION."""
    global INGRED_COORDS, PHONE_COORDS, TOPPING_COORDS, ORDER_BUTTON_COORDS, RICE1_COORDS, RICE2_COORDS, NORMAL_DELIVERY_BUTTON_COORDS, MAT_COORDS, LEVEL
    INGRED_COORDS = {SHRIMP: (GAME_REGION[0] + 40, GAME_REGION[1] + 335),
                     RICE:   (GAME_REGION[0] + 95, GAME_REGION[1] + 335),
                     NORI:   (GAME_REGION[0] + 40, GAME_REGION[1] + 385),
                     ROE:    (GAME_REGION[0] + 95, GAME_REGION[1] + 385),
                     SALMON: (GAME_REGION[0] + 40, GAME_REGION[1] + 425),
                     UNAGI:  (GAME_REGION[0] + 95, GAME_REGION[1] + 425),}
    PHONE_COORDS = (GAME_REGION[0] + 560, GAME_REGION[1] + 360)
    TOPPING_COORDS = (GAME_REGION[0] + 513, GAME_REGION[1] + 269)
    ORDER_BUTTON_COORDS = {SHRIMP: (GAME_REGION[0] + 496, GAME_REGION[1] + 222),
                           UNAGI:  (GAME_REGION[0] + 578, GAME_REGION[1] + 222),
                           NORI:   (GAME_REGION[0] + 496, GAME_REGION[1] + 281),
                           ROE:    (GAME_REGION[0] + 578, GAME_REGION[1] + 281),
                           SALMON: (GAME_REGION[0] + 496, GAME_REGION[1] + 329),}
    RICE1_COORDS = (GAME_REGION[0] + 543, GAME_REGION[1] + 294)
    RICE2_COORDS = (GAME_REGION[0] + 545, GAME_REGION[1] + 269)

    NORMAL_DELIVERY_BUTTON_COORDS = (GAME_REGION[0] + 495, GAME_REGION[1] + 293)

    MAT_COORDS = (GAME_REGION[0] + 190, GAME_REGION[1] + 375)

    LEVEL = 1

This tutorial does the tedious coordinate-finding work for you, but if you ever need to do it yourself you can use the pyautogui.displayMousePosition() function to display the XY coordinates of the mouse cursor. The coordinates update as you move the mouse, so you can move the mouse to the desired location on the screen, then view its current coordinates. The displayMousePosition() function also shows that pixel's RGB color value.

You can also pass X and Y offsets to the function to set the (0, 0) origin somewhere besides the top-left corner of the screen. When you are done, press Ctrl-C to return from the function.
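For example, from the interactive shell (illustrative only):

>>> import pyautogui
>>> pyautogui.displayMousePosition()   # prints the live XY position and pixel RGB until you press Ctrl-C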

Step 7 - Controlling the Mouse

PyAutoGUI has a pyautogui.click() function that can be passed an (x, y) tuple argument of the screen coordinates to click on. Often you can use the return value of pyautogui.locateCenterOnScreen() for this argument. The click is done immediately, but the optional duration keyword argument will specify the number of seconds PyAutoGUI spends moving to the (x, y) coordinate. A small delay makes it easier to visually follow along with the bot's clicking.

Since the SKIP button in the game is flashing, the pyautogui.locateCenterOnScreen(imPath('skip_button.png'), region=GAME_REGION) call might not be able to find it if it is currently displayed in a different color than the one captured in skip_button.png. This is why there is a while loop that keeps searching until it finds it. Remember, locateCenterOnScreen() will return None if it can't find the image on the screen.

def navigateStartGameMenu():
    """Performs the clicks to navigate form the start screen (where the PLAY button is visible) to the beginning of the first level."""
    # Click on everything needed to get past the menus at the start of the game.

    # click on Play
    logging.debug('Looking for Play button...')
    while True: # loop because it could be the blue or pink Play button displayed at the moment.
        pos = pyautogui.locateCenterOnScreen(imPath('play_button.png'), region=GAME_REGION)
        if pos is not None:
            break
    pyautogui.click(pos, duration=0.25)
    logging.debug('Clicked on Play button.')

    # click on Continue
    pos = pyautogui.locateCenterOnScreen(imPath('continue_button.png'), region=GAME_REGION)
    pyautogui.click(pos, duration=0.25)
    logging.debug('Clicked on Continue button.')

    # click on Skip
    logging.debug('Looking for Skip button...')
    while True: # loop because it could be the yellow or red Skip button displayed at the moment.
        pos = pyautogui.locateCenterOnScreen(imPath('skip_button.png'), region=GAME_REGION)
        if pos is not None:
            break
    pyautogui.click(pos, duration=0.25)
    logging.debug('Clicked on Skip button.')

    # click on Continue
    pos = pyautogui.locateCenterOnScreen(imPath('continue_button.png'), region=GAME_REGION)
    pyautogui.click(pos, duration=0.25)
    logging.debug('Clicked on Continue button.')

Once the buttons at the start of the game have been navigated past, the main part of the bot code can begin.

Step 8 - The startServing() Function

The startServing() function handles all of the main game play. This includes identifying customer orders, making the dishes, clearing finished plates, ordering more ingredients, and checking for the end of the level.

Much of this functionality is passed on to other functions, which are explained later. The first part of startServing() sets up all the variables for a new game:

def startServing():
    """The main game playing function. This function handles all aspects of game play, including identifying orders, making orders, buying ingredients and other features."""
    global LAST_GAME_OVER_CHECK, INVENTORY, ORDERING_COMPLETE, LEVEL

    # Reset all game state variables.
    oldOrders = {}
    backOrders = {}
    remakeOrders = {}
    remakeTimes = {}
    LAST_GAME_OVER_CHECK = time.time()
    ORDERING_COMPLETE = {SHRIMP: None, RICE: None, NORI: None,
                         ROE: None, SALMON: None, UNAGI: None}

The next part scans the region where the customers' orders are displayed.

This is handled by the getOrders() function. Since you'll be repeatedly scanning for the orders but don't want to remake orders each time you scan them, you need to find out which new orders have appeared since the last scan, and which orders have disappeared since the last scan (either because the customer is eating or has left). This is handled by the getOrdersDifference() function.

The keys in these "orders dictionaries" will be the (left, top, width, height) tuple of where the order image was identified. This is just to keep the customers distinct. The values will be the strings of one of the order constants like CALIFORNIA_ROLL or ONIGIRI. The remakeTimes dictionary has similar keys, but its values are unix timestamps (returned from time.time()) for when the dish should be remade if it is still being requested by the customer.
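A hypothetical example of what such dictionaries might hold (the coordinate and timestamp values here are invented for illustration):

currentOrders = {(102, 56, 24, 20): CALIFORNIA_ROLL,   # (left, top, width, height) where the order image was found
                 (305, 54, 24, 20): ONIGIRI}
remakeTimes = {(102, 56, 24, 20): 1418798244.392}      # unix timestamp when this order should be remade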

    while True:
        # Check for orders, see which are new and which are gone since last time.
        currentOrders = getOrders()
        added, removed = getOrdersDifference(currentOrders, oldOrders)
        if added != {}:
            logging.debug('New orders: %s' % (list(added.values())))
            for k in added:
                remakeTimes[k] = time.time() + TIME_TO_REMAKE
        if removed != {}:
            logging.debug('Removed orders: %s' % (list(removed.values())))
            for k in removed:
                del remakeTimes[k]

The next part goes through the remakeTimes dictionary and checks if any of those timestamps are before the current time (as returned by time.time()). In that case, it adds the order to the remakeOrders dictionary (which is similar to, but separate from, the added dictionary of new orders).

        # Check if the remake times have past, and add those to the remakeOrders dictionary.
        for k, remakeTime in copy.copy(remakeTimes).items():
            if time.time() > remakeTime:
                remakeTimes[k] = time.time() + TIME_TO_REMAKE # reset remake time
                remakeOrders[k] = currentOrders[k]
                logging.debug('%s added to remake orders.' % (currentOrders[k]))

Next, the program loops through all the newly requested orders in the added dictionary and attempts to make them. The makeOrder() function (described later) will return None if the order was successfully made. Otherwise, it will return a string of the ingredient it doesn't have enough of. In that case, orderIngredient() (described later) is called and the order is placed on the backOrders dictionary.

        # Attempt to make the order.
        for pos, order in added.items():
            result = makeOrder(order)
            if result is not None:
                orderIngredient(result)
                backOrders[pos] = order
                logging.debug('Ingredients for %s not available. Putting on back order.' % (order))

When a customer picks up their meal, they'll spend a few seconds eating and then leave behind a dirty dish. Click this dirty dish so that a new customer will take that seat. Instead of using image recognition (which is slow), the bot can just occasionally click on all six plates. There's no penalty to clicking on the plate when the customer is eating or if there is no plate at that seat.

Since this doesn't need to be done frequently, the if random.randint(1, 10) == 1 or time.time() - PLATE_CLEARING_FREQ > LAST_PLATE_CLEARING: statement will only do this roughly once every 10 iterations, or if more than 8 seconds (the value in PLATE_CLEARING_FREQ) have passed since the last plate clearing (according to the timestamp in LAST_PLATE_CLEARING).

The actual clicking is done by the clickOnPlates() function, described later.

        # Clear any finished plates.
        if random.randint(1, 10) == 1 or time.time() - PLATE_CLEARING_FREQ > LAST_PLATE_CLEARING:
            clickOnPlates()

It's easy to keep track of ingredient quantities as you make dishes. Each level starts with a set quantity of ingredients, encoded in the INVENTORY = {SHRIMP: 5, RICE: 10, NORI: 10, ROE: 10, SALMON: 5, UNAGI: 5} line. You can subtract values in the INVENTORY dictionary as you use ingredients.

When you order ingredients, they don't arrive immediately. Using image recognition on the inventory numbers in the lower left corner of the game window would require grabbing tons of screenshots and slow down the bot. Instead, since orders take less than 7 seconds to arrive (which is why NORMAL_RESTOCK_TIME is set to 7), the ORDERING_COMPLETE dictionary can store unix timestamps 7 seconds in the future, at which point INVENTORY can be updated.

The values in ORDERING_COMPLETE are None if they are not currently being ordered and delivered.

The next part of the code checks whether the current time (as returned by time.time()) has passed those timestamps. If so, the INVENTORY dictionary's values are updated.

        # Check if ingredient orders have arrived.
        updateInventory()

        # Go through and see if any back orders can be filled.
        for pos, order in copy.copy(backOrders).items():
            result = makeOrder(order)
            if result is None:
                del backOrders[pos] # remove from back orders
                logging.debug('Filled back order for %s.' % (order))

The remakeOrders dictionary contains orders that (for whatever reason) didn't make it to the customer. This code is similar to the previous dish-making code:

        # Go through and see if any remake orders can be filled.
        for pos, order in copy.copy(remakeOrders).items():
            if pos not in currentOrders:
                del remakeOrders[pos]
                logging.debug('Canceled remake order for %s.' % (order))
                continue
            result = makeOrder(order)
            if result is None:
                del remakeOrders[pos] # remove from remake orders
                logging.debug('Filled remake order for %s.' % (order))

The next part checks the INVENTORY dictionary to see if there are fewer than 4 (the value stored in the MIN_INGREDIENTS constant) of any ingredient. If so, more is ordered by calling the orderIngredient() function (described later). This isn't something that needs to be checked frequently, so it is enclosed in an if statement that runs 1 in 5 times:

        if random.randint(1, 5) == 1:
            # order any ingredients that are below the minimum amount
            for ingredient, amount in INVENTORY.items():
                if amount < MIN_INGREDIENTS:
                    orderIngredient(ingredient)

The next part checks for the "You Win" or "You Fail" message. Since this check doesn't need to happen frequently at all, the if time.time() - 12 > LAST_GAME_OVER_CHECK: statement only runs it once every 12 seconds. LAST_GAME_OVER_CHECK is updated with the current time from time.time() every time the check is performed. The checkForGameOver() function (described later) terminates the program if the player has lost, or returns the string in the LEVEL_WIN_MESSAGE constant if the level has been beaten.

If the level has been beaten, the code resets many of the variables:

        # check for the "You Win" or "You Fail" messages
        if time.time() - 12 > LAST_GAME_OVER_CHECK:
            result = checkForGameOver()
            if result == LEVEL_WIN_MESSAGE:
                # player has completed the level

                # Reset inventory and orders.
                INVENTORY = {SHRIMP: 5, RICE: 10,
                             NORI: 10, ROE: 10,
                             SALMON: 5, UNAGI: 5}
                ORDERING_COMPLETE = {SHRIMP: None, RICE: None,
                                     NORI: None, ROE: None,
                                     SALMON: None, UNAGI: None}
                backOrders = {}
                remakeOrders = {}
                currentOrders = {}
                oldOrders = {}

Also, the bot will let the user view the end-level stats for 5 seconds before clicking Continue to move on to the next level.

                logging.debug('Level %s complete.' % (LEVEL))
                LEVEL += 1
                time.sleep(5) # give another 5 seconds to tally score

                # Click buttons to continue to next level.
                pos = pyautogui.locateCenterOnScreen(imPath('continue_button.png'), region=GAME_REGION)
                pyautogui.click(pos, duration=0.25)
                logging.debug('Clicked on Continue button.')
                pos = pyautogui.locateCenterOnScreen(imPath('continue_button.png'), region=GAME_REGION)

For every level except for the last, there will be a second Continue button to click on. On the last level, the bot doesn't do this so the user can view the game ending.

                if LEVEL <= 7: # click the second continue if the game isn't finished.
                    pyautogui.click(pos, duration=0.25)
                    logging.debug('Clicked on Continue button.')

        oldOrders = currentOrders

Step 9 - Clearing the Plates

The clickOnPlates() function blindly clicks on all six places where there could be a dirty plate to clean up. This works because there is no penalty for clicking when there isn't a dirty plate, freeing the bot from having to use performance-costly image recognition to first identify the dirty plate.

The LAST_PLATE_CLEARING global variable contains the timestamp of the last time the plates were cleared. This is used to determine if a long time has passed since the plates have been cleared.

def clickOnPlates():
    """Clicks the mouse on the six places where finished plates will be flashing. This function does not check for flashing plates, but simply clicks on all six places.

    Sets LAST_PLATE_CLEARING to the current time."""
    global LAST_PLATE_CLEARING

    # just blindly click on all the places where a plate should be
    for i in range(6):
        pyautogui.click(83 + GAME_REGION[0] + (i * 101), GAME_REGION[1] + 203)
    LAST_PLATE_CLEARING = time.time()

Step 10 - Getting the Orders from Customers

Customer orders appear as dish images in word bubbles above their heads. PyAutoGUI's image recognition can be used to scan this region of the game screen and match them to the images in california_roll_order.png, gunkan_maki_order.png, onigiri_order.png, and so on:

The getOrders() function finds all the current orders being requested. Figuring out which of these orders have been seen before (and possibly already fulfilled) is done by the getOrdersDifference() function, explained in the next section.

def getOrders():
    """Scans the screen for orders being made. Returns a dictionary with a (left, top, width, height) tuple of integers for keys and the order constant for a value.

    The order constants are ONIGIRI, GUNKAN_MAKI, CALIFORNIA_ROLL, SALMON_ROLL, SHRIMP_SUSHI, UNAGI_ROLL, DRAGON_ROLL, COMBO."""
    orders = {}
    for orderType in (ALL_ORDER_TYPES):
        allOrders = pyautogui.locateAllOnScreen(imPath('%s_order.png' % orderType), region=(GAME_REGION[0] + 32, GAME_REGION[1] + 46, 558, 44))
        for order in allOrders:
            orders[order] = orderType
    return orders

The newOrders and oldOrders parameters have values that were returned from different calls to getOrders(). The orders that are new to newOrders are returned in the added dictionary, and the orders that have since been removed are returned in the removed dictionary.

def getOrdersDifference(newOrders, oldOrders):
    """Finds the differences between the orders dictionaries passed. Return value is a tuple of two dictionaries.

    The first dictionary is the "added" dictionary of orders added to newOrders since oldOrders. The second dictionary is the "removed" dictionary of orders in oldOrders but removed in newOrders.

    Each dictionary has (left, top, width, height) for keys and an order constant for a value."""
    added = {}
    removed = {}

    # find all orders in newOrders that are new and not found in oldOrders
    for k in newOrders:
        if k not in oldOrders:
            added[k] = newOrders[k]
    # find all orders in oldOrders that were removed and not found in newOrders
    for k in oldOrders:
        if k not in newOrders:
            removed[k] = oldOrders[k]

    return added, removed

Step 11 - Making the Orders

Creating a dish involves clicking on the correct ingredients and then clicking on the rolling mat. The dish is placed on a conveyor belt and eventually makes its way to a customer. The first part of the function is just the docstring and a few global statements.

def makeOrder(orderType):
    """Does the mouse clicks needed to create an order.

    The orderType parameter has the value of one of the ONIGIRI, GUNKAN_MAKI, CALIFORNIA_ROLL, SALMON_ROLL, SHRIMP_SUSHI, UNAGI_ROLL, DRAGON_ROLL, COMBO constants.

    The INVENTORY global variable is updated in this function for orders made.

    The return value is None for a successfully made order, or the string of an ingredient constant if that needed ingredient is missing."""
    global ROLLING_COMPLETE, INGRED_COORDS, INVENTORY

The next part of the code ensures that the bot doesn't try to make a dish when it's not possible. One of the reasons is that the mat is still in the process of rolling the previous dish. Also, a dish will stay on the mat after being rolled if there isn't space on the conveyor belt. (This happens when orders aren't picked up by a customer.)

This is an important check: if the bot starts clicking on ingredients when the rolling mat is occupied, then the bot's INVENTORY dictionary will contain an inaccurate count of ingredients, leading to other mistakes and eventually losing the game.

The mat takes about 1.5 seconds to complete, which is why the ROLLING_COMPLETE = time.time() + 1.5 is run after clicking the mat to begin rolling it. This is a simple check that doesn't take long to run.

But if the conveyor belt is occupied, then the previous order could still be on the mat waiting for a clear space on the belt. After performing the cheap ROLLING_COMPLETE check, the bot then uses pyautogui.locateOnScreen() to see if the mat is clear. This guarantees that when the bot starts clicking on ingredients they will be moved to the mat.

    # wait until the mat is clear. The previous order could still be there if the conveyor belt has been full or the mat is currently rolling.
    while time.time() < ROLLING_COMPLETE and pyautogui.locateOnScreen(imPath('clear_mat.png'), region=(GAME_REGION[0] + 115, GAME_REGION[1] + 295, 220, 175)) is None:
        time.sleep(0.1)

But before the bot begins making a dish, it consults the INVENTORY dictionary to make sure it has enough of each of the ingredients in the dish's recipe. The recipes for each dish are held in the RECIPE constant.

If there isn't enough of an ingredient, the string in the ingredient's constant is returned. This information will be used to place an order for more of that ingredient.

    # check that all ingredients are available in the inventory.
    for ingredient, amount in RECIPE[orderType].items():
        if INVENTORY[ingredient] < amount:
            logging.debug('More %s is needed to make %s.' % (ingredient, orderType))
            return ingredient

Once it has been confirmed there is enough of each ingredient, calls to pyautogui.click() cause the bot to click on the appropriate ingredients, then click on the mat to roll it. The ROLLING_COMPLETE global variable is updated with the timestamp of when the mat rolling will be finished (1.5 seconds in the future) so the next dish can be prepared.

Just before the rolling mat is clicked though, the bot will check the conveyor belt for any excess dishes that weren't picked up by customers. While this is a waste of ingredients, these excess dishes tend to pile up (especially in the later levels), and slow down the bot from serving dishes to the point that it could lose the game. The call to findAndClickPlatesOnBelt() (described next) gets rid of any excess dishes that happen to be on the conveyor belt.

    # click on each of the ingredients
    for ingredient, amount in RECIPE[orderType].items():
        for i in range(amount):
            pyautogui.click(INGRED_COORDS[ingredient], duration=0.25)
            INVENTORY[ingredient] -= 1
    findAndClickPlatesOnBelt() # get rid of any left over meals on the conveyor belt, which may stall this meal from being loaded on the belt
    pyautogui.click(MAT_COORDS, duration=0.25) # click the rolling mat to make the order
    logging.debug('Made a %s order.' % (orderType))
    ROLLING_COMPLETE = time.time() + 1.5 # give the mat enough time (1.5 seconds) to finish rolling before being used again

Sometimes other customers will swoop in and grab dishes that were intended for someone else. To prevent those customers from leaving, the remakeTimes dictionary keeps track of when orders will need to be remade. However, if the customer leaves before picking up this remade dish, then it will circle endlessly on the conveyor belt. Eventually these excess dishes can pile up and stall new dishes from being added. The findAndClickPlatesOnBelt() function will scan the conveyor belt next to the rolling mat for any dishes and click on them to remove them.

It does this by identifying the pink, blue, or red coloring of the plates. These colors are stored in the pink_plate_color.png, blue_plate_color.png, red_plate_color.png images.

def findAndClickPlatesOnBelt():
    """Find any plates on the conveyor belt that can be removed and click on them to remove them. This will get rid of excess orders."""
    for color in ('pink', 'blue', 'red'):
        result = pyautogui.locateCenterOnScreen(imPath('%s_plate_color.png' % (color)), region=(GAME_REGION[0] + 343, GAME_REGION[1] + 300, 50, 80))
        if result is not None:
            pyautogui.click(result)
            logging.debug('Clicked on %s plate on belt at X: %s Y: %s' % (color, result[0], result[1]))

Step 12 - Ordering More Ingredients

When the bot needs more ingredients, it must navigate the menu buttons on the phone in the lower right corner of the game window. This tutorial has already figured out these coordinates for you, but you could use the pyautogui.displayMousePosition() function to find coordinates for any pixel on the screen.

The first step is to click on the phone:

def orderIngredient(ingredient):
    """Do the clicks to purchase an ingredient. If successful, the ORDERING_COMPLETE dictionary is updated for when the ingredients will arive and INVENTORY can be updated. (This is handled in the updateInventory() function.)"""
    logging.debug('Ordering more %s (inventory says %s left)...' % (ingredient, INVENTORY[ingredient]))
    pyautogui.click(PHONE_COORDS, duration=0.25)

The next part of the code handles pressing the buttons to order rice. The timestamp in ORDERING_COMPLETE is checked to make sure a previous rice order hasn't been made. Then the screen is scanned for the cant_afford_rice.png image. The rice ordering button takes on a dimmed appearance if the player does not have enough money to order rice. In this case, the function returns.

    if ingredient == RICE and ORDERING_COMPLETE[RICE] is None:
        # Order rice.
        pyautogui.click(RICE1_COORDS, duration=0.25)

        # Check if we can't afford the rice
        if pyautogui.locateOnScreen(imPath('cant_afford_rice.png'), region=(GAME_REGION[0] + 498, GAME_REGION[1] + 242, 90, 75)):
            logging.debug("Can't afford rice. Canceling.")
            pyautogui.click(GAME_REGION[0] + 585, GAME_REGION[1] + 335, duration=0.25) # click cancel phone button
            return

Otherwise, the bot will click on the rice button. The code doesn't update INVENTORY quite yet, as it will take time for the rice to be delivered. (For simplicity, the bot will never use the Express Delivery option, though you could certainly build in the intelligence to do that.) The expected delivery time is set in the ORDERING_COMPLETE dictionary.

The INVENTORY will be updated with the new quantity in the updateInventory() function.

        # Purchase the rice
        pyautogui.click(RICE2_COORDS, duration=0.25)
        pyautogui.click(NORMAL_DELIVERY_BUTTON_COORDS, duration=0.25)
        ORDERING_COMPLETE[RICE] = time.time() + NORMAL_RESTOCK_TIME
        logging.debug('Ordered more %s' % (RICE))
        return

The next part does something similar except it navigates the phone menu for the non-rice ingredients. The same general logic is used; if an ingredient isn't affordable, the dimmed button image will inform the bot that it can't be purchased.

    elif ORDERING_COMPLETE[ingredient] is None:
        # Order non-rice ingredient.
        pyautogui.click(TOPPING_COORDS, duration=0.25)

        # Check if we can't afford the ingredient
        if pyautogui.locateOnScreen(imPath('cant_afford_%s.png' % (ingredient)), region=(GAME_REGION[0] + 446, GAME_REGION[1] + 187, 180, 180)):
            logging.debug("Can't afford %s. Canceling." % (ingredient))
            pyautogui.click(GAME_REGION[0] + 597, GAME_REGION[1] + 337, duration=0.25) # click cancel phone button
            return

Note that the bot doesn't have to keep track of how much money it has. It only needs to know if it can afford an ingredient or not. For simplicity, the "normal delivery" option is always used instead of the costly "express delivery" option:

        # Order the ingredient
        pyautogui.click(ORDER_BUTTON_COORDS[ingredient], duration=0.25)
        pyautogui.click(NORMAL_DELIVERY_BUTTON_COORDS, duration=0.25)
        ORDERING_COMPLETE[ingredient] = time.time() + NORMAL_RESTOCK_TIME
        logging.debug('Ordered more %s' % (ingredient))
        return

If the ingredient has already been ordered (its timestamp in ORDERING_COMPLETE is set but the delivery hasn't arrived yet), the final part closes the phone menu by clicking on the coordinates for the "hang up" button:

    # The ingredient has already been ordered, so close the phone menu.
    pyautogui.click(GAME_REGION[0] + 589, GAME_REGION[1] + 341) # click cancel phone button
    logging.debug('Already ordered %s.' % (ingredient))

The updateInventory() function is frequently called to check if the timestamps in ORDERING_COMPLETE indicate that the ordered ingredients have arrived. In that case, it updates INVENTORY with the new values. Shrimp, unagi, and salmon deliveries always add 5. Nori, roe, and rice deliveries always add 10.

def updateInventory():
    """Check if any ordered ingredients have arrived by looking at the timestamps in ORDERING_COMPLETE.
    Update INVENTORY global variable with the new quantities."""
    for ingredient in INVENTORY:
        if ORDERING_COMPLETE[ingredient] is not None and time.time() > ORDERING_COMPLETE[ingredient]:
            ORDERING_COMPLETE[ingredient] = None
            if ingredient in (SHRIMP, UNAGI, SALMON):
                INVENTORY[ingredient] += 5
            elif ingredient in (NORI, ROE, RICE):
                INVENTORY[ingredient] += 10
            logging.debug('Updated inventory with added %s:' % (ingredient))
            logging.debug(INVENTORY)

Step 13 - Checking for the End of the Level

The bot can tell when the level is over by doing image recognition for the you_win.png or you_failed.png images. If the level has been failed, the program terminates by calling sys.exit(). If the level has been beaten, the bot clicks on the "You Win" window so that the stats are tallied immediately, and returns the string in LEVEL_WIN_MESSAGE.

def checkForGameOver():
    """Checks the screen for the "You Win" or "You Fail" message.

    On winning, returns the string in LEVEL_WIN_MESSAGE.

    On losing, the program terminates."""

    # check for "You Win" message
    result = pyautogui.locateOnScreen(imPath('you_win.png'), region=(GAME_REGION[0] + 188, GAME_REGION[1] + 94, 262, 60))
    if result is not None:
        pyautogui.click(pyautogui.center(result))
        return LEVEL_WIN_MESSAGE

    # check for "You Fail" message
    result = pyautogui.locateOnScreen(imPath('you_failed.png'), region=(GAME_REGION[0] + 167, GAME_REGION[1] + 133, 314, 39))
    if result is not None:
        logging.debug('Game over. Quitting.')
        sys.exit()

Finally, the main() function is called if this script is being run (as opposed to imported as a module):

if __name__ == '__main__':
    main()

That's it!

If you download and run the bot, it doesn't have much trouble beating all 7 levels of Sushi Go Round. But as you can tell from the leaderboard, it doesn't compare to the best human players. Feel free to modify the code to try to improve the bot's performance.

Sushi Go Round is a good candidate for creating a bot since its 2D sprite graphics are static and easy to identify from screenshots. If you find any other similar games, please list them in the comments below! I've found that games in the "Skill" genre on Newgrounds, along with "puzzle", "gathering", and "rhythm" games, are usually bot-friendly since their images are easier to identify. (There's a list of Flash game genres here.) Look for ones that have retro pixel graphics. In particular, these games seem like good candidates for making bots:

And some trickier games, but still possible to make bots for:

Update: roddds on Reddit posted a link to his own Sushi Go Round bot he had made previously. He also has code and a video for a Diamond Dash-playing bot.

December 18, 2014 02:01 AM


Vasudev Ram

Tortilla, a Python API wrapper

By Vasudev Ram



tortilla is a Python library for wrapping APIs. Its headline says "Wrapping web APIs made easy."

It can be installed with:
pip install tortilla
I tried it out, and slightly modified an example given in its documentation, to give this:
# test_tortilla.py
import tortilla
github = tortilla.wrap('https://api.github.com')
user = github.users.get('redodo')
for key in user:
    print key, ":", user[key]
That code uses the Github API (wrapped by tortilla) to get the information for user redodo, who is the creator of tortilla.
Here is the output of running:
python test_tortilla.py
bio : None
site_admin : False
updated_at : 2014-12-17T16:39:55Z
gravatar_id :
hireable : True
id : 2227416
followers_url : https://api.github.com/users/redodo/followers
following_url : https://api.github.com/users/redodo/following{/other_user}
blog :
followers : 6
location : Kingdom of the Netherlands
type : User
email : dodo@gododo.co
public_repos : 9
events_url : https://api.github.com/users/redodo/events{/privacy}
company :
gists_url : https://api.github.com/users/redodo/gists{/gist_id}
html_url : https://github.com/redodo
subscriptions_url : https://api.github.com/users/redodo/subscriptions
received_events_url : https://api.github.com/users/redodo/received_events
starred_url : https://api.github.com/users/redodo/starred{/owner}{/repo}
public_gists : 0
name : Hidde Bultsma
organizations_url : https://api.github.com/users/redodo/orgs
url : https://api.github.com/users/redodo
created_at : 2012-08-27T13:03:15Z
avatar_url : https://avatars.githubusercontent.com/u/2227416?v=3
repos_url : https://api.github.com/users/redodo/repos
following : 2
login : redodo
Adding:
print type(user)
to the end of test_tortilla.py shows that the user object is of type bunch.Bunch.

Bunch is a Python module providing "a dictionary that supports attribute-style access, a la JavaScript."
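
That means the same values can also be reached as attributes; for example, continuing from the script above:

# Attribute-style access, equivalent to user['login'] and user['followers']:
print user.login
print user.followers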

Did you know that tortillas are roughly similar to rotis?

- Vasudev Ram - Dancing Bison Enterprises

Signup for news about new products from me.

Contact Page

December 18, 2014 12:10 AM

December 17, 2014


Python(x,y) News

Python(x, y) 2.7.9.0 Released!

Hi All,

I'm happy to announce that Python(x, y) 2.7.9.0 is available for immediate download from any of the mirrors. The full change log can be viewed here. Please post your comments and suggestions to the mailing list.


Work on the 64 bit & 3.x versions progresses slowly but surely. They will happen eventually :)

What's new in general:

New noteworthy packages:

Have fun!

-Gabi Davar

December 17, 2014 10:27 PM


Omaha Python Users Group

December 17 Meeting Details

Meeting is same location as last week:

Location - Alley Poyner Macchietto Architecture Office in the Tip Top Building at 1516 Cuming Street.

Meeting starts at 7pm, Wednesday, 12/17/2014

Parking and entry details:

Steve will be at the door from 6:45 til 7:05, or you can email the list to let him know you are at the door.
1516 Cuming Street
NE Corner of 16th and Cuming.
The red area on the map is reserved for Alley Poyner and you can park in those spots if the front parking lot is full.
[Image: APMA parking map]

December 17, 2014 09:45 PM


A. Jesse Jiryu Davis

It Seemed Like A Good Idea At The Time: MongoReplicaSetClient


The road to hell is paved with good intentions.

I'm writing post mortems for four regrettable decisions in PyMongo, the standard Python driver for MongoDB. Each of these decisions made life painful for Bernie Hackett and me—PyMongo's maintainers—and confused our users. This winter we're preparing PyMongo 3.0, and we have the chance to fix them all. As I snip out these regrettable designs I ask, what went wrong?

I conclude the series with the final regrettable decision: MongoReplicaSetClient.


The Beginning

In January of 2011, Bernie Hackett was maintaining PyMongo single-handedly. PyMongo's first author Mike Dirolf had left, and I hadn't yet joined.

Replica sets had been released in MongoDB 1.6 the year before, in 2010. They obsoleted the old "master-slave replication" system, which didn't do automatic failover if the master machine died. In replica sets, if the primary dies the secondaries elect a new primary at once.

PyMongo 2.0 had one client class, called Connection. By the time our story begins, Bernie had added most of the replica-set features Connection needed. Given a replica set name and the addresses of one or more members, it could discover the whole set and connect to the primary. For example, with a three-node set and the primary on port 27019:

>>> # Obsolete code.
>>> from pymongo import Connection
>>> c = Connection('localhost:27017,localhost:27018',
...                replicaset='repl0',
...                safe=True)
>>> c
Connection([u'localhost:27019', 'localhost:27017', 'localhost:27018'])
>>> c.port  # Current primary's port.
27019

If there was a failover, Connection's next operation failed, but it found and connected to the primary on the operation after that:

>>> c.db.collection.insert({})
error: [Errno 61] Connection refused
>>> c.db.collection.insert({})
ObjectId('548ef36eca1ce90d91000007')
>>> c.port  # What port is the new primary on?
27018

(Note that PyMongo 2.0 threw a socket error after a failover: we consistently wrap errors in our ConnectionFailure exception class now.)

Reading From Secondaries

The Connection class's replica set features were pretty well-rounded, actually. But a user asked Bernie for a new feature: he wanted a convenient way to query from secondaries. Our Ruby and Node drivers supported this feature using a different connection class. So in late 2011, just as I was joining the company, Bernie wrote a new class, ReplicaSetConnection. Depending on your read preference, it would read from the primary or a secondary:

>>> from pymongo import ReplicaSetConnection, ReadPreference
>>> rsc = ReplicaSetConnection(
...    'localhost:27017,localhost:27018',
...    replicaset='repl0',
...    read_preference=ReadPreference.SECONDARY,
...    safe=True)

Besides distributing reads to secondaries, the new ReplicaSetConnection had another difference from Connection: a monitor thread. Every 30 seconds, the thread proactively updated its view of the replica set's topology. This gave ReplicaSetConnection two advantages. First, it could detect when a new secondary had joined the set, and start using it for reads. Second, even if it was idle during a failover, after 30 seconds it would detect the new primary and use it for the next operation, instead of throwing an error on the first try.

ReplicaSetConnection was mostly the same as the existing Connection class. But it was different enough that there was some risk: the new code might have new bugs. Or at least, it might have surprising differences from Connection's behavior.

PyMongo has special burdens, since it's the intersection between two huge groups: MongoDB users and the Python world, possibly the largest language community in history. These days PyMongo is downloaded half a million times a month, and back then its stats were big, too. So Bernie trod very cautiously. He didn't force you to use the new code right away. Instead, he made a separate class you could opt in to. He released ReplicaSetConnection in PyMongo 2.1.

The Curse

But we never merged the two classes.

Ever since November 2011, when Bernie wrote ReplicaSetConnection and I joined MongoDB, we've maintained ReplicaSetConnection's separate code. It gained features. It learned to run mapreduce jobs on secondaries. Its read preference options expanded to include members' network latency and tags. Connection gained distinct features, too, diverging further from ReplicaSetConnection: it can connect to the nearest mongos from a list of them, and fail over to the next if that mongos goes down. Other features applied equally to both classes, so we wrote them twice. We had two tests for most of these features. When we renamed Connection to MongoClient, we also renamed ReplicaSetConnection to MongoReplicaSetClient. And still, we didn't merge them.

The persistent, slight differences between the two classes persistently confused our users. I remember my feet aching as I stood at our booth at PyCon in 2013, explaining to a user when he should use MongoClient and when he should use MongoReplicaSetClient—and I remember his expression growing sourer each minute as he realized how irrational the distinction was.

I explained it again during MongoDB Office Hours, when I sat at a cafeteria table with a couple users, soon after we moved to the office in Times Square. And again, I saw the frustration on their faces. I explained it on Stack Overflow a couple months later. I've been explaining this for as long as I've worked here.

The Curse Is Lifted

This year, two events conspired to kill MongoReplicaSetClient. First, we resolved to write a PyMongo 3.0 with a cleaned-up API. Second, I wrote the Server Discovery And Monitoring Spec, a comprehensive description of how all our drivers should connect to a standalone server, a set of mongos servers, or a replica set. This spec closely followed the design of our Java and C# drivers, which never had a ReplicaSetConnection. These drivers each have a single class that connects to any kind of MongoDB topology.

Since the Server Discovery And Monitoring Spec provides the algorithm to connect to any topology with the same class, I just followed my spec and wrote a unified MongoClient for PyMongo 3. For the sake of backwards compatibility, MongoReplicaSetClient lives a while longer as an empty, deprecated subclass of MongoClient.
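
For a sense of what an "empty, deprecated subclass" amounts to, a rough sketch (not PyMongo's actual source) looks like this:

import warnings

from pymongo import MongoClient


class MongoReplicaSetClient(MongoClient):
    """Deprecated alias for the unified MongoClient (illustrative sketch only)."""

    def __init__(self, *args, **kwargs):
        warnings.warn(
            "MongoReplicaSetClient is deprecated; use MongoClient instead.",
            DeprecationWarning, stacklevel=2)
        super(MongoReplicaSetClient, self).__init__(*args, **kwargs)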

The new MongoClient has many advantages over both its ancestors. Mainly, it's concurrent: it connects to all the servers in your deployment in parallel. It runs your operations as soon as it finds any suitable server, while it continues to discover the rest of the deployment using background threads. Since it discovers and monitors all servers in parallel, it isn't hampered by a down server, or a distant one. It will be responsive even with the very large replica sets that will be possible in MongoDB 2.8, or the even larger ones we may someday allow.

Unifying the two classes also makes MongoDB URIs more powerful. Let's say you develop your Python code against a standalone mongod on your laptop, then you test in a staging environment with a replica set, then deploy to a sharded cluster. If you set the URI with a config file or environment variable, you had to write code like this:

# PyMongo 2.x.
import os

from pymongo import MongoClient, MongoReplicaSetClient
from pymongo.uri_parser import parse_uri

uri = os.environ['MONGODB_URI']
if 'replicaset' in parse_uri(uri)['options']:
    client = MongoReplicaSetClient(uri)
else:
    client = MongoClient(uri)

This is annoying. Now, the URI controls everything:

# PyMongo 3.0.
client = MongoClient(os.environ['MONGODB_URI'])

Configuration and code are properly separated.

The Moral Of The Story

I need your help—what is the moral? What should we have done differently?

When Bernie added read preferences and a monitor thread to PyMongo, I understand why he didn't overhaul the Connection class itself. The new code needed a shakedown cruise before it could be the default. You ask, "Why not publish a beta?" Few people install betas of PyMongo. Customers do thoroughly test early releases of the MongoDB server, but for PyMongo they just use the official release. So if we published a beta and received no bug reports, that wouldn't prove anything.

Bernie wanted the new code exercised. So it needed to be in a release. He had to commit to an API, so he published ReplicaSetConnection alongside Connection. Once ReplicaSetConnection was published it had to be supported forever. And worse, we had to maintain the small differences between Connection and ReplicaSetConnection, for backwards compatibility.

Maybe the moment to merge them was when we introduced MongoClient in late 2012. You had to choose to opt into MongoClient, so we could have merged the two classes into one new class, instead of preserving the distinction and creating MongoReplicaSetClient. But the introduction of MongoClient was complex and urgent; we didn't have time to unify the classes, too. It was too much risk at once.

I think the moral is: cultivate beta testers. That's what I did with Motor, my asynchronous driver for Tornado and MongoDB. It had long alpha and beta phases where I pressed developers to try it. I found PyMongo and AsyncMongo users and asked them to try switching to Motor. I kept a list of Motor testers and checked in with them occasionally. I ate my own hamster food: I used Motor to build the blog you're reading. Once I had some reports of Motor in production, and I saw it mentioned on Stack Overflow, and I discovered projects that depended on Motor in GitHub, I figured I had users and it was time for an official release.

Not all these methods will work for an established project like PyMongo, but still: for PyMongo 3.0, we should ask our community to help shake out the bugs.

When the beta is ready, will you help?


This is the final installment in my four-part series on regrettable decisions we made with PyMongo.

December 17, 2014 09:35 PM


BioPython News

Biopython 1.65 released


Dear Biopythoneers,

Source distributions and Windows installers for Biopython 1.65 are now available from the downloads page on the official Biopython website and from the Python Package Index (PyPI).

This release of Biopython supports Python 2.6, 2.7, 3.3 and 3.4. It is also tested on PyPy 2.0 to 2.4, PyPy3 version 2.4, and Jython 2.7b2.

The most visible change is that the Biopython sequence objects now use string comparison, rather than Python’s object comparison. This has been planned for a long time, with warning messages in place under Python 2 (sadly, the warnings were missing under Python 3).
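
A quick sketch of the new behaviour:

from Bio.Seq import Seq

# As of Biopython 1.65, sequences compare by their string content:
print(Seq("ACGT") == Seq("ACGT"))  # True
print(Seq("ACGT") == "ACGT")       # also True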

The Bio.KEGG and Bio.Graphics modules have been expanded with support for the online KEGG REST API, and parsing, representing and drawing KGML pathways.

The Pterobranchia Mitochondrial genetic code has been added to Bio.Data (and the translation functionality), which is the new NCBI genetic code table 24.

The Bio.SeqIO parser for the ABI capillary file format now exposes all the raw data in the SeqRecord’s annotation as a dictionary. This allows further in-depth analysis by advanced users.

Bio.SearchIO QueryResult objects now allow Hit retrieval using its alternative IDs (any IDs listed after the first one, for example as used with the NCBI BLAST NR database).

Bio.SeqUtils.MeltingTemp has been rewritten with new functionality.

The new experimental module Bio.CodonAlign has been renamed Bio.codonalign (and similar lower case PEP8 style module names have been used for the sub-modules within this).

Bio.SeqIO.index_db(…) and Bio.SearchIO.index_db(…) now store any relative filenames relative to the index file, rather than (as before) relative to the current directory at the time the index was built. This makes the indexes less fragile, so that they can be used from other working directories. NOTE: This change is backward compatible (old index files work as before), however relative paths in new indexes will not work on older versions of Biopython!
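
As a small illustration (the filenames here are made up): an index built like the one below now records records.fasta relative to the location of records.idx, so the index and the sequence file can be moved around together:

from Bio import SeqIO

# Build (or reopen) an SQLite-backed index stored next to the sequence file:
index = SeqIO.index_db("records.idx", "records.fasta", "fasta")
print(len(index))  # number of indexed records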

Behind the scenes, we have done a lot of work applying PEP8 coding styles to Biopython, and improving the formatting of the source code documentation (PEP257 docstrings).

Many thanks to the Biopython developers and community for making this release possible, especially the following contributors:

This is a longer list of contributors and changes than usual, but it was also a longer gap since our last release.

December 17, 2014 09:06 PM


Kushal Das

On contribution

Contribution, a word I first heard in 2004. I was a student back then. My first contribution was to the KDE l10n project, with help from the Ankur Bangla project. More than anything else it was fun. All the people I first met were already doing many things; they did more work than talking, and they still do a lot more work than many people I know.

The last 10 years

Over the last 10 years the scene of the FOSS movement in India has changed. Contributors used to be the rock stars, and people just starting out always wanted to become contributors. But a new word has taken that place: evangelist. Now everyone wants to become an evangelist. Most of the students I meet during conferences come to me and introduce themselves as evangelists of either Open Source or some other FOSS project; they only talk, and all of them want to change the world. But sadly, none of them seem to want to contribute back.

How to contribute?

I can understand that contributing is difficult in many cases. One needs some amount of preparation and some commitment to contribute to any project. That takes time and cannot be done overnight.

To begin with, you have to spend more time reading than anything else. Read more documentation, read more source code, read more meeting minutes of the project you want to contribute to. Remember one thing: one always reads more source code than one writes. But if you are just starting, you can spend more time writing code too.

Try to get involved in the discussions of the project. Join the IRC channel and stay there. In the beginning you may not understand all the conversations in the channel, but keep a note of the things people are discussing. You can read about them later, using a tiny and shiny site called google.com :)

I know new students have a tendency to try solving non-programming bugs. But as most of you come from an engineering background, you should focus on programming more than anything else.

At home, try to find the things you do on the computer in steps, repeatedly and regularly. Try to write small programs which can do those tasks for you. One of my first proper projects was a small GUI application which I used to upload photos to flickr.com via email.

When working on some other big project, try to solve easy bugs at first. These days most projects mark easy bugs in their bug tracker. In case you cannot find one, ask in the IRC channel for help. Remember that IRC is asynchronous; you may not get the answer right away. If someone is helping you, you may want to ask about their timezone.

I am not saying that working on other parts of the project is less meaningful. I personally want you to write more code than anything else. That way we will get more developers from India.

What about translation and documentation?

If you look at the people who contribute translations or documentation, you will find a few common things: they all love their language, and they love writing. As I said before, even my first contributions were translations. But neither I nor anyone else at that time did this for some goodies or a ticket to a conference. We love our mother tongue and we love to use the computer in our language, period. If you are doing translations, then do it for the love of the language and for fun. Please do not do it for some stickers or release parties.

What about becoming an evangelist?

Before you start calling yourself an evangelist, you should learn about that project. You will have to spend a lot of time learning the technology behind it, and you will have to learn why certain decisions were taken. An evangelist is a person who cares about and believes in the project and, most importantly, knows the project intimately. [S]he knows the developers behind the project, and constantly talks, blogs and spreads the news about the project. If you look at established evangelists, you will find mostly veterans who spent a lot of time contributing to the project first. It is not about the age of the person, but about the time [s]he has spent on the project. By the way, if you want to call yourself a developer evangelist, first become a developer of that project. That means some real code, not some examples.

December 17, 2014 10:53 AM


Daniel Nouri

Using convolutional neural nets to detect facial keypoints tutorial

This is a hands-on tutorial on deep learning. Step by step, we'll go about building a solution for the Facial Keypoint Detection Kaggle challenge. The tutorial introduces Lasagne, a new library for building neural networks with Python and Theano. We'll use Lasagne to implement a couple of network architectures, talk about data augmentation, dropout, the importance of momentum, and pre-training. Some of these methods will help us improve our results quite a bit.

I'll assume that you already know a fair bit about neural nets. That's because we won't talk about much of the background of how neural nets work; there are a few good books and videos for that, like the Neural Networks and Deep Learning online book. Alec Radford's talk Deep Learning with Python's Theano library is a great quick introduction. Make sure you also check out Andrej Karpathy's mind-blowing ConvNetJS Browser Demos.

Prerequisites

You don't need to type the code and execute it yourself if you just want to follow along. But here's the installation instructions for those who have access to a CUDA-capable GPU and want to run the experiments themselves.

I assume you have Python 2.7.x, numpy, pandas, matplotlib, and scikit-learn installed. Lasagne is still waiting for its first proper release, so for now we'll install it straight from Github. To install Lasagne and all the remaining dependencies, run these commands:

pip install -r https://raw.githubusercontent.com/dnouri/kfkd-tutorial/master/requirements.txt
pip install -r https://raw.githubusercontent.com/dnouri/kfkd-tutorial/master/requirements-2.txt

(Note that for sake of brevity, I'm not including commands to create a virtualenv and activate it. But you should.)

If everything worked well, you should be able to find the src/lasagne/examples/ directory in your virtualenv and run the MNIST example. This is sort of the "Hello, world" of neural nets. There's ten classes, one for each digit between 0 and 9, and the input is grayscale images of handwritten digits of size 28x28.

cd src/lasagne/examples/
python mnist.py

This command will start printing out stuff after thirty seconds or so. The reason it takes a while is that Lasagne uses Theano to do the heavy lifting; Theano in turn is an "optimizing GPU-meta-programming code generating array oriented optimizing math compiler in Python," and it will generate C code that needs to be compiled before training can happen. Luckily, we only have to pay the price for this overhead on the first run.

Once training starts, you'll see output like this:

Epoch 1 of 500
  training loss:            1.352731
  validation loss:          0.466565
  validation accuracy:              87.70 %
Epoch 2 of 500
  training loss:            0.591704
  validation loss:          0.326680
  validation accuracy:              90.64 %
Epoch 3 of 500
  training loss:            0.464022
  validation loss:          0.275699
  validation accuracy:              91.98 %
...

If you let it run long enough, you'll notice that after about 75 epochs, it'll have reached a test accuracy of around 98%.

(If any of the instructions in this tutorial do not work for you, submit a bug report here.)

The data

The training dataset for the Facial Keypoint Detection challenge consists of 7,049 96x96 gray-scale images. For each image, we're supposed to learn to find the correct position (the x and y coordinates) of 15 keypoints, such as left_eye_center, right_eye_outer_corner, mouth_center_bottom_lip, and so on.

https://kaggle2.blob.core.windows.net/competitions/kaggle/3486/media/face1_with_keypoints.png

An example of one of the faces with three keypoints marked.

An interesting twist with the dataset is that for some of the keypoints we only have about 2,000 labels, while other keypoints have more than 7,000 labels available for training.

Let's write some Python code that loads the data from the CSV files provided. We'll write a function that can load both the training and the test data. These two datasets differ in that the test data doesn't contain the target values; it's the goal of the challenge to predict these. Here's our load() function:

# file kfkd.py
import os

import numpy as np
from pandas.io.parsers import read_csv
from sklearn.utils import shuffle


FTRAIN = '~/data/kaggle-facial-keypoint-detection/training-cleaned.csv'
FTEST = '~/data/kaggle-facial-keypoint-detection/test.csv'


def load(test=False, cols=None):
    """Loads data from FTEST if *test* is True, otherwise from FTRAIN.
    Pass a list of *cols* if you're only interested in a subset of the
    target columns.
    """
    fname = FTEST if test else FTRAIN
    df = read_csv(os.path.expanduser(fname))  # load pandas dataframe

    # The Image column has pixel values separated by space; convert
    # the values to numpy arrays:
    df['Image'] = df['Image'].apply(lambda im: np.fromstring(im, sep=' '))

    if cols:  # get a subset of columns
        df = df[list(cols) + ['Image']]

    print(df.count())  # prints the number of values for each column
    df = df.dropna()  # drop all rows that have missing values in them

    X = np.vstack(df['Image'].values) / 255.  # scale pixel values to [0, 1]
    X = X.astype(np.float32)

    if not test:  # only FTRAIN has any target columns
        y = df[df.columns[:-1]].values
        y = (y - 48) / 48  # scale target coordinates to [-1, 1]
        X, y = shuffle(X, y, random_state=42)  # shuffle train data
        y = y.astype(np.float32)
    else:
        y = None

    return X, y


X, y = load()
print("X.shape == {}; X.min == {:.3f}; X.max == {:.3f}".format(
    X.shape, X.min(), X.max()))
print("y.shape == {}; y.min == {:.3f}; y.max == {:.3f}".format(
    y.shape, y.min(), y.max()))

It's not necessary that you go through every single detail of this function. But let's take a look at what the script above outputs:

$ python kfkd.py
left_eye_center_x            7034
left_eye_center_y            7034
right_eye_center_x           7032
right_eye_center_y           7032
left_eye_inner_corner_x      2266
left_eye_inner_corner_y      2266
left_eye_outer_corner_x      2263
left_eye_outer_corner_y      2263
right_eye_inner_corner_x     2264
right_eye_inner_corner_y     2264
...
mouth_right_corner_x         2267
mouth_right_corner_y         2267
mouth_center_top_lip_x       2272
mouth_center_top_lip_y       2272
mouth_center_bottom_lip_x    7014
mouth_center_bottom_lip_y    7014
Image                        7044
dtype: int64
X.shape == (2140, 9216); X.min == 0.000; X.max == 1.000
y.shape == (2140, 30); y.min == -0.920; y.max == 0.996

First it's printing a list of all columns in the CSV file along with the number of available values for each. So while we have an Image for all rows in the training data, we only have 2,267 values for mouth_right_corner_x and so on.

load() returns a tuple (X, y) where y is the target matrix. y has shape n x m with n being the number of samples in the dataset that have all m keypoints. Dropping all rows with missing values is what this line does:

df = df.dropna()  # drop all rows that have missing values in them

The script's output y.shape == (2140, 30) tells us that there's only 2,140 images in the dataset that have all 30 target values present. Initially, we'll train with these 2,140 samples only. Which leaves us with many more input dimensions (9,216) than samples; an indicator that overfitting might become a problem. Let's see. Of course it's a bad idea to throw away 70% of the training data just like that, and we'll talk about this later on.

Another feature of the load() function is that it scales the intensity values of the image pixels to be in the interval [0, 1], instead of 0 to 255. The target values (x and y coordinates) are scaled to [-1, 1]; before, they were between 0 and 95.
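
(As a quick sanity check of that scaling, here's the mapping in both directions for a made-up keypoint; we'll use the inverse mapping later when plotting predictions:)

# A keypoint at pixel coordinate 70 maps into the scaled range and back:
y_scaled = (70.0 - 48) / 48    # roughly 0.458, inside [-1, 1]
y_pixel = y_scaled * 48 + 48   # back to 70.0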

First model: a single hidden layer

Now that we're done with the legwork of loading the data, let's use Lasagne and create a neural net with a single hidden layer. We'll start with the code:

# add to kfkd.py
from lasagne import layers
from lasagne.updates import nesterov_momentum
from nolearn.lasagne import NeuralNet

net1 = NeuralNet(
    layers=[  # three layers: one hidden layer
        ('input', layers.InputLayer),
        ('hidden', layers.DenseLayer),
        ('output', layers.DenseLayer),
        ],
    # layer parameters:
    input_shape=(128, 9216),  # 128 images per batch times 96x96 input pixels
    hidden_num_units=100,  # number of units in hidden layer
    output_nonlinearity=None,  # output layer uses identity function
    output_num_units=30,  # 30 target values

    # optimization method:
    update=nesterov_momentum,
    update_learning_rate=0.01,
    update_momentum=0.9,

    regression=True,  # flag to indicate we're dealing with regression problem
    max_epochs=400,  # we want to train this many epochs
    verbose=1,
    )

X, y = load()
net1.fit(X, y)

We use quite a few parameters to initialize the NeuralNet. Let's walk through them. First there's the three layers and their parameters:

    layers=[  # three layers: one hidden layer
        ('input', layers.InputLayer),
        ('hidden', layers.DenseLayer),
        ('output', layers.DenseLayer),
        ],
    # layer parameters:
    input_shape=(128, 9216),  # 128 images per batch times 96x96 input pixels
    hidden_num_units=100,  # number of units in hidden layer
    output_nonlinearity=None,  # output layer uses identity function
    output_num_units=30,  # 30 target values

Here we define the input layer, the hidden layer and the output layer. In parameter layers, we name and specify the type of each layer, and their order. Parameters input_shape, hidden_num_units, output_nonlinearity, and output_num_units are each parameters for specific layers; they refer to the layer by their prefix, such that input_shape defines the shape parameter of the input layer, hidden_num_units defines the hidden layer's num_units and so on. (It may seem a little odd that we have to specify the parameters like this, but the upshot is it buys us better compatibility with scikit-learn's pipeline and parameter search features.)

We'll discuss batch iterators later on. For now you'll have to be aware that we use mini batches with 128 samples in each batch, and we define the first dimension of input_shape accordingly.

We set the output_nonlinearity to None explicitly. Thus, the output units' activations become just a linear combination of the activations in the hidden layer.

The default nonlinearity used by DenseLayer is the rectifier, which is simply max(0, x). It's the most popular choice of activation function these days. By not explicitly setting hidden_nonlinearity, we're choosing the rectifier as the activation function of our hidden layer.

http://danielnouri.org/media/kfkd/rectifier.png

The neural net's weights are initialized from a uniform distribution with a cleverly chosen interval. That is, Lasagne figures out this interval for us, using "Glorot-style" initialization.

There's a few more parameters. All parameters starting with update parametrize the update function, or optimization method. The update function will update the weights of our network after each batch. We'll use the nesterov_momentum gradient descent optimization method to do the job. There's a number of other methods that Lasagne implements, such as adagrad and rmsprop. We choose nesterov_momentum because it has proven to work very well for a large number of problems.

    # optimization method:
    update=nesterov_momentum,
    update_learning_rate=0.01,
    update_momentum=0.9,

The update_learning_rate defines how large we want the steps of the gradient descent updates to be. We'll talk a bit more about the learning_rate and momentum parameters later on. For now, it's enough to just use these "sane defaults."

http://i.imgur.com/s25RsOr.gif

Comparison of a few optimization methods (animation by Alec Radford). The star denotes the global minimum on the error surface. Notice that stochastic gradient descent (SGD) without momentum is the slowest method to converge in this example. We're using Nesterov's Accelerated Gradient Descent (NAG) throughout this tutorial.

In our definition of NeuralNet we didn't specify an objective function to minimize. There's again a default for that; for regression problems it's the mean squared error (MSE).

The last set of parameters declare that we're dealing with a regression problem (as opposed to classification), that 400 is the number of epochs we're willing to train, and that we want to print out information during training by setting verbose=1:

  regression=True,  # flag to indicate we're dealing with regression problem
  max_epochs=400,  # we want to train this many epochs
  verbose=1,

Finally, the last two lines in our script load the data, just as before, and then train the neural net with it:

X, y = load()
net1.fit(X, y)

Running these two lines will output a table that grows one row per training epoch. In each row, we'll see the current loss (MSE) on the train set and on the validation set and the ratio between the two. NeuralNet automatically splits the data provided in X into a training and a validation set, using 20% of the samples for validation. (You can adjust this ratio by overriding the eval_size=0.2 parameter.)

$ python kfkd.py
...
  InputLayer          (128, 9216)             produces    9216 outputs
  DenseLayer          (128, 100)              produces     100 outputs
  DenseLayer          (128, 30)               produces      30 outputs

 Epoch  |  Train loss  |  Valid loss  |  Train / Val
--------|--------------|--------------|----------------
     1  |    0.105418  |    0.031085  |     3.391261
     2  |    0.020353  |    0.019294  |     1.054894
     3  |    0.016118  |    0.016918  |     0.952734
     4  |    0.014187  |    0.015550  |     0.912363
     5  |    0.013329  |    0.014791  |     0.901199
...
   200  |    0.003250  |    0.004150  |     0.783282
   201  |    0.003242  |    0.004141  |     0.782850
   202  |    0.003234  |    0.004133  |     0.782305
   203  |    0.003225  |    0.004126  |     0.781746
   204  |    0.003217  |    0.004118  |     0.781239
   205  |    0.003209  |    0.004110  |     0.780738
...
   395  |    0.002259  |    0.003269  |     0.690925
   396  |    0.002256  |    0.003264  |     0.691164
   397  |    0.002254  |    0.003264  |     0.690485
   398  |    0.002249  |    0.003259  |     0.690303
   399  |    0.002247  |    0.003260  |     0.689252
   400  |    0.002244  |    0.003255  |     0.689606

On a reasonably fast GPU, we're able to train for 400 epochs in under a minute. Notice that the validation loss keeps improving until the end. (If you let it train longer, it will improve a little more.)

Now how good is a validation loss of 0.0032? How does it compare to the challenge's benchmark or the other entries in the leaderboard? Remember that we divided the target coordinates by 48 when we scaled them to be in the interval [-1, 1]. Thus, to calculate the root-mean-square error, as that's what's used in the challenge's leaderboard, based on our MSE loss of 0.003255, we'll take the square root and multiply by 48 again:

>>> import numpy as np
>>> np.sqrt(0.003255) * 48
2.7385251505144153

This is a reasonable proxy for what our score would be on the Kaggle leaderboard; at the same time it's assuming that the subset of the data that we chose to train with follows the same distribution as the test set, which isn't really the case. My guess is that the score is good enough to earn us a top ten place in the leaderboard at the time of writing. Certainly not a bad start! (And for those of you that are crying out right now because of the lack of a proper test set: don't.)

Testing it out

The net1 object actually keeps a record of the data that it prints out in the table. We can access that record through the train_history_ attribute. Let's draw those two curves:

train_loss = np.array([i["train_loss"] for i in net1.train_history_])
valid_loss = np.array([i["valid_loss"] for i in net1.train_history_])
pyplot.plot(train_loss, linewidth=3, label="train")
pyplot.plot(valid_loss, linewidth=3, label="valid")
pyplot.grid()
pyplot.legend()
pyplot.xlabel("epoch")
pyplot.ylabel("loss")
pyplot.ylim(1e-3, 1e-2)
pyplot.yscale("log")
pyplot.show()
http://danielnouri.org/media/kfkd/lc1.png

We can see that our net overfits, but it's not that bad. In particular, we don't see a point where the validation error gets worse again, thus it doesn't appear that early stopping, a technique that's commonly used to avoid overfitting, would be very useful at this point. Notice that we didn't use any regularization whatsoever, apart from choosing a small number of neurons in the hidden layer, a setting that will keep overfitting somewhat in control.

What do the net's predictions look like, then? Let's pick a few examples from the test set and check:

def plot_sample(x, y, axis):
    img = x.reshape(96, 96)
    axis.imshow(img, cmap='gray')
    axis.scatter(y[0::2] * 48 + 48, y[1::2] * 48 + 48, marker='x', s=10)

X, _ = load(test=True)
y_pred = net1.predict(X)

fig = pyplot.figure(figsize=(6, 6))
fig.subplots_adjust(
    left=0, right=1, bottom=0, top=1, hspace=0.05, wspace=0.05)

for i in range(16):
    ax = fig.add_subplot(4, 4, i + 1, xticks=[], yticks=[])
    plot_sample(X[i], y_pred[i], ax)

pyplot.show()
http://danielnouri.org/media/kfkd/samples1.png

Our first model's predictions on 16 samples taken from the test set.

The predictions look reasonable, but sometimes they are quite a bit off. Let's try and do a bit better.

Second model: convolutions

http://deeplearning.stanford.edu/wiki/images/6/6c/Convolution_schematic.gif

The convolution operation. (Animation taken from the Stanford deep learning tutorial.)

LeNet5-style convolutional neural nets are at the heart of deep learning's recent breakthrough in computer vision. Convolutional layers are different to fully connected layers; they use a few tricks to reduce the number of parameters that need to be learned, while retaining high expressiveness. These are:

  • local connectivity: neurons are connected only to a subset of neurons in the previous layer,
  • weight sharing: weights are shared between a subset of neurons in the convolutional layer (these neurons form what's called a feature map),
  • pooling: static subsampling of inputs.
http://deeplearning.net/tutorial/_images/conv_1D_nn.png

Illustration of local connectivity and weight sharing. (Taken from the deeplearning.net tutorial.)

Units in a convolutional layer actually connect to a 2-d patch of neurons in the previous layer, a prior that lets them exploit the 2-d structure in the input.
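
To get a feel for what the pooling step does, here's a tiny numpy sketch of 2x2 max-pooling on a 4x4 "image" (for intuition only; Lasagne's max-pooling layers do this for us):

import numpy as np

img = np.arange(16, dtype=np.float32).reshape(4, 4)

# Split the image into non-overlapping 2x2 blocks and keep each block's maximum:
pooled = img.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)
# [[  5.   7.]
#  [ 13.  15.]]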

When using convolutional layers in Lasagne, we have to prepare the input data such that each sample is no longer a flat vector of 9,216 pixel intensities, but a three-dimensional matrix with shape (c, 0, 1), where c is the number of channels (colors), and 0 and 1 correspond to the x and y dimensions of the input image. In our case, the concrete shape will be (1, 96, 96), because we're dealing with a single (gray) color channel only.

A function load2d that wraps the previously written load and does the necessary transformations is easily coded:

def load2d(test=False, cols=None):
    X, y = load(test=test)
    X = X.reshape(-1, 96, 96, 1)
    X = X.transpose(0, 3, 1, 2)
    return X, y

We'll build a convolutional neural net with three convolutional layers and two fully connected layers. Each conv layer is followed by a 2x2 max-pooling layer. Starting with 32 filters, we double the number of filters with every conv layer. The densely connected hidden layers both have 500 units.

There's again no regularization in the form of weight decay or dropout. It turns out that using very small convolutional filters, such as our 3x3 and 2x2 filters, is again a pretty good regularizer by itself.

Let's write down the code:

# use the cuda-convnet implementations of conv and max-pool layer
Conv2DLayer = layers.cuda_convnet.Conv2DCCLayer
MaxPool2DLayer = layers.cuda_convnet.MaxPool2DCCLayer

net2 = NeuralNet(
    layers=[
        ('input', layers.InputLayer),
        ('conv1', Conv2DLayer),
        ('pool1', MaxPool2DLayer),
        ('conv2', Conv2DLayer),
        ('pool2', MaxPool2DLayer),
        ('conv3', Conv2DLayer),
        ('pool3', MaxPool2DLayer),
        ('hidden4', layers.DenseLayer),
        ('hidden5', layers.DenseLayer),
        ('output', layers.DenseLayer),
        ],
    input_shape=(128, 1, 96, 96),
    conv1_num_filters=32, conv1_filter_size=(3, 3), pool1_ds=(2, 2),
    conv2_num_filters=64, conv2_filter_size=(2, 2), pool2_ds=(2, 2),
    conv3_num_filters=128, conv3_filter_size=(2, 2), pool3_ds=(2, 2),
    hidden4_num_units=500, hidden5_num_units=500,
    output_num_units=30, output_nonlinearity=None,

    update_learning_rate=0.01,
    update_momentum=0.9,

    regression=True,
    max_epochs=1000,
    verbose=1,
    )

X, y = load2d()  # load 2-d data
net2.fit(X, y)

# Training for 1000 epochs will take a while.  We'll pickle the
# trained model so that we can load it back later:
import cPickle as pickle
with open('net2.pickle', 'wb') as f:
    pickle.dump(net2, f, -1)

Training this neural net is much more computationally costly than the first one we trained. It takes around 15x as long to train; those 1000 epochs take more than 20 minutes on even a powerful GPU.

However, the patient is rewarded with what's already a much better model than the one we had before. Let's take a look at the output when running the script. First comes the list of layers with their output shapes. Note that the first conv layer produces 32 output images of size (94, 94), that's one 94x94 output image per filter:

InputLayer            (128, 1, 96, 96)        produces    9216 outputs
Conv2DCCLayer         (128, 32, 94, 94)       produces  282752 outputs
MaxPool2DCCLayer      (128, 32, 47, 47)       produces   70688 outputs
Conv2DCCLayer         (128, 64, 46, 46)       produces  135424 outputs
MaxPool2DCCLayer      (128, 64, 23, 23)       produces   33856 outputs
Conv2DCCLayer         (128, 128, 22, 22)      produces   61952 outputs
MaxPool2DCCLayer      (128, 128, 11, 11)      produces   15488 outputs
DenseLayer            (128, 500)              produces     500 outputs
DenseLayer            (128, 500)              produces     500 outputs
DenseLayer            (128, 30)               produces      30 outputs

What follows is the same table that we saw with the first example, with train and validation error over time:

 Epoch  |  Train loss  |  Valid loss  |  Train / Val
--------|--------------|--------------|----------------
     1  |    0.111763  |    0.042740  |     2.614934
     2  |    0.018500  |    0.009413  |     1.965295
     3  |    0.008598  |    0.007918  |     1.085823
     4  |    0.007292  |    0.007284  |     1.001139
     5  |    0.006783  |    0.006841  |     0.991525
...
   500  |    0.001791  |    0.002013  |     0.889810
   501  |    0.001789  |    0.002011  |     0.889433
   502  |    0.001786  |    0.002009  |     0.889044
   503  |    0.001783  |    0.002007  |     0.888534
   504  |    0.001780  |    0.002004  |     0.888095
   505  |    0.001777  |    0.002002  |     0.887699
...
   995  |    0.001083  |    0.001568  |     0.690497
   996  |    0.001082  |    0.001567  |     0.690216
   997  |    0.001081  |    0.001567  |     0.689867
   998  |    0.001080  |    0.001567  |     0.689595
   999  |    0.001080  |    0.001567  |     0.689089
  1000  |    0.001079  |    0.001566  |     0.688874

Quite a nice improvement over the first network. Our RMSE is looking pretty good, too:

>>> np.sqrt(0.001566) * 48
1.8994904579913006

We can compare the predictions of the two networks using one of the more problematic samples in the test set:

sample1 = load(test=True)[0][6:7]
sample2 = load2d(test=True)[0][6:7]
y_pred1 = net1.predict(sample1)[0]
y_pred2 = net2.predict(sample2)[0]

fig = pyplot.figure(figsize=(6, 3))
ax = fig.add_subplot(1, 2, 1, xticks=[], yticks=[])
plot_sample(sample1[0], y_pred1, ax)
ax = fig.add_subplot(1, 2, 2, xticks=[], yticks=[])
plot_sample(sample1[0], y_pred2, ax)
pyplot.show()
http://danielnouri.org/media/kfkd/samples2.png

The predictions of net1 on the left compared to the predictions of net2.

And then let's compare the learning curves of the first and the second network:

http://danielnouri.org/media/kfkd/lc2.png

This looks pretty good; I like the smoothness of the new error curves. But we do notice that towards the end, the validation error of net2 flattens out much more quickly than the training error. I bet we could improve that by using more training examples. What if we flipped the input images horizontally; would we be able to improve training by doubling the amount of training data this way?

Data augmentation

An overfitting net can generally be made to perform better by using more training data. (And if your unregularized net does not overfit, you should probably make it larger.)

Data augmentation lets us artificially increase the number of training examples by applying transformations, adding noise etc. That's obviously more economic than having to go out and collect more examples by hand. Augmentation is a very useful tool to have in your deep learning toolbox.

We mentioned batch iterators already briefly. It is the batch iterator's job to take a matrix of samples, and split it up in batches, in our case of size 128. While it does the splitting, the batch iterator can also apply transformations to the data on the fly. So to produce those horizontal flips, we don't actually have to double the amount of training data in the input matrix. Rather, we will just perform the horizontal flips with 50% chance while we're iterating over the data. This is convenient, and for some problems it allows us to produce an infinite number of examples, without blowing up the memory usage. Also, transformations to the input images can be done while the GPU is busy processing a previous batch, so they come at virtually no cost.

Flipping the images horizontally is just a matter of using slicing:

X, y = load2d()
X_flipped = X[:, :, :, ::-1]  # simple slice to flip all images

# plot two images:
fig = pyplot.figure(figsize=(6, 3))
ax = fig.add_subplot(1, 2, 1, xticks=[], yticks=[])
plot_sample(X[1], y[1], ax)
ax = fig.add_subplot(1, 2, 2, xticks=[], yticks=[])
plot_sample(X_flipped[1], y[1], ax)
pyplot.show()
http://danielnouri.org/media/kfkd/samples3.png

Left shows the original image, right is the flipped image.

In the picture on the right, notice that the target value keypoints aren't aligned with the image anymore. Since we're flipping the images, we'll have to make sure we also flip the target values. To do this, not only do we have to flip the coordinates, we'll also have to swap target value positions; that's because the flipped left_eye_center_x no longer points to the left eye in our flipped image; now it corresponds to right_eye_center_x. Some points like nose_tip_y are not affected. We'll define a list of tuples, flip_indices, that holds the information about which columns in the target vector need to swap places when we flip the image horizontally. Remember the list of columns was:

left_eye_center_x            7034
left_eye_center_y            7034
right_eye_center_x           7032
right_eye_center_y           7032
left_eye_inner_corner_x      2266
left_eye_inner_corner_y      2266
...

Since left_eye_center_x will need to swap places with right_eye_center_x, we write down the tuple (0, 2). Also left_eye_center_y needs to swap places: with right_eye_center_y. Thus we write down (1, 3), and so on. In the end, we have:

flip_indices = [
    (0, 2), (1, 3),
    (4, 8), (5, 9), (6, 10), (7, 11),
    (12, 16), (13, 17), (14, 18), (15, 19),
    (22, 24), (23, 25),
    ]

# Let's see if we got it right:
df = read_csv(os.path.expanduser(FTRAIN))
for i, j in flip_indices:
    print("# {} -> {}".format(df.columns[i], df.columns[j]))

# this prints out:
# left_eye_center_x -> right_eye_center_x
# left_eye_center_y -> right_eye_center_y
# left_eye_inner_corner_x -> right_eye_inner_corner_x
# left_eye_inner_corner_y -> right_eye_inner_corner_y
# left_eye_outer_corner_x -> right_eye_outer_corner_x
# left_eye_outer_corner_y -> right_eye_outer_corner_y
# left_eyebrow_inner_end_x -> right_eyebrow_inner_end_x
# left_eyebrow_inner_end_y -> right_eyebrow_inner_end_y
# left_eyebrow_outer_end_x -> right_eyebrow_outer_end_x
# left_eyebrow_outer_end_y -> right_eyebrow_outer_end_y
# mouth_left_corner_x -> mouth_right_corner_x
# mouth_left_corner_y -> mouth_right_corner_y

Our batch iterator implementation will derive from the default BatchIterator class and override the transform() method only. Let's see how it looks when we put it all together:

class FlipBatchIterator(BatchIterator):
    flip_indices = [
        (0, 2), (1, 3),
        (4, 8), (5, 9), (6, 10), (7, 11),
        (12, 16), (13, 17), (14, 18), (15, 19),
        (22, 24), (23, 25),
        ]

    def transform(self, Xb, yb):
        Xb, yb = super(FlipBatchIterator, self).transform(Xb, yb)

        # Don't flip images if we're in 'test' mode:
        if not self.test:
            # Flip half of the images in this batch at random:
            bs = Xb.shape[0]
            indices = np.random.choice(bs, bs / 2, replace=False)
            Xb[indices] = Xb[indices, :, :, ::-1]

            if yb is not None:
                # Horizontal flip of all x coordinates:
                yb[indices, ::2] = yb[indices, ::2] * -1

                # Swap places, e.g. left_eye_center_x -> right_eye_center_x
                for a, b in self.flip_indices:
                    yb[indices, a], yb[indices, b] = (
                        yb[indices, b], yb[indices, a])

        return Xb, yb

To use this batch iterator for training, we'll pass it as the batch_iterator argument to NeuralNet. Let's define net3, a network that looks exactly the same as net2 except for these lines at the very end:

net3 = NeuralNet(
    # ...
    regression=True,
    batch_iterator=FlipBatchIterator(batch_size=128),
    max_epochs=3000,
    verbose=1,
    )

Now we're passing our FlipBatchIterator, but we've also tripled the number of epochs to train. While each one of our training epochs will still look at the same number of examples as before (after all, we haven't changed the size of X), it turns out that training nevertheless takes quite a bit longer when we use our transforming FlipBatchIterator. This is because what the network learns generalizes better this time, and it's arguably harder to learn things that generalize than to overfit.

So this will take maybe an hour to train. Let's make sure we pickle the model at the end of training, and then we're ready to go fetch some tea and biscuits. Or maybe do the laundry:

net3.fit(X, y)

import cPickle as pickle
with open('net3.pickle', 'wb') as f:
    pickle.dump(net3, f, -1)
$ python kfkd.py
...
 Epoch  |  Train loss  |  Valid loss  |  Train / Val
--------|--------------|--------------|----------------
...
   500  |    0.002238  |    0.002303  |     0.971519
...
  1000  |    0.001365  |    0.001623  |     0.841110
  1500  |    0.001067  |    0.001457  |     0.732018
  2000  |    0.000895  |    0.001369  |     0.653721
  2500  |    0.000761  |    0.001320  |     0.576831
  3000  |    0.000678  |    0.001288  |     0.526410

Comparing the learning with that of net2, we notice that the error on the validation set after 3000 epochs is indeed about 5% smaller for the data augmented net. We can see how net2 stops learning anything useful after 2000 or so epochs, and gets pretty noisy, while net3 continues to improve its validation error throughout, though slowly.

http://danielnouri.org/media/kfkd/lc3.png

Still seems like a lot of work for only a small gain? We'll find out if it was worth it in the next section.

Changing learning rate and momentum over time

What's annoying about our last model is that it already took an hour to train, and it's not exactly inspiring to have to wait that long for your experiment's results. In this section, we'll talk about a combination of two tricks that fix that and make the net train much faster again.

An intuition behind starting with a higher learning rate and decreasing it during the course of training is this: as we start training, we're far away from the optimum, and we want to take big steps towards it and learn quickly. But the closer we get to the optimum, the lighter we want to tread. It's like taking the train home: you enter your front door on foot, not by train.

On the importance of initialization and momentum in deep learning is the title of a talk and a paper by Ilya Sutskever et al. It's there that we learn about another useful trick to boost deep learning: namely increasing the optimization method's momentum parameter during training.

Remember that in our previous model, we initialized learning rate and momentum with a static 0.01 and 0.9 respectively. Let's change that such that the learning rate decreases linearly with the number of epochs, while we let the momentum increase.
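
To get a feel for what these schedules will look like, here's a quick illustration of mine (not part of the tutorial's code, but using the same start and stop values we'll pass to the net below):

import numpy as np

# Linear schedules over 3000 epochs: the learning rate decays from 0.03
# down to 0.0001, while the momentum climbs from 0.9 up to 0.999.
learning_rates = np.linspace(0.03, 0.0001, 3000)
momentums = np.linspace(0.9, 0.999, 3000)
print("learning rate: {:.5f} -> {:.5f}".format(learning_rates[0], learning_rates[-1]))
print("momentum:      {:.3f} -> {:.3f}".format(momentums[0], momentums[-1]))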

NeuralNet allows us to update parameters during training using the on_epoch_finished hook. We can pass a function to on_epoch_finished and it'll be called whenever an epoch is finished. However, before we can assign new values to update_learning_rate and update_momentum on the fly, we'll have to change these two parameters to become Theano shared variables. Thankfully, that's pretty easy:

import theano

def float32(k):
    return np.cast['float32'](k)

net4 = NeuralNet(
    # ...
    update_learning_rate=theano.shared(float32(0.03)),
    update_momentum=theano.shared(float32(0.9)),
    # ...
    )

The callback or list of callbacks that we pass to on_epoch_finished will be called with two arguments: nn, which is the NeuralNet instance itself, and train_history, which is the same as nn.train_history_.
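
A throwaway example of mine with that signature (not part of the final code) might simply print the latest validation loss after every epoch:

def print_valid_loss(nn, train_history):
    # train_history is a list of dicts, one per finished epoch
    print("epoch {}: valid loss {:.6f}".format(
        train_history[-1]['epoch'], train_history[-1]['valid_loss']))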

Instead of working with callback functions that use hard-coded values, we'll use a parametrizable class with a __call__ method as our callback. Let's call this class AdjustVariable. The implementation is reasonably straightforward:

class AdjustVariable(object):
    def __init__(self, name, start=0.03, stop=0.001):
        self.name = name
        self.start, self.stop = start, stop
        self.ls = None

    def __call__(self, nn, train_history):
        if self.ls is None:
            self.ls = np.linspace(self.start, self.stop, nn.max_epochs)

        epoch = train_history[-1]['epoch']
        new_value = float32(self.ls[epoch - 1])
        getattr(nn, self.name).set_value(new_value)

Let's plug it all together now and then we're ready to start training:

net4 = NeuralNet(
    # ...
    update_learning_rate=theano.shared(float32(0.03)),
    update_momentum=theano.shared(float32(0.9)),
    # ...
    regression=True,
    # batch_iterator=FlipBatchIterator(batch_size=128),
    on_epoch_finished=[
        AdjustVariable('update_learning_rate', start=0.03, stop=0.0001),
        AdjustVariable('update_momentum', start=0.9, stop=0.999),
        ],
    max_epochs=3000,
    verbose=1,
    )

X, y = load2d()
net4.fit(X, y)

with open('net4.pickle', 'wb') as f:
    pickle.dump(net4, f, -1)

We'll train two nets: net4 doesn't use our FlipBatchIterator, net5 does. Other than that, they're identical.

This is the learning of net4:

 Epoch  |  Train loss  |  Valid loss  |  Train / Val
--------|--------------|--------------|----------------
    50  |    0.004216  |    0.003996  |     1.055011
   100  |    0.003533  |    0.003382  |     1.044791
   250  |    0.001557  |    0.001781  |     0.874249
   500  |    0.000915  |    0.001433  |     0.638702
   750  |    0.000653  |    0.001355  |     0.481806
  1000  |    0.000496  |    0.001387  |     0.357917

Cool, training is happening much faster now! The train error at epochs 500 and 1000 is half of what it used to be in net2, before our adjustments to learning rate and momentum. This time, generalization seems to stop improving after 750 or so epochs already; looks like there's no point in training much longer.

What about net5 with the data augmentation switched on?

 Epoch  |  Train loss  |  Valid loss  |  Train / Val
--------|--------------|--------------|----------------
    50  |    0.004317  |    0.004081  |     1.057609
   100  |    0.003756  |    0.003535  |     1.062619
   250  |    0.001765  |    0.001845  |     0.956560
   500  |    0.001135  |    0.001437  |     0.790225
   750  |    0.000878  |    0.001313  |     0.668903
  1000  |    0.000705  |    0.001260  |     0.559591
  1500  |    0.000492  |    0.001199  |     0.410526
  2000  |    0.000373  |    0.001184  |     0.315353

And again we have much faster training than with net3, and better results. After 1000 epochs, we're better off than net3 was after 3000 epochs. What's more, the model trained with data augmentation is now about 10% better with regard to validation error than the one without.

http://danielnouri.org/media/kfkd/lc4.png

Dropout

Introduced in 2012 in the Improving neural networks by preventing co-adaptation of feature detectors paper, dropout is a popular regularization technique that works amazingly well. I won't go into the details of why it works so well; you can read about that elsewhere.
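
The one-sentence version: during training, each unit's output is set to zero at random with some probability p, which prevents units from co-adapting too much. Here's a toy sketch of mine of the idea (just an illustration; Lasagne's DropoutLayer does the real work for us below):

import numpy as np

def dropout_forward(activations, p=0.5, rng=np.random):
    # Zero out each activation with probability p, then rescale the
    # survivors so the expected output stays the same (training time only).
    mask = rng.binomial(1, 1 - p, size=activations.shape)
    return activations * mask / (1 - p)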

Like with any other regularization technique, dropout only makes sense if we have a network that's overfitting, which is clearly the case for the net5 network that we trained in the previous section. It's important to remember to get your net to train nicely and overfit first, then regularize.

To use dropout with Lasagne, we'll add DropoutLayer layers between the existing layers and assign dropout probabilities to each one of them. Here's the complete definition of our new net. I've added a # ! comment at the end of those lines that were added between this and net5.

net6 = NeuralNet(
    layers=[
        ('input', layers.InputLayer),
        ('conv1', Conv2DLayer),
        ('pool1', MaxPool2DLayer),
        ('dropout1', layers.DropoutLayer),  # !
        ('conv2', Conv2DLayer),
        ('pool2', MaxPool2DLayer),
        ('dropout2', layers.DropoutLayer),  # !
        ('conv3', Conv2DLayer),
        ('pool3', MaxPool2DLayer),
        ('dropout3', layers.DropoutLayer),  # !
        ('hidden4', layers.DenseLayer),
        ('dropout4', layers.DropoutLayer),  # !
        ('hidden5', layers.DenseLayer),
        ('output', layers.DenseLayer),
        ],
    input_shape=(128, 1, 96, 96),
    conv1_num_filters=32, conv1_filter_size=(3, 3), pool1_ds=(2, 2),
    dropout1_p=0.1,  # !
    conv2_num_filters=64, conv2_filter_size=(2, 2), pool2_ds=(2, 2),
    dropout2_p=0.2,  # !
    conv3_num_filters=128, conv3_filter_size=(2, 2), pool3_ds=(2, 2),
    dropout3_p=0.3,  # !
    hidden4_num_units=500,
    dropout4_p=0.5,  # !
    hidden5_num_units=500,
    output_num_units=30, output_nonlinearity=None,

    update_learning_rate=theano.shared(float32(0.03)),
    update_momentum=theano.shared(float32(0.9)),

    regression=True,
    batch_iterator=FlipBatchIterator(batch_size=128),
    on_epoch_finished=[
        AdjustVariable('update_learning_rate', start=0.03, stop=0.0001),
        AdjustVariable('update_momentum', start=0.9, stop=0.999),
        ],
    max_epochs=3000,
    verbose=1,
    )

Our network is sufficiently large now to crash Python's pickle with a maximum recursion error. Therefore we have to increase Python's recursion limit before we save it:

import sys
sys.setrecursionlimit(10000)

X, y = load2d()
net6.fit(X, y)

import cPickle as pickle
with open('net6.pickle', 'wb') as f:
    pickle.dump(net6, f, -1)

Taking a look at the learning, we notice that it's become slower again, and that's expected with dropout, but eventually it will outperform net5:

 Epoch  |  Train loss  |  Valid loss  |  Train / Val
--------|--------------|--------------|---------------
    50  |    0.004619  |    0.005198  |     0.888566
   100  |    0.004369  |    0.004182  |     1.044874
   250  |    0.003821  |    0.003577  |     1.068229
   500  |    0.002598  |    0.002236  |     1.161854
  1000  |    0.001902  |    0.001607  |     1.183391
  1500  |    0.001660  |    0.001383  |     1.200238
  2000  |    0.001496  |    0.001262  |     1.185684
  2500  |    0.001383  |    0.001181  |     1.171006
  3000  |    0.001306  |    0.001121  |     1.164100

Also, overfitting doesn't seem to be nearly as bad. We'll have to be careful with those numbers, though: the ratio between train and validation error has a slightly different meaning now, since the train error is evaluated with dropout whereas the validation error is evaluated without it. A more comparable value for the train error is this:

from sklearn.metrics import mean_squared_error
print(mean_squared_error(net6.predict(X), y))
# prints something like 0.0010073791

In our previous model without dropout, the error on the train set was 0.000373. So not only does our dropout net perform slightly better, it overfits much less than what we had before. That's great news, because it means that we can expect even better performance when we make the net larger (and more expressive). And that's what we'll try next: we increase the number of units in the last two hidden layers from 500 to 1000. Update these lines:

net7 = NeuralNet(
    # ...
    hidden4_num_units=1000,  # !
    dropout4_p=0.5,
    hidden5_num_units=1000,  # !
    # ...
    )

The improvement over the non-dropout net is now becoming more substantial:

 Epoch  |  Train loss  |  Valid loss  |  Train / Val
--------|--------------|--------------|---------------
    50  |    0.004756  |    0.007043  |     0.675330
   100  |    0.004440  |    0.005321  |     0.834432
   250  |    0.003974  |    0.003928  |     1.011598
   500  |    0.002574  |    0.002347  |     1.096366
  1000  |    0.001861  |    0.001613  |     1.153796
  1500  |    0.001558  |    0.001372  |     1.135849
  2000  |    0.001409  |    0.001230  |     1.144821
  2500  |    0.001295  |    0.001146  |     1.130188
  3000  |    0.001195  |    0.001087  |     1.099271

And we're still looking really good with the overfitting! My feeling is that if we increase the number of epochs to train, this model might become even better. Let's try it:

net12 = NeuralNet(
    # ...
    max_epochs=10000,
    # ...
    )
 Epoch  |  Train loss  |  Valid loss  |  Train / Val
--------|--------------|--------------|---------------
    50  |    0.004756  |    0.007027  |     0.676810
   100  |    0.004439  |    0.005321  |     0.834323
   500  |    0.002576  |    0.002346  |     1.097795
  1000  |    0.001863  |    0.001614  |     1.154038
  2000  |    0.001406  |    0.001233  |     1.140188
  3000  |    0.001184  |    0.001074  |     1.102168
  4000  |    0.001068  |    0.000983  |     1.086193
  5000  |    0.000981  |    0.000920  |     1.066288
  6000  |    0.000904  |    0.000884  |     1.021837
  7000  |    0.000851  |    0.000849  |     1.002314
  8000  |    0.000810  |    0.000821  |     0.985769
  9000  |    0.000769  |    0.000803  |     0.957842
 10000  |    0.000760  |    0.000787  |     0.966583

So there you're witnessing the magic that is dropout. :-)

Let's compare the nets we trained so far and their respective train and validation errors:

 Name  |   Description    |  Epochs  |  Train loss  |  Valid loss
-------|------------------|----------|--------------|--------------
 net1  |  single hidden   |     400  |    0.002244  |    0.003255
 net2  |  convolutions    |    1000  |    0.001079  |    0.001566
 net3  |  augmentation    |    3000  |    0.000678  |    0.001288
 net4  |  mom + lr adj    |    1000  |    0.000496  |    0.001387
 net5  |  net4 + augment  |    2000  |    0.000373  |    0.001184
 net6  |  net5 + dropout  |    3000  |    0.001306  |    0.001121
 net7  |  net6 + epochs   |   10000  |    0.000760  |    0.000787

Training specialists

Remember those 70% of training data that we threw away in the beginning? It turns out that's a very bad idea if we want to get a competitive score on the Kaggle leaderboard. There's quite a bit of variance in those 70% of the data, and in the challenge's test set, that our model hasn't seen yet.

So instead of training a single model, let's train a few specialists, with each one predicting a different set of target values. We'll train one model that only predicts left_eye_center and right_eye_center, one only for nose_tip and so on; overall, we'll have six models. This will allow us to use the full training dataset, and hopefully get a more competitive score overall.

The six specialists are all going to use exactly the same network architecture (a simple approach, not necessarily the best). Because training is bound to take much longer now than before, let's think about a strategy so that we don't have to wait for max_epochs to finish, even if the validation error stopped improving much earlier. This is called early stopping, and we'll write another on_epoch_finished callback to take care of that. Here's the implementation:

class EarlyStopping(object):
    def __init__(self, patience=100):
        self.patience = patience
        self.best_valid = np.inf
        self.best_valid_epoch = 0
        self.best_weights = None

    def __call__(self, nn, train_history):
        current_valid = train_history[-1]['valid_loss']
        current_epoch = train_history[-1]['epoch']
        if current_valid < self.best_valid:
            self.best_valid = current_valid
            self.best_valid_epoch = current_epoch
            self.best_weights = [w.get_value() for w in nn.get_all_params()]
        elif self.best_valid_epoch + self.patience < current_epoch:
            print("Early stopping.")
            print("Best valid loss was {:.6f} at epoch {}.".format(
                self.best_valid, self.best_valid_epoch))
            nn.load_weights_from(self.best_weights)
            raise StopIteration()

You can see that there are two branches inside __call__: the first where the current validation score is better than what we've previously seen, and the second where the best validation epoch was more than self.patience epochs in the past. In the first case we store away the weights:

          self.best_weights = [w.get_value() for w in nn.get_all_params()]

In the second case, we set the weights of the network back to those best_weights before raising StopIteration, signalling to NeuralNet that we want to stop training.

          nn.load_weights_from(self.best_weights)
          raise StopIteration()

Let's update the list of on_epoch_finished handlers in our net's definition and use EarlyStopping:

net8 = NeuralNet(
    # ...
    on_epoch_finished=[
        AdjustVariable('update_learning_rate', start=0.03, stop=0.0001),
        AdjustVariable('update_momentum', start=0.9, stop=0.999),
        EarlyStopping(patience=200),
        ],
    # ...
    )

So far so good, but how would we go about defining those specialists and what they should each predict? Let's make a list for that:

SPECIALIST_SETTINGS = [
    dict(
        columns=(
            'left_eye_center_x', 'left_eye_center_y',
            'right_eye_center_x', 'right_eye_center_y',
            ),
        flip_indices=((0, 2), (1, 3)),
        ),

    dict(
        columns=(
            'nose_tip_x', 'nose_tip_y',
            ),
        flip_indices=(),
        ),

    dict(
        columns=(
            'mouth_left_corner_x', 'mouth_left_corner_y',
            'mouth_right_corner_x', 'mouth_right_corner_y',
            'mouth_center_top_lip_x', 'mouth_center_top_lip_y',
            ),
        flip_indices=((0, 2), (1, 3)),
        ),

    dict(
        columns=(
            'mouth_center_bottom_lip_x',
            'mouth_center_bottom_lip_y',
            ),
        flip_indices=(),
        ),

    dict(
        columns=(
            'left_eye_inner_corner_x', 'left_eye_inner_corner_y',
            'right_eye_inner_corner_x', 'right_eye_inner_corner_y',
            'left_eye_outer_corner_x', 'left_eye_outer_corner_y',
            'right_eye_outer_corner_x', 'right_eye_outer_corner_y',
            ),
        flip_indices=((0, 2), (1, 3), (4, 6), (5, 7)),
        ),

    dict(
        columns=(
            'left_eyebrow_inner_end_x', 'left_eyebrow_inner_end_y',
            'right_eyebrow_inner_end_x', 'right_eyebrow_inner_end_y',
            'left_eyebrow_outer_end_x', 'left_eyebrow_outer_end_y',
            'right_eyebrow_outer_end_x', 'right_eyebrow_outer_end_y',
            ),
        flip_indices=((0, 2), (1, 3), (4, 6), (5, 7)),
        ),
    ]

We already discussed the need for flip_indices in the Data augmentation section. Remember from section The data that our load_data() function takes an optional list of columns to extract. We'll make use of this feature when we fit the specialist models in a new function fit_specialists():

from collections import OrderedDict
from sklearn.base import clone

def fit_specialists():
    specialists = OrderedDict()

    for setting in SPECIALIST_SETTINGS:
        cols = setting['columns']
        X, y = load2d(cols=cols)

        model = clone(net)
        model.output_num_units = y.shape[1]
        model.batch_iterator.flip_indices = setting['flip_indices']
        # set number of epochs relative to number of training examples:
        model.max_epochs = int(1e7 / y.shape[0])
        if 'kwargs' in setting:
            # an option 'kwargs' in the settings list may be used to
            # set any other parameter of the net:
            vars(model).update(setting['kwargs'])

        print("Training model for columns {} for {} epochs".format(
            cols, model.max_epochs))
        model.fit(X, y)
        specialists[cols] = model

    with open('net-specialists.pickle', 'wb') as f:
        # we persist a dictionary with all models:
        pickle.dump(specialists, f, -1)

There's nothing too spectacular happening here. Instead of training and persisting a single model, we train one model per setting and save them all in a dictionary that maps columns to the trained NeuralNet instances. Now despite our early stopping, this will still take forever to train (though by forever I don't mean Google-forever, I mean maybe half a day on a single GPU); I don't recommend that you actually run this.
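
Once the pickle exists, using the specialists for prediction is straightforward. Here's a rough sketch of mine (not part of the tutorial's code) that loads the dictionary and stitches the per-model predictions back together, column by column:

import cPickle as pickle
import numpy as np

def predict_with_specialists(X, fname='net-specialists.pickle'):
    with open(fname, 'rb') as f:
        specialists = pickle.load(f)
    columns, predictions = [], []
    for cols, model in specialists.items():
        columns.extend(cols)                   # the column names this model predicts
        predictions.append(model.predict(X))   # shape (n_samples, len(cols))
    return columns, np.hstack(predictions)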

We could of course easily parallelize training these specialist nets across GPUs, but maybe you don't have the luxury of access to a box with multiple CUDA GPUs. In the next section we'll talk about another way to cut down on training time. But first, let's take a look at the results of fitting these expensive-to-train specialists:

http://danielnouri.org/media/kfkd/lc5.png

Learning curves for six specialist models. The solid lines represent RMSE on the validation set, the dashed lines errors on the train set. mean is the mean validation error of all nets weighted by number of target values. All curves have been scaled to have the same length on the x axis.

Lastly, this solution gives us a Kaggle leaderboard score of 2.17 RMSE, which corresponds to the second place at the time of writing (right behind yours truly).

Supervised pre-training

In the last section of this tutorial, we'll discuss a way to make training our specialists faster. The idea is this: instead of initializing the weights of each specialist network at random, we'll initialize them with the weights that were learned in net6 or net7. Remember from our EarlyStopping implementation that copying weights from one network to another is as simple as using the load_weights_from() method. Let's modify the fit_specialists function to do just that. I'm again marking the lines that changed compared to the previous implementation with a # ! comment:

def fit_specialists(fname_pretrain=None):
    if fname_pretrain:  # !
        with open(fname_pretrain, 'rb') as f:  # !
            net_pretrain = pickle.load(f)  # !
    else:  # !
        net_pretrain = None  # !

    specialists = OrderedDict()

    for setting in SPECIALIST_SETTINGS:
        cols = setting['columns']
        X, y = load2d(cols=cols)

        model = clone(net)
        model.output_num_units = y.shape[1]
        model.batch_iterator.flip_indices = setting['flip_indices']
        model.max_epochs = int(4e6 / y.shape[0])
        if 'kwargs' in setting:
            # an option 'kwargs' in the settings list may be used to
            # set any other parameter of the net:
            vars(model).update(setting['kwargs'])

        if net_pretrain is not None:  # !
            # if a pretrain model was given, use it to initialize the
            # weights of our new specialist model:
            model.load_weights_from(net_pretrain)  # !

        print("Training model for columns {} for {} epochs".format(
            cols, model.max_epochs))
        model.fit(X, y)
        specialists[cols] = model

    with open('net-specialists.pickle', 'wb') as f:
        # this time we're persisting a dictionary with all models:
        pickle.dump(specialists, f, -1)
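
For example, to warm-start the specialists from the net6.pickle file we saved in the dropout section, we'd simply call:

fit_specialists('net6.pickle')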

It turns out that initializing those nets not at random, but by re-using weights from one of the networks we trained earlier, has two big advantages: training converges much faster, maybe four times faster in this case, and the pre-training also helps the nets generalize better; it acts as a regularizer. Here are the same learning curves as before, but now for the pre-trained nets:

http://danielnouri.org/media/kfkd/lc6.png

Learning curves for six specialist models that were pre-trained.

Finally, the score for this solution on the challenge's leaderboard is 2.13 RMSE. Again the second place, but getting closer!

Conclusion

There's probably a dozen ideas that you have that you want to try out. You can find the source code for the final solution here to download and play around with. It also includes the bit that generates a submission file for the Kaggle challenge. Run python kfkd.py to find out how to use the script on the command-line.

Here are a couple of the more obvious things that you could try out at this point. Try optimizing the parameters for the individual specialist networks; this is something that we haven't done so far. Observe that the six nets that we trained all have different levels of overfitting. If a net is not or hardly overfitting, like the green and the yellow nets above, you could try decreasing the amount of dropout. Likewise, if it's overfitting badly, like the black and purple nets, you could try increasing it. In the definition of SPECIALIST_SETTINGS we can already add net-specific settings; so say we wanted to add more regularization to the second net, then we could change the second entry of the list to look like so:

    dict(
        columns=(
            'nose_tip_x', 'nose_tip_y',
            ),
        flip_indices=(),
        kwargs=dict(dropout2_p=0.3, dropout3_p=0.4),  # !
        ),

And there's a ton of other things that you could try to tweak. Maybe you'll try adding another convolutional or fully connected layer? I'm curious to hear about improvements that you're able to come up with in the comments.


Daniel Nouri is the founder of Natural Vision, a company that builds cutting edge machine learning solutions.

December 17, 2014 09:15 AM

December 16, 2014


Graham Dumpleton

Launching applications in Docker containers.

So far in this current series of blog posts I introduced the Docker image I have created for hosting Python WSGI applications using Apache/mod_wsgi. I then went on to explain what happens when you build your own image derived from it which incorporates your specific Python web application. In this blog post I am going to explain what happens when you run the image and how your Python web

December 16, 2014 11:34 PM


Will Kahn-Greene

Dennis v0.6 released! Line numbers, double vowels, better cli-fu, and better output!

What is it?

Dennis is a Python command line utility (and library) for working with localization. It includes:

  • a linter for finding problems in strings in .po files like invalid Python variable syntax which leads to exceptions
  • a template linter for finding problems in strings in .pot files that make translators' lives difficult
  • a statuser for seeing the high-level translation/error status of your .po files
  • a translator for strings in your .po files to make development easier

v0.6 released!

Since v0.5, I've done the following:

  • Rewrote the command line handling using click and added an exception handler.
  • Merged the lint and linttemplate commands. Why should you care which file you're linting when the linter can figure it out for you?
  • Added the whimsical double vowel transform.
  • Added line numbers in the lint output. This will make it possible to find those pesky problematic strings in your .po/.pot files.
  • Added a line reporter to the linter.

Getting pretty close to what I want for a 1.0, so I'm pretty excited about this version.

Denise update

I've updated Denise with the latest Dennis and moved it to a better url. Lint your .po/.pot files via web service using http://denise.paas.allizom.org/.

Where to go for more

For more specifics on this release, see here: http://dennis.readthedocs.org/en/latest/changelog.html#version-0-6-december-16th-2014

Documentation and quickstart here: http://dennis.readthedocs.org/en/v0.6/

Source code and issue tracker here: https://github.com/willkg/dennis

Source code and issue tracker for Denise (Dennis-as-a-service): https://github.com/willkg/denise

6 out of 8 employees said Dennis helps them complete 1.5 more deliverables per quarter.

December 16, 2014 10:22 PM


PyCharm

Announcing the PyCharm 4.0.3 release update

Today we’re happy to announce that the PyCharm 4.0.3 bug-fix update has been uploaded and is now available from the download page. It will also be available shortly as a patch update from previous versions of PyCharm 4.x.

This update includes the same set of major changes and fixes as the PyCharm 4.0.3 RC build. As a recap, some notable highlights of this release include:

For further details on the bug fixes and changes, please consult the Release Notes.

As usual, please report any problems you find in the issue tracker.

If you would like to discuss your experiences with PyCharm, we look forward to your feedback on our PyCharm Facebook page and Twitter.

Develop with Pleasure!
-PyCharm team

December 16, 2014 04:47 PM


PyCon

PyCon 2015 Tutorial Schedule Announced

Tutorials Schedule

After a busy few months of competitive reviews, the tutorials team within our program committee has completed its process and come up with an awesome schedule… ta da! https://us.pycon.org/2015/schedule/tutorials/
Led by Stuart Williams and Ruben Orduz, a fantastic team came together to shape this schedule, including Carol Willing, Ian Cordasco, Harry Percival, Allen Downey, Richard Jones, and Kenneth Love. Thanks to everyone for their efforts, both in reviewing and in submitting!

Register for Tutorials

On April 8 & 9, the two days preceding the conference talk dates, attendees have an opportunity to attend up to four different tutorials. Each day offers both a morning and afternoon session, each providing three hours of learning split by a snack break, with lunch in between the sessions. Our instructors come from a variety of backgrounds, including full time educators or trainers, authors, domain experts, and in a lot of cases, they've created the project they're teaching a session on.
Each tutorial costs $150 USD, which is a steal for what our instructors provide with these hands-on courses and the materials you'll get out of them. You can register for the conference and add tutorials to your existing registration profile at any time.

Accepted Talks

Over on the conference talks end of the program committee, they've recently chosen the list of talks that will make up the schedule! Work is underway to fit each of those talks into schedule format, but for now, the list of accepted proposals is available here.

December 16, 2014 03:37 PM


Wingware News

Wing IDE 5.1 beta1: December 16, 2014

This first beta release of Wing 5.1 adds multi-process debugging and optional automatic child process debugging, syntax highlighting and data value tooltips in the Python Shell and Debug Probe, support for Django 1.7, and about 55 minor features and bug fixes.

December 16, 2014 01:00 AM


Thomas Guest

Why zip when you can map?

You’ve got a couple of parallel lists you’d like to combine and output, a line for each pair. Here’s one way to do it: use zip to do the combining.

>>> times = [42.12, 42.28, 42.34, 42.40, 42.45]
>>> names = ['Hickman', 'Guest', 'Burns', 'Williams']
>>> fmt = '{:20} {:.2f}'.format
>>> print('\n'.join(fmt(n, t) for n, t in zip(names, times)))
Hickman              42.12
Guest                42.28
Burns                42.34
Williams             42.40

Slightly more succinctly:

>>> print('\n'.join(fmt(*nt) for nt in zip(names, times)))
...

If you look at the generator expression passed into str.join, you can see we’re just mapping fmt to the zipped names and times lists.

Well, sort of.

>>> print('\n'.join(map(fmt, zip(names, times))))
Traceback (most recent call last):
...
IndexError: tuple index out of range

To fix this, we could use itertools.starmap which effectively unpacks the zipped pairs.

>>> from itertools import starmap
>>> print('\n'.join(starmap(fmt, zip(names, times))))
Hickman              42.12
Guest                42.28
Burns                42.34
Williams             42.40

This latest version looks clean enough but there’s something odd about zipping two lists together only to unpack the resulting 2-tuples for consumption by the format function.

Don’t forget, map happily accepts more than one sequence! There’s no need to zip after all.

Don’t zip, map!
>>> print('\n'.join(map(fmt, names, times)))
...

December 16, 2014 12:00 AM

December 15, 2014


Wes Mason

Snappy and "ImportError: No module named 'click.repository'"

If you're playing around with Snappy and Ubuntu Core, the new transaction-based packaging platform in town, and receive the following error:

ImportError: No module named 'click.repository'
It's most likely because you also have the ubuntu-sdk PPA installed, and a new version of python3-click has been pushed up that doesn't include the repository submodule that Snappy uses. I found unravelling the two projects' dependencies to be non-trivial at the moment, so here's an alternative quick fix:
  1. sudo apt purge -y python3-click
  2. Create a new file at /etc/apt/preferences.d/snappy-ppa-1500, with these contents (telling apt that we prefer duplicate packages from the snappy-dev PPA).
  3. sudo apt update
  4. sudo apt install -y --force-yes snappy-tools
For the time being this should select the packages we need from the snappy-dev PPA, giving us the right python3-click, complete with the repository submodule. The other alternative is to confine your Snappy experiments to an LXC container or a VM; or completely purge both PPAs and all the packages from them and reinstall just Snappy, if you don't care about Ubuntu Touch or regular click package development right now. I'm sure the conflict will be fixed soon though. :-)

December 15, 2014 11:26 PM


Enthought

Plotting in Excel with PyXLL and Matplotlib

Author: Tony Roberts, creator of PyXLL, a Python library that makes it possible to write add-ins for Microsoft Excel in Python. Download a FREE 30 day trial of PyXLL here.


Python has a broad range of tools for data analysis and visualization. While Excel is able to produce various types of plots, sometimes it’s either not quite good enough or it’s just preferable to use matplotlib.

Users already familiar with matplotlib will be aware that when a plot is shown from a Python script, the script stops while the plot is shown and continues once the user has closed it. When doing the same from an IPython console, control returns to the IPython prompt immediately after the plot is shown, which is useful for interactive development.

Something that has been asked a couple of times is how to use matplotlib within Excel using PyXLL. As matplotlib is just a Python package like any other it can be imported and used in the same way as from any Python script. The difficulty is that when showing a plot the call to matplotlib blocks and so control isn’t returned to Excel until the user closes the window.

This blog shows how to plot data from Excel using matplotlib and PyXLL so that Excel can continue to be used while a plot window is active, and so that same window can be updated whenever the data in Excel is updated.

Basic plotting

Matplotlib can plot just about anything you can imagine! For this blog I’ll be using only a very simple plot to illustrate how it can be done in Excel. There are examples of hundreds of other types of plots on the matplotlib website that can all be used in exactly the same way as this example in Excel.

To start off we’ll write a simple function that takes two columns of data (our x and y values), calculates the exponentially weighted moving average (EWMA) of the y values, and then plot them together as a line plot.

Note that our function could take a pandas dataframe or series quite easily, but just to keep things as simple as possible I’ll stick to plain numpy arrays. To see how to use pandas datatypes with PyXLL see the pandas examples on github: https://github.com/pyxll/pyxll-examples/tree/master/pandas.

from pyxll import xl_func
from pandas.stats.moments import ewma
import matplotlib.pyplot as plt

@xl_func("numpy_column<float> xs, "
         "numpy_column<float> ys, "
         "int span: string")
def mpl_plot_ewma(xs, ys, span):
    # calculate the moving average
    ewma_ys = ewma(ys, span=span)

    # plot the data
    plt.plot(xs, ys, alpha=0.4, label="Raw")
    plt.plot(xs, ewma_ys, label="EWMA")
    plt.legend()

    # show the plot
    plt.show()

    return "Done!"

To add this code to Excel, save it to a Python file and add it to the pyxll.cfg file (see the PyXLL documentation for details).

Calling this function from Excel brings up a matplotlib window with the expected plot. However, Excel won’t respond to any user input until the window is closed, because the plt.show() call blocks.

(Screenshot: the blocking matplotlib plot window in front of Excel)

The unsmoothed data is generated with the Excel formula =SIN(B9)+SIN(B9*10)/3+SIN(B9*100)/7. This could just as easily be data retrieved from a database or the output from another calculation.

Non-blocking plotting

Matplotlib has several backends which enable it to be used with different UI toolkits.

Qt is a popular UI toolkit with Python bindings, one of which is PySide. Matplotlib supports this as a backend, and we can use it to show plots in Excel without using the blocking call plt.show(). This means we can show the plot and continue to use Excel while the plot window is open.

In order to make a Qt application work inside Excel, it needs to be polled periodically from the main Windows message loop so that it responds to user input without blocking the Excel process or stopping Excel from receiving input. Using the Windows ‘timer’ module is an easy way to do this; it also has the advantage that it keeps all the UI code in the same thread as Excel’s main window loop, which keeps things simple.

from PySide import QtCore, QtGui
import timer

def get_qt_app():
    """
    returns the global QtGui.QApplication instance and starts
    the event loop if necessary.
    """
    app = QtCore.QCoreApplication.instance()
    if app is None:
        # create a new application
        app = QtGui.QApplication([])

        # use timer to process events periodically
        processing_events = {}
        def qt_timer_callback(timer_id, time):
            if timer_id in processing_events:
                return
            processing_events[timer_id] = True
            try:
                app = QtCore.QCoreApplication.instance()
                if app is not None:
                    app.processEvents(QtCore.QEventLoop.AllEvents, 300)
            finally:
                del processing_events[timer_id]

        timer.set_timer(100, qt_timer_callback)

    return app

This can be used to embed any Qt windows and dialogs in Excel, not just matplotlib windows.
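
For example, a hypothetical Excel-callable function of mine could pop up any Qt widget the same way, reusing get_qt_app from above (the function name and signature are just for illustration, not from the original post):

# keep a reference to the widgets so they aren't garbage collected
_hello_windows = {}

@xl_func("string title: string")
def show_hello_window(title):
    get_qt_app()  # make sure the Qt application exists and is being polled
    window = QtGui.QLabel("Hello from Qt inside Excel!")
    window.setWindowTitle(title)
    window.resize(300, 100)
    window.show()  # non-blocking; control returns to Excel immediately
    _hello_windows[title] = window
    return "[Shown '%s']" % title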

Now all that’s left is to update the plotting function to plot to a Qt window instead of calling pyplot.show(). We can also give each plot a name, so that when the data in Excel changes and our plotting function gets called again, it re-plots to the same window instead of creating a new one each time.

from matplotlib.figure import Figure
from matplotlib.backends.backend_qt4agg import FigureCanvasQTAgg as FigureCanvas
from matplotlib.backends.backend_qt4agg import NavigationToolbar2QT as NavigationToolbar

# dict to keep track of any plot windows
_plot_windows = {}

@xl_func("string figname, "
         "numpy_column<float> xs, "
         "numpy_column<float> ys, "
         "int span: string")
def mpl_plot_ewma(figname, xs, ys, span):
    """
    Show a matplotlib line plot of xs vs ys and ewma(ys, span)
    in an interactive window.

    :param figname: name to use for this plot's window
    :param xs: list of x values as a column
    :param ys: list of y values as a column
    :param span: ewma span
    """
    # create the figure and axes for the plot
    fig = Figure(figsize=(600, 600), dpi=72, facecolor=(1, 1, 1), edgecolor=(0, 0, 0))
    ax = fig.add_subplot(111)

    # calculate the moving average
    ewma_ys = ewma(ys, span=span)

    # plot the data
    ax.plot(xs, ys, alpha=0.4, label="Raw")
    ax.plot(xs, ewma_ys, label="EWMA")
    ax.legend()

    # Get the Qt app.
    # Note: no need to 'exec' this as it will be polled in the main windows loop.
    app = get_qt_app()

    # generate the canvas to display the plot
    canvas = FigureCanvas(fig)

    # Get or create the Qt windows to show the chart in.
    if figname in _plot_windows:
        # get the existing window from the global dict and
        # clear any previous widgets
        window = _plot_windows[figname]
        layout = window.layout()
        if layout:
            for i in reversed(range(layout.count())):
                layout.itemAt(i).widget().setParent(None)
    else:
        # create a new window for this plot and store it for next time
        window = QtGui.QWidget()
        window.resize(800, 600)
        window.setWindowTitle(figname)
        _plot_windows[figname] = window

    # create the navigation toolbar
    toolbar = NavigationToolbar(canvas, window)

    # add the canvas and toolbar to the window
    layout = window.layout() or QtGui.QVBoxLayout()
    layout.addWidget(canvas)
    layout.addWidget(toolbar)
    window.setLayout(layout)

    # showing the window won't block
    window.show()

    return "[Plotted '%s']" % figname

(Screenshot: the non-blocking matplotlib plot window alongside Excel)

When the function’s called it brings up the plot in a new window and control returns immediately to Excel. The plot window can be interacted with and Excel still responds to user input in the usual way.

When the data in the spreadsheet changes the plot function is called again and it redraws the plot in the same window.

Next steps

The code above could be refined and the code for creating, fetching and clearing the windows could be refactored into some reusable utility code. It was presented in a single function for clarity.

Plotting to a separate window from Excel is sometimes useful, especially as the interactive controls can be used and may be incorporated into other Qt dialogs. However, sometimes it’s nicer to be able to present a graph in Excel as a control in the Excel grid in the same way the native Excel charts work. This is possible using PyXLL and matplotlib and will be the subject of the next blog!

All the code from this blog is available on github https://github.com/pyxll/pyxll-examples/tree/master/matplotlib.

Additional Resources:

December 15, 2014 07:27 PM


Philippe Normand

Web Engines Hackfest 2014

Last week I attended the Web Engines Hackfest. The event was sponsored by Igalia (also hosting the event), Adobe and Collabora.

As usual I spent most of the time working on the WebKitGTK+ GStreamer backend, and Sebastian Dröge kindly joined and helped out quite a bit; make sure to read his post about the event!

We first worked on the WebAudio GStreamer backend, Sebastian cleaned up various parts of the code, including the playback pipeline and the source element we use to bridge the WebCore AudioBus with the playback pipeline. On my side I finished the AudioSourceProvider patch that was abandoned for a few months (years) in Bugzilla. It’s an interesting feature to have so that web apps can use the WebAudio API with raw audio coming from Media elements.

I also hacked on GstGL support for video rendering. It’s quite interesting to be able to share the GL context of WebKit with GStreamer! The patch is not ready yet for landing but thanks to the reviews from Sebastian, Mathew Waters and Julien Isorce I’ll improve it and hopefully commit it soon in WebKit ToT.

Sebastian also worked on Media Source Extensions support. We had a very basic, non-working, backend that required… a rewrite, basically :) I hope we will have this reworked backend soon in trunk. Sebastian already has it working on Youtube!

The event was interesting in general, with discussions about rendering engines, rendering and JavaScript.

December 15, 2014 04:51 PM


Django Weblog

Announcing a redesign of the Django websites

The Django project is excited to announce that after many years, we're launching a redesign of our primary website, our documentation site and our issue tracker.

Django's website has been largely unchanged since the project was launched back in 2005, so you can imagine how excited we are to update it. The original design was created by Wilson Miner while he was working at the Lawrence Journal-World, the newspaper at which Django was created. Wilson's design has held up incredibly well over the years, but new design aesthetics and technologies such as mobile devices, web fonts, HTML5 and CSS3 have drastically changed the way websites are built.

The old design was also focused on introducing a new web framework to the world. Django is now a well-established framework, so the website has a much broader audience -- not just new Django users, but established users, managers, and people new to programming. This redesign also allows us to shine a spotlight on areas of the community that have historically been hidden, such as the Django Software Foundation (DSF), the community of projects that support Django developers (such as people.djangoproject.com and djangopackages.com), and the various educational and consulting resources that exist in our community.

This redesign is the result of multiple attempts and the collaboration of a number of groups and individuals. Work on the redesign started in 2010. Initially, a number of people (including Christian Metts and Julien Phalip) tried to produce a new design as individual efforts; however, these efforts stalled due to a lack of momentum. In 2012, the DSF developed a design brief and put out a call for a volunteer team to redesign the site. The DSF received a number of applicants and selected interactive agency Threespot to complete the design task. For a number of reasons (almost entirely the DSF's fault), this design got most of the way to completion, but never quite reached 100%.

Earlier this year, Andrew McCarthy took on the task of completing the design work, including a style guide for future expansions. The design was then handed over to the DSF's website working group to convert it into working code.

Since everyone on this team is a volunteer, we'd like to name them individually: Adrian Holovaty, Audrey Roy, Aymeric Augustin, Baptiste Mispelon, Daniel Roy Greenfeld, Elena Williams, Jannis Leidel, Ola Sitarska, Ola Sendecka, Russell Keith-Magee, Tomek Paczkowski and Trey Hunner. One of the DSF's current fellows, Tim Graham, also helped by finding bugs and reviewing tickets. Of course we couldn't have done it without the backing of the DSF board of directors over the years.

Now we'd like to invite you to share in the result of our efforts and help us make it even better. Please test-drive the site and let us know what you think.

If you find a bug -- which we're sure some will -- open a ticket on the website's issue tracker. If you want to contribute directly to the site's code please don't hesitate to join us on Freenode in the channel #django-websites.

We also wouldn't mind if you'd tell us about your experience on Twitter using the hashtag #10YearsLater, or by tweeting at @djangoproject.

So now, without further ado, please check out the new site djangoproject.com, the documentation docs.djangoproject.com and our issue tracker code.djangoproject.com.

That's all for now. Happy coding, everyone!

...and we'll see you all again in 2023 when we launch our next redesign :-)

December 15, 2014 04:08 PM