
Planet Python

Last update: December 12, 2017 04:47 PM

December 12, 2017

Kushal Das

Qubes OS 4.0rc3 and latest UEFI systems

Last week I received a new laptop, which I am going to use as my primary workstation. The first step was to install Qubes OS 4.0rc3 on it. It is a Thinkpad T470 with 32GB RAM and an SSD.

How to install Qubes on the latest UEFI systems?

A few weeks back, a patch was merged into the official Qubes documentation that explains in clear steps how to create a bootable USB drive on a Fedora system using livecd-tools. Please follow the guide to create a USB drive that will work on these latest machines. Simply writing the image with dd will not work.

First step after installing Qubes

I upgraded the dom0 to the current testing packages using the following command.

$ sudo qubes-dom0-update --enablerepo=qubes-dom0-current-testing

I also installed the Fedora 26 template on my system using the next command. One important point to remember is that Fedora 25 reaches end of life today, so it is better to use the updated version of the distribution :)

$ sudo qubes-dom0-update qubes-template-fedora-26

Another important thing happened in the last two weeks: I was at the Freedom of the Press Foundation office in San Francisco. That means I not only managed to meet my amazing team, but also met many of my personal heroes on this trip. I may write a separate blog post about that later. For now I can say that I managed to sit near Micah Lee for 2 weeks and learn a ton about various things, including his Qubes workflow. The following two things were the first changes I made to my installation (with his guidance) to make things work properly.

How to modify the copy-paste between domains shortcuts?

Generally Ctrl+Shift+c and Ctrl+Shift+v are used to copy-paste securely between different domains. But those are also the shortcuts to copy-paste from the terminal on all systems, so changing them to a different key combination is very helpful for muscle memory :)

Modify the following lines in the /etc/qubes/guid.conf file in dom0. I rebooted after that to make sure that I am using the new key combination.

secure_copy_sequence = "Mod-c";
secure_paste_sequence = "Mod-v";

The above configuration will modify the copy paste shortcuts to Windows+c and Windows+v in my keyboard layout.

Fixing the wireless driver issue in suspend/resume

I also found that after suspending the system and waking it up again, the wireless device was missing from the sys-net domain. Adding the following two modules to the /rw/config/suspend-module-blacklist file in the sys-net domain fixed that.


The official documentation has a section on the same.

You can follow my posts on Qubes OS here.

December 12, 2017 03:19 PM


Python Education Summit celebrates its 6th year in 2018

Teachers, educators, and Python users: come and share your projects, experiences, and tools of the trade you use to teach coding and Python to your students. The Annual Python Education Summit is held in conjunction with PyCon 2018, taking place on Thursday May 10. Our Call for Proposals is open until January 3rd, and we want to hear from you! See for more details.

What we look for in Education Summit talks are ideas, experiences, and best practices on how teachers and programmers have implemented instruction in their schools, communities, books, tutorials, and other places of learning by using Python.

We urge anyone in this space to submit a talk! We’re looking for people who want to share their knowledge and leverage the experience of their peers in bettering the education fields around Python. You do not need to be an experienced speaker to apply!

This year, talks that focus on the challenges and triumphs of implementing programming education are especially encouraged.

About the Python Education Summit

In 2018, the focus will be on bringing educators from diverse categories, and their experiences, to the forefront. Building on last year’s successful participation by young coders, this year we again urge young programmers to submit to speak about their learning experiences or present a demonstration of their coding projects.

We hope to see you at the Education Summit in 2018! January 3 is the deadline for submissions, so pen down your thoughts and ideas and submit them to us in your dashboard at For more information about the summit, see

Registration for the Education Summit will open in January 2018. A formal announcement will be made via @pycon and here on the PyCon blog.

Be on the lookout for more details and we hope to see you there!

Written by Meenal Pant
Edited by Brian Curtin

December 12, 2017 10:55 AM


Python Arrays

In this article, you’ll learn about Python arrays. Before getting started, you should be familiar with Python, variables and datatypes.

December 12, 2017 08:04 AM

Python Matrix

In this article we will learn about Python matrices: how they are created, how to slice a matrix, and how to add or remove elements.

December 12, 2017 07:52 AM

PyCon Pune

Welcome Brett Cannon, Our First Keynote Speaker

PyCon Pune 2018 is thrilled to welcome Brett Cannon, our first keynote speaker. “Came for the language and stayed for the community” - our motto is the quote that actually gave rise to the idea of having PyCon Pune. The line was used at two different Python conferences in India in 2017. So apart from Python itself, the love for this quote also binds the Python community in India. Now is our chance to meet the person behind that thought, at PyCon Pune 2018.

December 12, 2017 07:44 AM


Python feeding

I need an answer to this by someone with a bit more experience than me. Thank you.

December 12, 2017 06:38 AM

Mike Driscoll

Flask 101: Adding a Database

Last time we learned how to get Flask set up. In this article we will learn how to add a database to our music data website. As you might recall, Flask is a micro-web-framework. That means it doesn’t come with an Object Relational Mapper (ORM) like Django does. If you want to add database interactivity, then you need to add it yourself or install an extension. I personally like SQLAlchemy, so I thought it was nice that there is a ready-made extension for adding SQLAlchemy to Flask called Flask-SQLAlchemy.

To install Flask-SQLAlchemy, you just need to use pip. Make sure that you are in your activated virtual environment that we created in the first part of this series before you run the following or you’ll end up installing the extension to your base Python instead of your virtual environment:

pip install flask-sqlalchemy

Now that we have Flask-SQLAlchemy installed along with its dependencies, we can get started creating a database!

Creating a Database

Creating a database with SQLAlchemy is actually pretty easy. SQLAlchemy supports a couple of different ways of working with a database. My favorite is its declarative syntax, which lets you create classes that model the database itself, so I will use that for this example. We will also be using SQLite as our backend, although we could easily change that to something else, such as MySQL or Postgres, if we wanted to.

To start out, we will look at how you create the database file using just normal SQLAlchemy. Then we will create a separate script that uses the slightly different Flask-SQLAlchemy syntax. Put the following code into a file called

from sqlalchemy import create_engine, ForeignKey
from sqlalchemy import Column, Date, Integer, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship, backref

# echo=True logs every SQL statement the engine emits
engine = create_engine('sqlite:///mymusic.db', echo=True)
Base = declarative_base()


class Artist(Base):
    __tablename__ = "artists"

    id = Column(Integer, primary_key=True)
    name = Column(String)

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return "<Artist: {}>".format(self.name)


class Album(Base):
    __tablename__ = "albums"

    id = Column(Integer, primary_key=True)
    title = Column(String)
    release_date = Column(Date)
    publisher = Column(String)
    media_type = Column(String)
    # each album points at the artist row it belongs to
    artist_id = Column(Integer, ForeignKey("artists.id"))
    artist = relationship("Artist", backref=backref(
        "albums", order_by=id))

    def __init__(self, title, release_date, publisher, media_type):
        self.title = title
        self.release_date = release_date
        self.publisher = publisher
        self.media_type = media_type

# create tables
Base.metadata.create_all(engine)

The first part of this code should look pretty familiar to anyone using Python as all we are doing here is importing the bits and pieces we need from SQLAlchemy to make the rest of the code work. Then we create SQLAlchemy’s engine object, which basically connects Python to the database of choice. In this case, we are connecting to SQLite and creating a file instead of creating the database in memory. We also create a “base class” that we can use to create declarative class definitions that actually define our database tables.

The next two classes define the tables we care about, namely Artist and Album. You will note that we name the table via the __tablename__ class attribute. We also create the table’s columns and set their data types to whatever we need. The Album class is a bit more complex since we set up a ForeignKey relationship with the Artist table. You can read more about how this works in my old SQLAlchemy tutorial or if you want the in-depth details, then check out the well written documentation.

When you run the code above, you should get something like this in your terminal:

2017-12-08 18:36:43,290 INFO sqlalchemy.engine.base.Engine SELECT CAST('test plain returns' AS VARCHAR(60)) AS anon_1
2017-12-08 18:36:43,291 INFO sqlalchemy.engine.base.Engine ()
2017-12-08 18:36:43,292 INFO sqlalchemy.engine.base.Engine SELECT CAST('test unicode returns' AS VARCHAR(60)) AS anon_1
2017-12-08 18:36:43,292 INFO sqlalchemy.engine.base.Engine ()
2017-12-08 18:36:43,294 INFO sqlalchemy.engine.base.Engine PRAGMA table_info("artists")
2017-12-08 18:36:43,294 INFO sqlalchemy.engine.base.Engine ()
2017-12-08 18:36:43,295 INFO sqlalchemy.engine.base.Engine PRAGMA table_info("albums")
2017-12-08 18:36:43,295 INFO sqlalchemy.engine.base.Engine ()
2017-12-08 18:36:43,296 INFO sqlalchemy.engine.base.Engine 
CREATE TABLE artists (
	name VARCHAR, 
2017-12-08 18:36:43,296 INFO sqlalchemy.engine.base.Engine ()
2017-12-08 18:36:43,315 INFO sqlalchemy.engine.base.Engine COMMIT
2017-12-08 18:36:43,316 INFO sqlalchemy.engine.base.Engine 
	title VARCHAR, 
	release_date DATE, 
	publisher VARCHAR, 
	media_type VARCHAR, 
	artist_id INTEGER, 
	FOREIGN KEY(artist_id) REFERENCES artists (id)
2017-12-08 18:36:43,316 INFO sqlalchemy.engine.base.Engine ()
2017-12-08 18:36:43,327 INFO sqlalchemy.engine.base.Engine COMMIT
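
Once the tables exist, the same kind of declarative classes can be used to store and query rows. The following is a minimal sketch rather than part of the original script: it uses an in-memory SQLite database so that mymusic.db is left alone, trims the Album columns, leaves out the custom __init__ methods so that SQLAlchemy's default keyword constructor applies, and uses made-up artist and album values.

```python
import datetime

from sqlalchemy import Column, Date, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import backref, relationship, sessionmaker

try:  # SQLAlchemy 1.4 and later
    from sqlalchemy.orm import declarative_base
except ImportError:  # older releases
    from sqlalchemy.ext.declarative import declarative_base

# In-memory SQLite keeps this sketch from touching mymusic.db.
engine = create_engine("sqlite://")
Base = declarative_base()


class Artist(Base):
    __tablename__ = "artists"
    id = Column(Integer, primary_key=True)
    name = Column(String)


class Album(Base):
    __tablename__ = "albums"
    id = Column(Integer, primary_key=True)
    title = Column(String)
    release_date = Column(Date)
    artist_id = Column(Integer, ForeignKey("artists.id"))
    artist = relationship("Artist", backref=backref("albums", order_by=id))


Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

# Sample data, made up for illustration.
artist = Artist(name="Example Artist")
artist.albums.append(
    Album(title="Example Album", release_date=datetime.date(2017, 12, 8))
)
session.add(artist)  # the album is saved along with its artist
session.commit()

stored = session.query(Album).filter_by(title="Example Album").one()
print(stored.artist.name)
```

The backref means the relationship can be walked in both directions: album.artist gives the owning artist, and artist.albums gives the list of albums.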

Now let’s make all this work in Flask!

Using Flask-SQLAlchemy

The first thing we need to do when we go to use Flask-SQLAlchemy is to create a simple application script. We will call it Put the following code into this file and save it to the musicdb folder.

from flask import Flask
from flask_sqlalchemy import SQLAlchemy
app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///mymusic.db'
app.secret_key = "flask rocks!"
db = SQLAlchemy(app)

Here we create our Flask app object and tell it where the SQLAlchemy database file should live. We also set up a simple secret key and create a db object which allows us to integrate SQLAlchemy into Flask. Next we need to create a file and save it into the musicdb folder. Once you have that made, add the following code to it:

from app import db


class Artist(db.Model):
    __tablename__ = "artists"

    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String)

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return "<Artist: {}>".format(self.name)


class Album(db.Model):
    __tablename__ = "albums"

    id = db.Column(db.Integer, primary_key=True)
    title = db.Column(db.String)
    release_date = db.Column(db.Date)
    publisher = db.Column(db.String)
    media_type = db.Column(db.String)
    # each album points at the artist row it belongs to
    artist_id = db.Column(db.Integer, db.ForeignKey("artists.id"))
    artist = db.relationship("Artist", backref=db.backref(
        "albums", order_by=id), lazy=True)

    def __init__(self, title, release_date, publisher, media_type):
        self.title = title
        self.release_date = release_date
        self.publisher = publisher
        self.media_type = media_type

You will note that Flask-SQLAlchemy doesn’t require all the imports that just plain SQLAlchemy required. All we need is the db object we created in our app script. Then we just pre-pend “db” to all the classes we used in the original SQLAlchemy code. You will also note that instead of creating a Base class, it is already pre-defined as db.Model.

Finally we need to create a way to initialize the database. You could put this in several different places, but I ended up creating a file I dubbed and added the following contents:

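from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker
from sqlalchemy.ext.declarative import declarative_base

engine = create_engine('sqlite:///mymusic.db', convert_unicode=True)
# a thread-local session bound to the engine above
db_session = scoped_session(sessionmaker(autocommit=False,
                                         autoflush=False,
                                         bind=engine))
Base = declarative_base()
Base.query = db_session.query_property()

def init_db():
    # import the models so they are registered before the tables are created
    import models
    Base.metadata.create_all(bind=engine)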

This code will initialize the database with the tables you created in your models script. To make the initialization happen, let’s edit our script from the previous article:

from app import app
from db_setup import init_db

# create the tables as soon as the script is loaded
init_db()

@app.route('/')
def test():
    return "Welcome to Flask!"

if __name__ == '__main__':
    app.run()
Here we just imported our app object and the init_db function. Then we called the init_db function immediately. To run this code, all you need to do is run the following command in your terminal from within the musicdb folder: flask run

When you run this, you won’t see the SQLAlchemy output that we saw earlier. Instead you will just see some information printed out stating that your Flask application is running. You will also find a mymusic.db file has been created in your musicdb folder.

Wrapping Up

At this point, you now have a web application with an empty database. You can’t add anything to the database with your web application or view anything in the database. Yes, you just created something really cool, but it’s also completely useless for your users. In the next article we will learn how to add forms to add information to our database and we will learn how to display our data too!

Download Code

Download a tarball of the code from this article: flask-musicdb-part_ii.tar


December 12, 2017 06:15 AM

Flask 101: Getting Started

The Flask 101 series is my attempt at learning the Flask microframework for Python. For those who haven’t heard of it, Flask is a micro web framework for creating web applications in Python. According to their website, Flask is based on Werkzeug, Jinja 2 and good intentions. For this series of articles, I wanted to create a web application that would do something useful without being too complicated. So for my learning sanity, I decided to create a simple web application that I can use to store information about my music library.

Over the course of multiple articles, you will see how this journey unfolded.

Getting Setup

To get started using Flask, you will need to install it. We will create a virtual environment for this series of tutorials as there will be a number of other Flask dependencies that we will need to add and most people probably don’t want to pollute their main Python installation with a lot of cruft they may not end up using. So before we install Flask, let’s create a virtual environment using virtualenv. To use virtualenv, we first need to install it with pip:

pip install virtualenv

Now that we have that installed, we can create our virtual environment. Find a location on your local system where you want to store your web application. Then open up a terminal and run the following command:

virtualenv musicdb

On Windows, you might have to give the full path to virtualenv, which is usually something like C:\Python36\Scripts\virtualenv.exe.

Note that starting in Python 3.3, you can also use Python’s built-in venv module to create a virtual environment instead of using virtualenv. Of course, the virtualenv package can be installed in Python 3, so it’s up to you which you want to use. They work in pretty much the same way.
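
If you go the venv route, the environment is usually created from the command line with python -m venv musicdb, but the module can also be driven from Python. The sketch below is only an illustration: it builds the environment in a temporary directory rather than in your project folder, and it skips pip to stay quick.

```python
import os
import tempfile
import venv

# Create the environment in a throwaway directory for illustration;
# in the tutorial the target would be the musicdb folder.
target = os.path.join(tempfile.mkdtemp(), "musicdb")

# with_pip=False keeps the sketch fast; pass with_pip=True to get pip too.
venv.create(target, with_pip=False)

# Every venv-created environment contains a pyvenv.cfg file at its root.
print(os.path.exists(os.path.join(target, "pyvenv.cfg")))
```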

Once you have your virtual environment set up, you will need to activate it. To do that, you will need to change your directory in your terminal to the folder you just created using the “cd” command:

cd musicdb

If you are on Linux or Mac OS, you should run the following:

source bin/activate

Windows is a bit different. You still need to “cd” into your folder, but the command to run is this:


For more details on activating and deactivating your virtual environment, check out the user guide.

You may have noticed that when you created your virtual environment, it copied in your Python executable as well as pip. This means that you can now install packages to your virtual environment using pip, which is the reason so many people like virtual environments. Once the virtual environment is activated, you should see that your terminal has changed to prepend the name of the virtual environment to the terminal’s prompt. Here’s an example screenshot using Python 2.7:

Now we’re ready to install Flask!

Getting Started with Flask

Flask is easy to install using the pip installer. Here’s how you can do it:

pip install flask

This command will install Flask and any of the dependencies that it needs. This is the output I received:

Collecting flask
Downloading Flask-0.12.2-py2.py3-none-any.whl (83kB)
100% |████████████████████████████████| 92kB 185kB/s
Collecting itsdangerous>=0.21 (from flask)
Downloading itsdangerous-0.24.tar.gz (46kB)
100% |████████████████████████████████| 51kB 638kB/s
Collecting Jinja2>=2.4 (from flask)
Downloading Jinja2-2.10-py2.py3-none-any.whl (126kB)
100% |████████████████████████████████| 133kB 277kB/s
Collecting Werkzeug>=0.7 (from flask)
Downloading Werkzeug-0.12.2-py2.py3-none-any.whl (312kB)
100% |████████████████████████████████| 317kB 307kB/s
Collecting click>=2.0 (from flask)
Downloading click-6.7-py2.py3-none-any.whl (71kB)
100% |████████████████████████████████| 71kB 414kB/s
Collecting MarkupSafe>=0.23 (from Jinja2>=2.4->flask)
Building wheels for collected packages: itsdangerous
Running bdist_wheel for itsdangerous ... done
Stored in directory: /home/mdriscoll/.cache/pip/wheels/fc/a8/66/24d655233c757e178d45dea2de22a04c6d92766abfb741129a
Successfully built itsdangerous
Installing collected packages: itsdangerous, MarkupSafe, Jinja2, Werkzeug, click, flask
Successfully installed Jinja2-2.10 MarkupSafe-1.0 Werkzeug-0.12.2 click-6.7 flask-0.12.2 itsdangerous-0.24

Now let’s write something simple to prove that Flask is working correctly. Save the following code in the musicdb folder that we created earlier.

from flask import Flask
app = Flask(__name__)

@app.route('/')
def test():
    return "Welcome to Flask!"

All this code does is import the Flask class and create an instance of it which we call app. Then we set up the default route for the home page (AKA root or index) of our website. This is done via the following decorator: @app.route('/'). Finally we create a function that just returns a string.
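
A quick way to confirm that the route responds, without starting the development server, is Flask's built-in test client. The sketch below repeats the small app so that it stands on its own; it is an extra check and not a step from the original walkthrough.

```python
from flask import Flask

app = Flask(__name__)


@app.route("/")
def test():
    return "Welcome to Flask!"


# The test client sends requests to the app in-process, no server needed.
client = app.test_client()
response = client.get("/")

print(response.status_code)
print(response.get_data(as_text=True))
```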

When this code is run in Flask, you will be able to navigate to your new web app’s home page and see that text. That brings us to how we run this code. In your terminal, make sure you are in your musicdb folder. Then run the following command in the terminal: flask run

When you run this command, you should see something like this in your terminal:

* Serving Flask app "test"
* Running on (Press CTRL+C to quit)

Now you just need to open up a browser, such as Chrome or Firefox, and go to the URL mentioned above: Here is what I got when I went to that URL in Chrome:

Wrapping Up

At this point you should be able to get a working version of Flask up and running. You can do some really basic web applications with just what you have right now. In the next article in this series, we will look at how to add database support to our web application.


December 12, 2017 06:05 AM

December 11, 2017


Starting D&I Conversations at Work — Notes from the DISC Unconference

December 11, 2017 08:51 PM

Weekly Python Chat

CSV Modules

This week we're going to talk about Python's csv module. We'll discuss how to use it and why it's important to use it. We'll also discuss some of the more advanced usages of the csv module.

December 11, 2017 07:30 PM

Filipe Saraiva

KDE Edu Sprint 2017

Two months ago I attended the KDE Edu Sprint 2017 in Berlin. It was my first KDE sprint (really, I have been sending code to KDE software since 2010 and had never been to a sprint!), so I was really excited about the event.

KDE Edu is the umbrella for KDE's educational software. There are a lot of applications under it, and it is the main educational software suite in the free software world. Despite that, KDE Edu has received little attention on the organizational side: for instance, the previous KDE Edu sprint took place several years ago, our website has some problems, and more.

Therefore, this sprint was an opportunity not only for developers to work on software development, but also for work on the organizational side.

On the organizational side, we discussed rebranding some software that is more related to university work than to “education” itself, like Cantor and Labplot. There was a wish to create something like a KDE Research/Science group in order to put them, and other software like Kile and KBibTex, under the same umbrella. There is an ongoing discussion about this topic.

Another topic here was a new website, oriented more toward teaching how to use KDE software in an educational context than toward presenting a set of software. In fact, I think we need to do it and strengthen the “KDE Edu brand” in order to have a specific icon+link on the KDE products page.

Next, the developers at the sprint agreed on a multi-operating-system policy for KDE Edu. KDE software can be built and distributed to users of several operating systems, not only Linux. During the sprint some developers worked on installers for Windows and Mac OS, on porting applications to Android, and on creating distribution-independent installers for Linux using flatpak.

Besides these discussions, I set up a rule that sends an e-mail to the KDE Edu mailing list for each new Differential Revision of KDE Edu software in Phabricator. Sorry devs, our mailboxes are full of e-mails because of me.

Now, on the development side, my focus was to work hard on Cantor. First, I did some task triage on our workboard, closing, opening, and adding more information to some tasks. Second, I reviewed some work done by Rishabh Gupta, my student during GSoC 2017. He ported the Lua and R backends to QProcess, and they will be available soon.

After that I worked on porting the Python 3 backend to the Python/C API. This work is in progress and I expect to finish it in time for the 18.04 release.

Of course, besides all this work we had fun with some beers and German food (and some American food and Chinese food and Arab food and Italian food as well)! I was happy because my 31st birthday fell on the first day of the sprint, so thank you KDE for coming to my birthday party full of code and good beers and pork dishes. 🙂

To finish, it is always a pleasure to meet gearheads like my Spanish friends Albert and Aleix; Timothée, the only other Mageia user I have met in person; my GSoC student Rishabh; my Brazilian brother Sandro; and my new friends Sanjiban and David.

Thank you KDE e.V. for providing resources for the sprint, and thank you Endocode for hosting it.

December 11, 2017 03:22 PM

Real Python

Building a Simple Web App with Bottle, SQLAlchemy, and the Twitter API


This is a guest blog post by Bob Belderbos. Bob is a driven Pythonista working as a software developer at Oracle. He is also co-founder of PyBites, a Python blog featuring code challenges, articles, and news. Bob is passionate about automation, data, web development, code quality, and mentoring other developers.

Last October we challenged our PyBites’ audience to make a web app to better navigate the Daily Python Tip feed. In this article, I’ll share what I built and learned along the way.

In this article you will learn:

  1. How to clone the project repo and set up the app.
  2. How to use the Twitter API via the Tweepy module to load in the tweets.
  3. How to use SQLAlchemy to store and manage the data (tips and hashtags).
  4. How to build a simple web app with Bottle, a micro web-framework similar to Flask.
  5. How to use the pytest framework to add tests.
  6. How Better Code Hub’s guidance led to more maintainable code.

If you want to follow along, reading the code in detail (and possibly contribute), I suggest you fork the repo. Let’s get started.

Project Setup

First, namespaces are one honking great idea, so let’s do our work in a virtual environment. Using Anaconda I create it like so:

$ virtualenv -p <path-to-python-to-use> ~/virtualenvs/pytip

Create a production and a test database in Postgres:

$ psql
psql (9.6.5, server 9.6.2)
Type "help" for help.

# create database pytip;
# create database pytip_test;

We’ll need credentials to connect to the database and the Twitter API (create a new app first). As per best practice, configuration should be stored in the environment, not the code. Put the following env variables at the end of ~/virtualenvs/pytip/bin/activate, the script that handles activation / deactivation of your virtual environment, making sure to update the variables for your environment:

export DATABASE_URL='postgres://postgres:password@localhost:5432/pytip'
# twitter
export CONSUMER_KEY='xyz'
export CONSUMER_SECRET='xyz'
export ACCESS_TOKEN='xyz'
export ACCESS_SECRET='xyz'
# if deploying it set this to 'heroku'
export APP_LOCATION=local
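
On the Python side, these variables can be read back through os.environ. The following is a minimal sketch of that idea and not the project's actual code: load_settings and REQUIRED are hypothetical names, and defaulting APP_LOCATION to 'local' is an assumption.

```python
import os

# Variables exported from the activate script above.
REQUIRED = ("DATABASE_URL", "CONSUMER_KEY", "CONSUMER_SECRET",
            "ACCESS_TOKEN", "ACCESS_SECRET")


def load_settings(environ=os.environ):
    """Collect the settings, failing early if a variable is missing."""
    missing = [name for name in REQUIRED if name not in environ]
    if missing:
        raise RuntimeError(
            "missing environment variables: " + ", ".join(missing))
    settings = {name: environ[name] for name in REQUIRED}
    # APP_LOCATION is treated as optional; 'local' as default is an assumption.
    settings["APP_LOCATION"] = environ.get("APP_LOCATION", "local")
    return settings


# Demo with a plain dict standing in for the real environment.
fake_env = {name: "xyz" for name in REQUIRED}
print(load_settings(fake_env)["APP_LOCATION"])
```

Failing at startup when a variable is missing tends to be easier to debug than a failed database or API call later on.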

In the deactivate function of the same script, I unset them so we keep things out of the shell scope when deactivating (leaving) the virtual environment:


Now is a good time to activate the virtual environment:

$ source ~/virtualenvs/pytip/bin/activate

Clone the repo and, with the virtual environment enabled, install the requirements:

$ git clone && cd pytip
$ pip install -r requirements.txt

Next, we import the collection of tweets with:

$ python tasks/

Then, verify that the tables were created and the tweets were added:

$ psql

\c pytip

pytip=# \dt
          List of relations
 Schema |   Name   | Type  |  Owner
 public | hashtags | table | postgres
 public | tips     | table | postgres
(2 rows)

pytip=# select count(*) from tips;
(1 row)

pytip=# select count(*) from hashtags;
(1 row)

pytip=# \q

Now let’s run the tests:

$ pytest
========================== test session starts ==========================
platform darwin -- Python 3.6.2, pytest-3.2.3, py-1.4.34, pluggy-0.4.0
rootdir: realpython/pytip, inifile:
collected 5 items

tests/ .
tests/ ....

========================== 5 passed in 0.61 seconds ==========================

And lastly run the Bottle app with:

$ python

Browse to http://localhost:8080 and voilà: you should see the tips sorted descending on popularity. Clicking on a hashtag link at the left, or using the search box, you can easily filter them. Here we see the pandas tips for example:


I made the design with MUI, a lightweight CSS framework that follows Google’s Material Design guidelines.

Implementation Details

The DB and SQLAlchemy

I used SQLAlchemy to interface with the DB to avoid having to write a lot of (redundant) SQL.

In tips/, we define our models – Hashtag and Tip – that SQLAlchemy will map to DB tables:

from sqlalchemy import Column, Sequence, Integer, String, DateTime
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Hashtag(Base):
    __tablename__ = 'hashtags'
    id = Column(Integer, Sequence('id_seq'), primary_key=True)
    name = Column(String(20))
    count = Column(Integer)

    def __repr__(self):
        return "<Hashtag('%s', '%d')>" % (self.name, self.count)

class Tip(Base):
    __tablename__ = 'tips'
    id = Column(Integer, Sequence('id_seq'), primary_key=True)
    tweetid = Column(String(22))
    text = Column(String(300))
    created = Column(DateTime)
    likes = Column(Integer)
    retweets = Column(Integer)

    def __repr__(self):
        return "<Tip('%d', '%s')>" % (self.id, self.text)

In tips/, we import these models, and now it’s easy to work with the DB, for example to interface with the Hashtag model:

def get_hashtags():
    return session.query(Hashtag).order_by(


def add_hashtags(hashtags_cnt):
    for tag, count in hashtags_cnt.items():
        session.add(Hashtag(name=tag, count=count))

Query the Twitter API

We need to retrieve the data from Twitter. For that, I created tasks/ I packaged this under tasks because it should be run in a daily cronjob to look for new tips and update stats (number of likes and retweets) on existing tweets. For the sake of simplicity I have the tables recreated daily. If we start to rely on FK relations with other tables we should definitely choose update statements over delete+add.

We used this script in the Project Setup. Let’s see what it does in more detail.

First, we create an API session object which we pass to tweepy.Cursor. This feature of the API is really nice: it deals with pagination, iterating through the timeline. For the amount of tips – 222 at the time I write this – it’s really fast. The exclude_replies=True and include_rts=False arguments are convenient because we only want Daily Python Tip’s own tweets (not re-tweets).

Extracting hashtags from the tips requires very little code.

First, I defined a regex for a tag:

TAG = re.compile(r'#([a-z0-9]{3,})')

Then, I used findall to get all tags.

I passed them to collections.Counter, which returns a dict-like object with the tags as keys and counts as values; its most_common() method lists them in descending order of count. I excluded the overly common python tag, which would skew the results.

def get_hashtag_counter(tips):
    blob = ' '.join(t.text.lower() for t in tips)
    cnt = Counter(TAG.findall(blob))

    # drop the overly common 'python' tag so it does not skew the counts
    cnt.pop('python', None)

    return cnt
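
To see the regex and the counter working together, here is a self-contained run on three made-up tips. The Tip namedtuple is a stand-in for the real tip objects (only the text attribute matters), and the sample texts are invented for illustration.

```python
import re
from collections import Counter, namedtuple

TAG = re.compile(r'#([a-z0-9]{3,})')

# Stand-in for the tip objects; only the .text attribute is used.
Tip = namedtuple("Tip", "text")


def get_hashtag_counter(tips):
    # Lowercase first, because the regex only matches a-z and 0-9.
    blob = ' '.join(t.text.lower() for t in tips)
    cnt = Counter(TAG.findall(blob))
    cnt.pop('python', None)
    return cnt


tips = [
    Tip("Use #pandas to read CSV files #Python"),
    Tip("Group rows with #pandas groupby #python"),
    Tip("Vectorize loops with #numpy #python"),
]

counter = get_hashtag_counter(tips)
print(counter.most_common())  # [('pandas', 2), ('numpy', 1)]
```

Note that the {3,} quantifier means tags shorter than three characters are ignored.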

Finally, the import_* functions in tasks/ do the actual import of the tweets and hashtags, calling add_* DB methods of the tips directory/package.

Make a Simple web app with Bottle

With this pre-work done, making a web app is surprisingly easy (or not so surprising if you used Flask before).

First of all meet Bottle:

Bottle is a fast, simple and lightweight WSGI micro web-framework for Python. It is distributed as a single file module and has no dependencies other than the Python Standard Library.

Nice. The resulting web app comprises < 30 LOC and can be found in

For this simple app, a single method with an optional tag argument is all it takes. Similar to Flask, the routing is handled with decorators. If called with a tag it filters the tips on tag, else it shows them all. The view decorator defines the template to use. Like Flask (and Django) we return a dict for use in the template.

@route('/')
@route('/<tag>')
@view('index')
def index(tag=None):
    tag = tag or request.query.get('tag') or None
    tags = get_hashtags()
    tips = get_tips(tag)

    return {'search_tag': tag or '',
            'tags': tags,
            'tips': tips}

As per documentation, to work with static files, you add this snippet at the top, after the imports:

@route('/static/<filename:path>')
def send_static(filename):
    return static_file(filename, root='static')

Finally, we want to make sure we only run in debug mode on localhost, hence the APP_LOCATION env variable we defined in Project Setup:

if os.environ.get('APP_LOCATION') == 'heroku':
    run(host="", port=int(os.environ.get("PORT", 5000)))
else:
    run(host='localhost', port=8080, debug=True, reloader=True)
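
The environment check can also be factored into a small helper, which makes both modes easy to test without starting a server. This is only a sketch: server_options is a hypothetical name that is not part of the project, and binding to 0.0.0.0 on Heroku is an assumption.

```python
import os


def server_options(environ=os.environ):
    """Return keyword arguments for Bottle's run() (hypothetical helper)."""
    if environ.get('APP_LOCATION') == 'heroku':
        # assumption: bind to all interfaces on the port Heroku assigns
        return {'host': '0.0.0.0', 'port': int(environ.get('PORT', 5000))}
    # local development: debug mode with auto-reload
    return {'host': 'localhost', 'port': 8080, 'debug': True, 'reloader': True}
```

The app would then start with run(**server_options()).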

Bottle Templates

Bottle comes with a fast, powerful and easy to learn built-in template engine called SimpleTemplate.

In the views subdirectory I defined a header.tpl, index.tpl, and footer.tpl. For the tag cloud, I used some simple inline CSS increasing tag size by count, see header.tpl:

% for tag in tags:
  <a style="font-size: {{ tag.count/10 + 1 }}em;" href="/{{ }}">#{{ }}</a>&nbsp;&nbsp;
% end

In index.tpl we loop over the tips:

% for tip in tips:
  <div class='tip'>
    <pre>{{ !tip.text }}</pre>
    <div class="mui--text-dark-secondary"><strong>{{ tip.likes }}</strong> Likes / <strong>{{ tip.retweets }}</strong> RTs / {{ tip.created }} / <a href="{{ tip.tweetid }}" target="_blank">Share</a></div>
  </div>
% end

If you are familiar with Flask and Jinja2 this should look very familiar. Embedding Python is even easier, with less typing – (% ... vs {% ... %}).

All css, images (and JS if we’d use it) go into the static subfolder.

And that’s all there is to making a basic web app with Bottle. Once you have the data layer properly defined it’s pretty straightforward.

Add tests with pytest

Now let’s make this project a bit more robust by adding some tests. Testing the DB required a bit more digging into the pytest framework, but I ended up using the pytest.fixture decorator to set up and tear down a database with some test tweets.

Instead of calling the Twitter API, I used some static data provided in tweets.json. And, rather than using the live DB, in tips/, I check if pytest is the caller (sys.argv[0]). If so, I use the test DB. I probably will refactor this, because Bottle supports working with config files.

The hashtag part was easier to test (test_get_hashtag_counter) because I could just add some hashtags to a multiline string. No fixtures needed.
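
A fixture-free test along those lines might look like the sketch below. The Tip namedtuple is a hypothetical stand-in for the real model (only its text attribute matters here), and the sample tips are made up for illustration.

```python
import re
from collections import Counter, namedtuple

TAG = re.compile(r'#([a-z0-9]{3,})')
Tip = namedtuple('Tip', 'text')  # hypothetical stand-in for the DB model


def get_hashtag_counter(tips):
    blob = ' '.join(t.text.lower() for t in tips)
    cnt = Counter(TAG.findall(blob))
    cnt.pop('python', None)
    return cnt


def test_get_hashtag_counter():
    text = """Use #Python #itertools for lazy loops
    #pytest fixtures rock #python
    more #itertools tricks #py"""
    tips = [Tip(line) for line in text.splitlines()]
    cnt = get_hashtag_counter(tips)
    assert 'python' not in cnt          # excluded on purpose
    assert 'py' not in cnt              # shorter than three characters
    assert cnt['itertools'] == 2
    assert cnt['pytest'] == 1
    assert cnt.most_common(1) == [('itertools', 2)]
```

pytest collects the function by its test_ prefix, so no extra setup is needed.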

Code quality matters – Better Code Hub

Better Code Hub guides you in writing, well, better code. Before writing the tests the project scored a 7:

better code hub

Not bad, but we can do better:

  1. I bumped it to a 9 by making the code more modular, taking the DB logic out of the (web app), putting it in the tips folder/ package (refactorings 1 and 2)

  2. Then with the tests in place the project scored a 10:

better code hub

Conclusion and Learning

Our Code Challenge #40 offered some good practice:

  1. I built a useful app which can be expanded (I want to add an API).
  2. I used some cool modules worth exploring: Tweepy, SQLAlchemy, and Bottle.
  3. I learned some more pytest because I needed fixtures to test interaction with the DB.
  4. Above all, having to make the code testable, the app became more modular which made it easier to maintain. Better Code Hub was of great help in this process.
  5. I deployed the app to Heroku using our step-by-step guide.

We Challenge You

The best way to learn and improve your coding skills is to practice. At PyBites we solidified this concept by organizing Python code challenges. Check out our growing collection, fork the repo, and get coding!

Let us know if you build something cool by making a Pull Request of your work. We have seen folks really stretching themselves through these challenges, and so did we.

Happy coding!

Contact Info

I am Bob Belderbos from PyBites, you can reach out to me by:

December 11, 2017 02:47 PM

Possibility and Probability

Example of great documentation in Python

Documentation is one of those tasks in programming that does not get as much attention as it probably should. Great documentation is even more rare. We live in an age of unlimited free information in the form of blog posts … Continue reading

The post Example of great documentation in Python appeared first on Possibility and Probability.

December 11, 2017 02:04 PM

Doug Hellmann

subprocess — Spawning Additional Processes — PyMOTW 3

The subprocess module supports three APIs for working with processes. The run() function, added in Python 3.5, is a high-level API for running a process and optionally collecting its output. The functions call() , check_call() , and check_output() are the former high-level API, carried over from Python 2. They are still supported and widely used …
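
As a rough illustration of the difference between run() and the older check_output(), here is a minimal sketch. The child command (a one-line Python script run with the current interpreter) is an assumption, chosen only so the example works on any platform.

```python
import subprocess
import sys

cmd = [sys.executable, '-c', 'print("hello from the child")']

# run() returns a CompletedProcess carrying the exit code and, when a pipe
# is requested, the captured output
completed = subprocess.run(cmd, stdout=subprocess.PIPE, universal_newlines=True)
print(completed.returncode, completed.stdout.strip())

# check_output() is the older API: it returns the output directly and raises
# CalledProcessError on a non-zero exit status
output = subprocess.check_output(cmd, universal_newlines=True)
print(output.strip())
```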

December 11, 2017 02:00 PM


Developing in a VM with Vagrant and Ansible

One of the things that could make developing cloud applications hard, would be differences between the dev environment and the production environment. This is why one of the factors of the twelve factor app is maintaining dev-prod parity. Today we’ll start a blog series about developing cloud applications, and we’ll discuss how to set up a local development environment using Vagrant.

We’ll use these technologies for this application:

Today we’ll just create a simple Flask application that’ll say ‘Hello world’. In the next post in this series, we’ll introduce a larger application that we’ll deploy to AWS in a future post.

If you want to follow along at home, you can find the code from today’s blog post on GitHub. See the commit history there to see the progress from the beginning to the end.

Getting Started

So let’s create a project, and get started. If you want to follow along, you’ll need to have Vagrant, Virtualbox, and PyCharm Professional Edition installed on your computer.

Open PyCharm, and create a new pure Python project.

The first step will be to set up the Vagrant VM, and configure the necessary items. In the project folder, run vagrant init -m bento/ubuntu-16.04. You can run commands within PyCharm by opening the terminal (Alt + F12).

This generates a Vagrantfile that only contains the base box that we’re using. If we run vagrant up at this point, we’d get a plain Ubuntu server box. For our project we’ll need to install some things and expose some ports though, so let’s add this to the Vagrantfile:

Vagrant.configure("2") do |config|
  config.vm.box = "bento/ubuntu-16.04"
  config.vm.network "forwarded_port", guest: 5000, host: 5000
  config.vm.provision "ansible_local" do |a|
    a.playbook = "setup.yml"
  end
end

The ansible_local provisioner will install Ansible on the Ubuntu VM and then run it there, which means we don’t need to install Ansible on our host computer. Ansible lets us describe the desired state for a computer, and will then make the necessary changes to achieve that state. So let’s have a look at what’s necessary to install Python 3.6 on the VM.

Provisioning a VM with Ansible

Ansible works with Playbooks. These are YAML files that describe what state should be applied to what machines. Let’s create setup.yml, and try to install Python 3.6:

- hosts: all
  become: yes # This means that all tasks will be executed with sudo
  tasks:
    - name: Install Python 3.6
      apt:
        name: python3.6
        state: present
        update_cache: yes

A playbook is a list of plays on the top level. We can configure per play which hosts we want to apply it to, whether we need to become another user, and a list of tasks. In our example, we apply the play to all hosts: there’s only one host in the Vagrant setup, so that’s easy enough. We also set become to yes, which has the effect of running our tasks with sudo.

The tasks are the way we can configure the desired state of our VM. We can name our tasks to make it easier for us to see what’s going on, but Ansible doesn’t technically need it. The task we have here is just an instruction for Ansible to use the apt module, which is bundled with Ansible. We specify three options to the apt module:

This last option basically means that Ansible will run apt update before running apt install, if necessary.

If you’re thinking, isn’t this just a very hard way to write sudo apt update && sudo apt install python3.6, at this point you’re right. However, the value of Ansible is that you’re not describing actions, but you’re describing a desired state. So the second time you run Ansible, it detects Python 3.6 is already installed, and it won’t do anything. Idempotence is one of Ansible’s core principles. Another key benefit is that you can version control changes to server configuration.
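
The difference between describing actions and describing a desired state can be illustrated with a toy model in Python. This is not Ansible code: ensure_present and the set of installed packages are hypothetical stand-ins for what a module such as apt does on the VM.

```python
def ensure_present(installed, package):
    """Make sure `package` is in the `installed` set; report whether anything changed."""
    if package in installed:
        return False          # desired state already reached: nothing to do
    installed.add(package)    # stand-in for the real installation step
    return True


installed = {'python3.5'}
first_run = ensure_present(installed, 'python3.6')    # installs the package
second_run = ensure_present(installed, 'python3.6')   # no-op on the second run
print(first_run, second_run)  # prints: True False
```

Running the function any number of times leaves the same end state, which is what idempotence means here.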

So let’s run vagrant up (Ctrl+Shift+A to Find action, and then type vagrant up), and we should have a VM with Python 3.6!

Trouble in Paradise

TASK [Install Python 3.6] ******************************************************

fatal: [default]: FAILED! => {"changed": false, "msg": "No package matching 'python3.6' is available"}

       to retry, use: --limit @/vagrant/setup.retry

Unfortunately, Python 3.6 isn’t available from Ubuntu’s default package repositories. There are several ways to resolve this situation; the easiest would be to find a PPA (Personal Package Archive) which has Python 3.6.

A PPA which is mentioned in many places on the internet is Jonathon F’s PPA. So how would we go about adding this PPA using Ansible? Turns out there are two modules that can help us out here, apt_key and apt_repository. Apt_key allows us to specify the public key associated with the repository, to make sure any releases we get are really from Jonathon. And apt_repository then adds the repository to the apt configuration. So let’s add these two tasks to the playbook, before the install task (Ansible runs tasks in the order specified):

- name: Add key for jonathonf PPA
  apt_key:
    id: 4AB0F789CBA31744CC7DA76A8CF63AD3F06FC659
    state: present
- name: Add jonathonf PPA
  apt_repository:
    repo: deb xenial main
    state: present

Now run vagrant provision (or Tools | Vagrant | Provision), to rerun the playbook. After completing, we should see the summary:

PLAY RECAP *********************************************************************
default                    : ok=4    changed=3    unreachable=0    failed=0

At this point, let’s create a requirements.txt with the libraries we’ll use today, in this case, just Flask:


Most Linux distributions use the system interpreter themselves, that’s one of the reasons for virtualenvs being best practice. So let’s create a virtualenv, and then install these packages. As the python-3.6 package didn’t include pip, we’ll first need to install pip. Then, using pip, we’ll need to install virtualenv into the system interpreter. After that we’ll be able to create a new virtualenv with the requirements we specify. To do this, specify at the end of the playbook:

- name: Install pip3
  apt:
    name: python3-pip
    state: present
    update_cache: yes
- name: Install 'virtualenv' package
  pip:
    name: virtualenv
    executable: pip3
- name: Create virtualenv
  become: no
  pip:
    virtualenv: "/home/vagrant/venv"
    virtualenv_python: python3.6
    requirements: "/vagrant/requirements.txt"

First, we’re using the apt module to install pip. Then, we’re using Ansible’s pip module to install the virtualenv package. And finally we’re using the pip module again to now create the virtualenv, and then install the packages in the newly created virtualenv. Vagrant automatically mounts the project directory in the /vagrant folder in the VM, so we can refer to our requirements.txt file this way.

At this point we have our Python environment ready, and we could continue going the same way to add a database and anything else we might desire. Let’s have a look to see how we can organize our playbook further. Firstly, we’ve now hardcoded paths with ‘vagrant’, which prevents us from reusing the same playbook later on AWS. Let’s change this:

- hosts: all
  become: yes # This means that all tasks will be executed with sudo
  vars:
    venv_path: "/home/vagrant/venv"
    requirements_path: "/vagrant/requirements.txt"
  tasks:
… snip …
  - name: Create virtualenv
    become: no
    pip:
      virtualenv: "{{ venv_path }}"
      virtualenv_python: python3.6
      requirements: "{{ requirements_path }}"

The first thing we can do is define variables for these paths. If the variable syntax looks familiar, that’s because it is: Ansible is written in Python, and uses jinja2 for templating.

If we were to add database plays to the same playbook, we’re mixing things that we may want to separate later. Wouldn’t it be easier to have these Python plays somewhere we can call them, and have the database plays in another place? This is possible using Ansible roles. Let’s refactor this playbook into a Python role.

Ansible roles are essentially a folder structure with YAML files that are used to specify the things necessary for the role. To refactor our plays into a Python role, we just need to create several folders: $PROJECT_HOME/roles/python/tasks, and then place a file called main.yml in that last tasks folder. Copy the list of tasks from our playbook into that file, making sure to unindent them:

- name: Add key for jonathonf PPA
  apt_key:
    id: 4AB0F789CBA31744CC7DA76A8CF63AD3F06FC659
    state: present
... etc ...

Afterwards, specify in the playbook which role to apply:

- hosts: all
  become: yes # This means that all tasks will be executed with sudo
  vars:
    venv_path: "/home/vagrant/venv"
    requirements_path: "/vagrant/requirements.txt"
  roles:
    - {role: python}

That’s all there is to it! Run vagrant provision once more to make sure everything still applies cleanly to the VM.

Running Code from PyCharm

Now that we have a provisioned VM ready to go, let’s write some code!

First let’s set up the Python interpreter. Go to File | Settings | Project Interpreter. Then use the gear icon to select ‘Add Remote’, and choose Vagrant. PyCharm automatically detects most settings; we just need to put the path to the Python interpreter to tell PyCharm about the virtualenv we created:

Vagrant Interpreter


Now create a new script, let’s name it and add Flask’s Hello World:

from flask import Flask
app = Flask(__name__)

@app.route('/')
def hello():
    return "Hello World!"

if __name__ == '__main__':
    app.run(host='0.0.0.0', debug=True)

Make sure that you use the host='0.0.0.0' kwarg, as Flask by default only binds to localhost, and we wouldn’t be able to access our application later.

Now to create a run configuration, just navigate to the script as usual, and select ‘Single instance only’ to prevent the app not starting when the port is already in use:

Basic Flask Run Config

By marking the run configuration as ‘single instance only’ we make sure that we can’t accidentally start the script twice and get a ‘Port already in use’ error.

After saving the run configuration, just click the regular Run or Debug button, and the script should start.

Flask Running

That’s it for today! Stay tuned for the next blog post where we’ll have a look at an application where we build a REST API on top of a database.

December 11, 2017 01:56 PM

Mike Driscoll

PyDev of the Week: Anthony Tuininga

This week we welcome Anthony Tuininga as our PyDev of the Week! Anthony is the creator of the cx_Freeze library among several others in the cx Suite.  You can get a feel for what he’s currently working on over on Github. Let’s take some time to get to know Anthony better!

Can you tell us a little about yourself (hobbies, education, etc):

I grew up in a small town in the central interior of British Columbia, Canada. In spite of it being a small town, my school managed to acquire a personal computer shortly after they were made available. I was fascinated and quickly became the school guru. That experience convinced me that computers and especially programming were in my future. I moved to Edmonton, Alberta, Canada in order to attend university and ended up staying there permanently. Instead of only taking computing science courses I ended up combining them with engineering and received a computer engineering degree. After university I first worked for a small consulting firm, then for a large consulting firm and am now working for the software company, Oracle, in the database group. Besides working with computers I enjoy reading and both cross-country and downhill skiing.

Why did you start using Python?

In the late 1990’s I had developed a C++ library and set of tools to manage Oracle database objects. These worked reasonably well but they took a fair amount of time to both develop and maintain. I discovered Python and its C API and did some experimentation with what eventually became the cx_Oracle Python module. Within a few days I had sufficiently rewritten the C++ library and a couple of the tools using Python and cx_Oracle to prove to myself that Python was an excellent choice. In spite of being interpreted and theoretically slower than C++, the tools I had written in Python were actually faster, primarily due to the fact that I could use more advanced data manipulation techniques in Python with little to no effort compared to C++. I completed the rewrite in Python and continued to expand my use of Python to the point where the flagship product of the company I was working for used it extensively. Thankfully the companies I worked for saw the benefits of the open source model and I was able to make the libraries and tools I developed there available as open source. These include cx_PyGenLib, cx_PyOracleLib, cx_Oracle, cx_OracleTools, cx_OracleDBATools, cx_bsdiff, cx_Logging, ceODBC and cx_Freeze.

What other programming languages do you know and which is your favorite?

I know quite a number of languages to a limited extent as I enjoy experimenting with languages, but the languages I have used regularly over my career are C, C++, SQL, PL/SQL, HTML, JavaScript and Python. Of those, Python is my favorite. I have recently begun experimenting with Go and as a C/C++ replacement it has been a breath of fresh air. Time will tell whether I can find a good use for it, particularly since my current job requires the extensive use of C and that is unlikely to change soon.

What projects are you working on now?

During work hours I am working on a C wrapper for the Oracle Call Interface API called ODPI-C, cx_Oracle and node-oracledb. Outside of work hours I still do a bit of work on cx_Freeze.

Which Python libraries are your favorite (core or 3rd party)?

The modules I have found to be the most useful in my work have been reportlab (cross platform tool for creating PDFs programmatically), xlsxwriter (cross platform tool for creating Excel documents without requiring Excel itself) and wxPython (cross platform GUI toolkit). I have also recently been making use of the virtues of the venv module (earlier known as virtualenv) and have found it to be excellent for testing.

What was your motivation for creating the cx_Freeze package?

As mentioned earlier I had built a number of tools for managing Oracle database objects. I wanted to distribute these to others without requiring them to install Python itself. I first experimented with the freeze tool that comes with Python itself and found that it worked but wasn’t easy to use or create executables. I discovered py2exe but it was only developed for Windows and we had Linux machines on which we wanted to run these tools. So I built cx_Freeze and it worked well enough that I was able to easily distribute my tools and later full applications on both Windows and Linux and (with some help from the community) macOS. My current job doesn’t require this capability so I have not been able to spend as much time on it as I did before.


What are the top three things that you have learned while maintaining this project?

These lessons have been learned not just with cx_Freeze but also with cx_Oracle, the other well-used module I originally developed. First, code you write that works well for you will break when other people get their hands on it! Everyone thinks differently and makes mistakes differently and that becomes obvious very quickly. Second, although well-written code is the most important aspect of a project (keep in mind lesson #1), documentation, samples and test cases are nearly as important and take almost as much time to do well, and without them others will find your project considerably more difficult to use. Finally, even though additional people bring additional and possibly conflicting ideas, the project is considerably stronger and useful the more contributors there are.

Is there anything else you’d like to say?

I can’t think of anything right now!

Thanks for doing the interview!

December 11, 2017 01:30 PM


PyCharm 2017.3.1 RC

We have a couple of fixes and small improvements for you in PyCharm 2017.3.1. If you’d like to try them already, you can now get the release candidate from the confluence page.

New in this version:

To try this now, get the RC from confluence. You can also update from within PyCharm, just make sure that your update channel is set to ‘EAP’ or ‘RC’ (in Help | Check for Updates).

If you use multiple JetBrains applications, you can use JetBrains Toolbox to make sure all your JetBrains IDEs stay up to date. PyCharm is also available as a snap package. If you’re on Ubuntu 16.04 or later, you can install PyCharm by using this command:

sudo snap install [pycharm-professional|pycharm-community] --classic --candidate

December 11, 2017 10:39 AM

Full Stack Python

GitPython and New Git Tutorials

First Steps with GitPython is a quick tutorial that shows how to get started using the awesome GitPython library for programmatically interacting with Git repositories in your Python applications. In the spirit of the thank you maintainers issue ticket I wrote about last newsletter, I opened a quick "non-issue" ticket for the GitPython developers to thank them. Give them a thank you +1 if you've used the project and also found it useful.

The Git page on Full Stack Python has also just been updated with new resources. A few of my favorite new tutorials listed on the Git page are:

I also split out the Git page resources into beginner, more advanced, specific use case and workflow sections so it's easier to parse based on whether you're a Git veteran or still up-and-coming in that area of your development skills.

Got questions or comments about Full Stack Python? Send me an email or submit an issue ticket on GitHub to let me know how to improve the site as I continue to fill in the table of contents with new pages and new tutorials.

December 11, 2017 05:00 AM


PCA using Python (scikit-learn, pandas)

To understand the value of using PCA for data visualization, the first part of this tutorial post goes over a basic visualization of the IRIS dataset after applying PCA. The second part uses PCA to speed up a machine learning algorithm (logistic regression) on the MNIST dataset.

December 11, 2017 03:19 AM

December 10, 2017

Marius Gedminas

Switching to HTTPS

It’s 2017, so it’s time to make this blog HTTPS-only, with permanent redirects and HSTS headers and everything.

Apologies to any planets that might get flooded when the RSS feed changes all permalinks to https.

(P.S. Hugo is a pain, as I expected. Content-Security-Policy headers are also made of pain, so I’m skipping one for this blog right now.)

December 10, 2017 12:26 PM

Carl Trachte

Powershell Encoded Command, sqlcmd, and csv Query Output

A while back I did a post on using sqlcmd and dumping data to Excel.  At the time I was using Microsoft SQL Server's bcp (bulk copy) utility to dump data to a csv file.

Use of bcp is blocked where I am working now.  But Powershell and sqlcmd are very much available on the Windows workstations we use.  Just as with bcp, smithing text for sqlcmd input can be a little tricky, same with Powershell.  But Powershell has an EncodedCommand feature which allows you to feed input to it as a base 64 string.  This will be a quick demo of the use of this feature and output of a faux comma delimited (csv) file with data.

Disclaimer:  scripts that rely extensively on os.system() calls from Python are indeed hacky and mousetrappy.  I think the saying goes "Necessity is a mother," or something similar.  Onward.

Getting the base 64 string from the original string:

First our SQL code that queries a mock table I made in my mock database:

USE test;

SELECT testpk,
FROM testtable
ORDER BY testpk;

We will call this file selectdata.sql.

Then the call to sqlcmd/Powershell:

sqlcmd -S localhost -i .\selectdata.sql -E -h -1 -s "," -W  | Tee-Object -FilePath .\testoutput

In Python (we have to use Python 2.7 in our environment, so this is Python 2.x specific):

Python 2.7.6 (default, Nov 10 2013, 19:24:24) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import base64
>>> stringx = r'sqlcmd -S localhost -i .\selectdata.sql -E -h -1 -s "," -W | Tee-Object -FilePath .\testoutput'
>>> bytesx = stringx.encode('utf-16-le')
>>> encodedcommandx = base64.b64encode(bytesx)
>>> encodedcommandx

I had to type out my command in the Python interpreter.  When I pasted it in from GVim, it choked on the UTF encoding.

Now, Powershell:
PS C:\Users\ctrachte> $sqlcmdstring = 'sqlcmd -S localhost -i .\selectdata.sql -E -h -1 -s "," -W | Tee-Object -FilePath
PS C:\Users\ctrachte> $encodedcommand = [Convert]::ToBase64String([Text.Encoding]::Unicode.GetBytes($sqlcmdstring))
PS C:\Users\ctrachte> $encodedcommand
PS C:\Users\ctrachte>

OK, the two base 64 strings are the same, so we are good.
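
For reference, the same encoding step on Python 3 might look like the sketch below. The function names are hypothetical; the only assumption carried over from the session above is that the encoded command is base 64 over UTF-16-LE bytes, which is what [Text.Encoding]::Unicode produces on the Powershell side.

```python
import base64


def encode_command(command):
    """Encode a command string as base 64 over its UTF-16-LE bytes."""
    return base64.b64encode(command.encode('utf-16-le')).decode('ascii')


def decode_command(encoded):
    """Reverse encode_command(), handy for checking what will be executed."""
    return base64.b64decode(encoded).decode('utf-16-le')


command = r'sqlcmd -S localhost -i .\selectdata.sql -E -h -1 -s "," -W'
encoded = encode_command(command)
print(decode_command(encoded) == command)  # round trip preserves the command
```

On Python 3 the explicit decode('ascii') is needed because b64encode returns bytes rather than str.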

Command Execution from os.system() call:

import os
>>> os.system(INVOKEPOWERSHELL.format(encodedcommandx))
Changed database context to 'test'.
(2 rows affected)

And, thanks to Powershell's version of UNIX-like system's tee command, we have a faux csv file as well as output to the command line.

Stackoverflow gave me much of what I needed to know for this:


Powershell's encoded command:

sqlcmd's output to faux csv:

The UTF encoding stuff just took some trial and error and fiddling.

Thanks for stopping by.

December 10, 2017 02:18 AM

December 09, 2017

Wyatt Baldwin

Finding the Oddity in a Sequence

After reading Ned Batchelder’s Iter-tools for puzzles: oddity post yesterday, my first thought was to use itertools.groupby(). That version is the fastest I could come up with (by quite a bit actually, especially for longer sequences), but it requires sorting the iterable first, which uses additional space and won’t work with infinite sequences.
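
A groupby-based version could be sketched as follows. This is an assumption about the general approach, not the code from the gist: the name oddity_groupby and the error handling are made up for illustration, and it shows the limitations mentioned above (the input is sorted first, so it must be finite and its keys must be sortable).

```python
from itertools import groupby


def oddity_groupby(iterable, key=None):
    """Return (common_key, odd_item) using sorted() and groupby()."""
    items = sorted(iterable, key=key)            # needs a finite, sortable input
    groups = [(k, list(g)) for k, g in groupby(items, key=key)]
    if len(groups) != 2:
        raise ValueError('expected exactly two distinct keys')
    groups.sort(key=lambda pair: len(pair[1]))   # smaller group first
    (_, odd_items), (common_key, common_items) = groups
    if len(odd_items) != 1 or len(common_items) < 2:
        raise ValueError('expected one odd item among repeated common ones')
    return common_key, odd_items[0]


print(oddity_groupby([1, 1, 2, 1]))               # (1, 2)
print(oddity_groupby(['a', 'bb', 'c'], key=len))  # (1, 'bb')
```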

My next thought was to use a set to keep track of seen elements, but that requires the keys to be hashable, so I scrapped that idea.

I figured using a list to keep track of seen elements wouldn’t be too bad if the seen list was never allowed to grow beyond two elements. After playing around with this version for a while, I finally came up with something on par with Ned’s version performance wise while meeting the following objectives:

Some interesting things:

Here’s the code (the gist includes a docstring, tests, and a simplistic benchmark):

def oddity(iterable, key=None, first=False):
    seen = []
    common = []
    uncommon = []

    for item in iter(iterable):
        item_key = key(item) if key else item

        if item_key in seen:
            if item_key in common:
                if first and uncommon:
                    return item_key, uncommon[1]
            else:
                common.append(item_key)
                if len(common) == 2:
                    raise TooManyCommonValuesError

                if item_key in uncommon:
                    i = uncommon.index(item_key)
                    j = i + 2
                    uncommon[i:j] = []
        else:
            seen.append(item_key)
            if len(seen) == 3:
                raise TooManyDistinctValuesError
            uncommon.extend((item_key, item))

    if len(seen) == 0:
        raise EmptyError

    if len(common) == 0:
        raise NoCommonValueError

    if len(uncommon) == 0:
        uncommon_value = None
    else:
        uncommon_value = uncommon[1]

    return common[0], uncommon_value

December 09, 2017 11:36 PM

Kracekumar Ramaraju

Debugging Python multiprocessing program with strace

Debugging is a time-consuming and brain-draining process. It’s an essential part of learning and writing maintainable code. Every person has their own way of debugging, with their own approaches and tools. Sometimes you can view the traceback, pull the code from memory, and find a quick fix. Other times, you opt for different tricks like the print statement, a debugger, or the rubber duck method.

Debugging a multiprocessing bug in Python is hard for various reasons.

Let’s say a program reads a metadata file which contains a list of JSON files and total records. Files may have a total of 100 JSON records, but you may need only 5. In any case, the function can return first five or random five records.

Sample code

The code is just for demonstrating the workflow; production code is not as simple as the one above.

Consider a multiprocessing.Pool.starmap call with read_data as the target and 40 processes.

Let’s say there are 80 files to process. Out of 80, 5 are problematic files (the function takes forever to finish reading the data). Whatever the position of those five files, the processes handling them continue forever, while the other processes enter the sleep state after completing their tasks.

Once you know the PID of the running process, you can look at which system calls the process makes using strace -p $PID. The process in the running state was calling the read system call with the same file name again and again. The while loop went on because the file had zero records and it was the only file in the queue.

The strace output looks like the one below

You may argue that a well-placed print would solve the problem. But you won’t always have the luxury of modifying the running program or replicating the code in a local environment.

December 09, 2017 07:19 PM

Weekly Python StackOverflow Report

(ciii) stackoverflow python report

These are the ten most rated questions at Stack Overflow last week.
Between brackets: [question score / answers count]
Build date: 2017-12-09 07:59:50 GMT

  1. in operator, float("NaN") and np.nan - [16/2]
  2. Multiplying a np.int8 array with 127 yields different numpy array types depending on platform - [15/1]
  3. How to split a list on an element delimiter - [9/2]
  4. Length of the longest sub-array which consists of all '1' - [6/3]
  5. Reordering nodes in increasing order in pandas dataframe - [6/3]
  6. Calling `super()` in parent class - [6/2]
  7. Python: Modify the internal behavior of a function by using decorator - [6/1]
  8. Re-creating a python invocation - [6/1]
  9. How generate all pairs of values, from the result of a groupby, in a pandas dataframe - [5/3]
  10. Python - Confused about inheritance - [5/1]

December 09, 2017 08:00 AM

December 08, 2017


Advent of Code 2017 #8 and Python's eval

I'm having fun solving Advent of Code 2017. Problem 8 reminded me of the power of Python's eval (and before you start commenting that "eval is evil", may I remind you of this :)).

You can check the Go implementation, which doesn't have eval and needs to work harder.
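
To give an idea of how little code eval needs here, below is a minimal sketch. It is not the solution linked above: it assumes the puzzle's instruction format of "register inc/dec amount if register comparison value", the name run and the sample program are chosen for illustration, and the operator whitelist is an added precaution so that eval only ever sees two integers and a comparison.

```python
from collections import defaultdict

COMPARISONS = {'<', '<=', '>', '>=', '==', '!='}


def run(program):
    """Execute instructions such as 'b inc 5 if a > 1' and return the registers."""
    registers = defaultdict(int)  # registers start at 0
    for line in program.strip().splitlines():
        target, action, amount, _if, cond_reg, cmp_op, cond_val = line.split()
        if cmp_op not in COMPARISONS:
            raise ValueError('unexpected operator: ' + cmp_op)
        # eval() only ever sees two integers and a whitelisted operator
        if eval('{} {} {}'.format(registers[cond_reg], cmp_op, int(cond_val))):
            delta = int(amount)
            registers[target] += delta if action == 'inc' else -delta
    return registers


sample = """
b inc 5 if a > 1
a inc 1 if b < 5
c dec -10 if a >= 1
c inc -20 if c == 10
"""
registers = run(sample)
print(max(registers.values()))  # prints: 1
```

Without eval, each comparison operator would need its own branch or an entry in a lookup table, for example built from the operator module.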

December 08, 2017 07:37 PM