
Planet Python

Last update: August 15, 2020 10:47 AM UTC

August 15, 2020


Python Bitwise Operators

Learn about bitwise operators and bit manipulation in Python.

August 15, 2020 08:09 AM UTC

Python Circle

Server Access Logging in Django using middleware

Creating access logs in Django application, Logging using middleware in Django app, Creating custom middleware in Django, Server access logging in Django, Server Access Logging in Django using middleware

August 15, 2020 07:44 AM UTC

Adding Robots.txt file to Django Application

Adding robots.txt file in your Django application, Easiest way to add robots.txt file in Django, Django application robots.txt file, Why should you add robots.txt file in your Django Application,

August 15, 2020 07:44 AM UTC

August 14, 2020


Freezegun - Real Joy for Fake Dates in Python


If you've ever tested code involving dates and times in Python you've probably had to mock the datetime module. And if you've mocked the datetime module, at some point it probably mocked you back when your tests failed.

Icelandic horse mocks you

Photo by Dan Cook on Unsplash

Let's look at some problems that pop up with fake datetimes, and how freezegun can help address them.

A simple test

First, here's a snippet of date-sensitive code:

import datetime

def tomorrow():
    return datetime.date.today() + datetime.timedelta(days=1)

When we test that function, we probably want the result to be the same every day. One way to handle that is by using a fake date in the test:

import datetime
from unittest.mock import patch, Mock

import tomorrow

fake_date = Mock()
fake_date.today.return_value = datetime.date(2020, 7, 2)

@patch('datetime.date', fake_date)
def test_tomorrow():
    assert tomorrow.tomorrow() == datetime.date(2020, 7, 3)

That works, great! Now let's break it.

A catalog of failures

Let's say that during a refactor, we change this:

import datetime

def tomorrow():
    return datetime.date.today() + datetime.timedelta(days=1)

to this:

from datetime import date, timedelta

def tomorrow():
    return date.today() + timedelta(days=1)

or perhaps this:

from datetime import date as dt, timedelta as td

def tomorrow():
    return dt.today() + td(days=1)

Both changes cause the test to fail, even though there is no functional change. Why?

With from datetime import date, timedelta, the code under test gets a reference to the unpatched datetime.date at import time. By the time the test runs, its @patch has no practical effect.

Following the advice in "Where to patch", we could get the test working again by patching our own code rather than the builtin datetime module:

@patch('tomorrow.date', fake_date)
def test_tomorrow():
    assert tomorrow.tomorrow() == datetime.date(2020, 7, 3)

That wouldn't cover the second change though - we'd need to also patch tomorrow.dt for that. Just a couple examples make this test start to feel brittle and tightly coupled to the implementation.

We can do better though.

A more resilient test

We're looking for a test that can verify the behavior of our code even if subtle implementation details change. Aside from import variations, we've got different ways to fetch the current date (date.today(), datetime.now(), datetime.utcnow(), et cetera). By starting with the simplest test possible and working toward flexibility, it's possible to end up with something like this:

fake_dttm = Mock(wraps=datetime)
fake_dttm.date.today.return_value = datetime.date(2020, 7, 2)
fake_dttm.datetime.now.return_value = datetime.datetime(2020, 7, 2)
fake_dttm.datetime.utcnow.return_value = datetime.datetime(2020, 7, 2)

@patch('tomorrow.date', fake_dttm.date, create=True)
@patch('tomorrow.datetime', fake_dttm, create=True)
@patch('tomorrow.dt', fake_dttm, create=True)
def test_tomorrow():
    assert tomorrow.tomorrow() == datetime.date(2020, 7, 3)

Which feels like an uncanny valley between thoroughness and pragmatism.

For a PyBites code challenge, the tests carry an extra consideration. You write tests for an "official solution", but the tests also run against user-submitted code. So it shouldn't matter if one person uses import datetime as dt while another opts for from datetime import date as dt, timedelta as td.

There are a few different ways to tackle this. I list some references at the end of this post, but for now we'll look at the freezegun library.

Adding freezegun to our tests

Freezegun provides a freeze_time() decorator that we can use to set a fixed date for our test functions. Picking up from the last section, that helps evolve this:

fake_date = Mock()
fake_date.today.return_value = datetime.date(2020, 7, 2)

@patch('datetime.date', fake_date)
def test_tomorrow():
    assert tomorrow.tomorrow() == datetime.date(2020, 7, 3)

into this:

from freezegun import freeze_time

@freeze_time("2020-07-02")
def test_tomorrow():
    assert tomorrow.tomorrow() == datetime.date(2020, 7, 3)

There are a few neat things going on there. For one, we can use a friendly date string (courtesy of the excellent dateutil library) rather than a handcrafted date or datetime object. That gets even more useful when we're dealing with full timestamps rather than dates.

It's also useful (but less immediately clear) that freeze_time patches a number of common datetime functions under the hood, including date.today(), datetime.now(), datetime.utcnow(), and time.time().

Because of how freezegun patches those methods, we can guarantee seeing the frozen date as long as the code under test uses those builtin methods. So our tests will smoothly handle import variations like the ones that broke our hand-rolled mocks earlier.

Deep and thorough fakes

Freezegun keeps your tests simple by faking Python datetimes thoroughly. Not only does it patch methods from datetime and time, it looks at existing imports so it can be sure they're patched too. For anyone interested in the details, this section of the freezegun API code is a fine read!

If you're rolling your own fakes, you're not likely to be as thorough as freezegun. You probably don't need to be! But for at least some cases, a library like freezegun can offer more thorough tests that are also simpler and more readable.

Taking it further

For high volume, performance-sensitive tests with fake dates, libfaketime may be worth a look. Additionally, there are pytest plugins available for both freezegun and libfaketime.


This post was a pretty narrowly-focused look at some common issues that pop up when testing with fake dates. I'm not an expert on any of this stuff, but I've been inspired by folks who know it better. So if you found this post interesting, some of these resources may also be worth a look:

And since we're talking about dates and times here, I can't help including Falsehoods programmers believe about time.


Thanks to the PyBites community for inspiring this post. Notably:

And of course, thanks to Steve Pulec for creating freezegun. (While I'm at it, thanks for moto too!)

Keep calm and code in Python!

-- AJ

August 14, 2020 05:10 PM UTC


Return how many times each letter shows up in the string by using an asterisk (*)

Hello people! In this article we will solve the Python problem below.

You receive the name of a city as a string, and you need to return a string that shows how many times each letter shows up in the string by using an asterisk (*).

For example:

“Chicago” –> “c:**,h:*,i:*,a:*,g:*,o:*”

As you can see, the letter c is shown only once, but with 2 asterisks.

The return string should include only the letters (not the dashes, spaces, apostrophes, etc). There should be no spaces in the output, and the different letters are separated by a comma (,) as seen in the example above.

Note that the return string must list the letters in order of their first appearance in the original string.

I will solve the above problem using just pure/core Python, without importing any extra modules.

def get_strings(city):

    city = city.lower() # turn all characters into lower case characters

    city_list = [letter for letter in city if letter.isalpha()] # keep only the letters (drop spaces, dashes, apostrophes, etc.)

    already = [] # this is used to hold the characters which have already been counted

    city_string = '' # this is the city string which will be returned

    for letter in city_list:

        if letter not in already:

            already.append(letter) # remember this letter so it isn't counted again

            count = city_list.count(letter)
            city_string += letter + ":"

            for i in range(0, count):
                city_string += "*"

            city_string += ","

    return city_string[:-1] # don't forget to get rid of the last comma
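If you'd rather avoid the nested loops, a plain dict (insertion-ordered since Python 3.7) can do the counting in one pass. A sketch of an alternative, again using isalpha() to keep letters only:

```python
def get_strings(city):
    counts = {}  # letter -> occurrences, in order of first appearance
    for ch in city.lower():
        if ch.isalpha():  # skip spaces, dashes, apostrophes, etc.
            counts[ch] = counts.get(ch, 0) + 1
    return ",".join(ch + ":" + "*" * n for ch, n in counts.items())

print(get_strings("Chicago"))  # c:**,h:*,i:*,a:*,g:*,o:*
```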

Do you like the above solution or do you have a better one? Let me know in the comment box below this post!

August 14, 2020 12:28 PM UTC

Real Python

The Real Python Podcast – Episode #22: Create Cross-Platform Python GUI Apps With BeeWare

Do you want to distribute your Python applications to other users who don't have or even use Python? Maybe you're interested in seeing your Python application run on iOS or Android mobile devices. This week on the show we have Russell Keith-Magee, the founder and maintainer of the BeeWare project. Russell talks about Briefcase, a tool that converts a Python application into native installers on macOS, Windows, Linux, and mobile devices.

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

August 14, 2020 12:00 PM UTC


Python vs R: Which is Good for Machine Learning?

Machine learning has become the talk of the town. Let us see whether Python or R is good for machine learning.

August 14, 2020 11:29 AM UTC

Django Weblog

DjangoCon Australia 2020: Schedule live and tickets on sale 🎟️

The 8th DjangoCon AU was scheduled to be run in Adelaide, South Australia this year. It's been moved to an online event and will take place on September 4th.

DjangoCon AU is organized as a specialist track as part of PyConline AU. The schedule — though shorter than in previous years — is packed with talks about best practices, communities, contributions, and the present and future of Django.

Since the event was due to run in Adelaide, it will run on Australian Central Standard Time (UTC+9:30), and DjangoCon AU will start at 3:45pm ACST. This link shows when the DjangoCon AU opening address starts across timezones.

Tickets are now available. The prices are AU$29 for enthusiasts and AU$79 for professionals. All tickets grant access to all of PyConline AU 2020. We also have a discount for student attendees at AU$10. And for those who want to help financially support the conference, contributor tickets start at AU$300. More details are available on the PyConline AU website.

We hope to see you online in 3 weeks!

Katie McLaughlin, Markus Holtermann, DjangoCon AU organizers

August 14, 2020 07:00 AM UTC

Justin Mayer

Python Development Environment on macOS Mojave & High Sierra

While installing Python and Virtualenv on macOS Mojave & High Sierra can be done several ways, this tutorial will guide you through the process of configuring a stock Mac system into a solid Python development environment.

First steps

This guide assumes that you have already installed Homebrew. For details, please follow the steps in the macOS Configuration Guide.


We are going to install the latest version of Python via asdf and its Python plugin. Why bother, you ask, when Apple includes Python along with macOS? Here are some reasons:

Use the following command to install asdf via Homebrew:

brew install asdf

Next we ensure asdf is loaded for both current and future shell sessions. If you are using Fish shell:

source (brew --prefix)/opt/asdf/asdf.fish

echo source (brew --prefix)/opt/asdf/asdf.fish >> ~/.config/fish/config.fish

For Bash (the default shell on macOS up to and including Mojave):

. $(brew --prefix asdf)/asdf.sh
echo -e "\n. $(brew --prefix asdf)/asdf.sh" >> ~/.bash_profile

For Zsh (the default shell on Catalina and Big Sur):

. $(brew --prefix asdf)/asdf.sh
echo -e "\n. $(brew --prefix asdf)/asdf.sh" >> ~/.zshrc

Install the asdf Python plugin and the latest version of Python:

asdf plugin add python
asdf install python latest

Note the Python version number that was just installed. For the purpose of this guide, we will assume version 3.8.5, so replace that number below with the version number you actually just installed.

Set the default global Python version:

asdf global python 3.8.5

Confirm the Python version matches the latest version we just installed:

python --version


Let’s say you want to install a Python package, such as the Virtualenv environment isolation tool. While nearly every Python-related article for macOS tells the reader to install it via sudo pip install virtualenv, the downsides of this method include:

  1. installs with root permissions
  2. installs into the system /Library
  3. yields a less reliable environment when using Python built with asdf

As you might have guessed by now, we’re going to use the asdf Python plugin to install the Python packages that we want to be globally available. When installing via python -m pip […], packages will be installed to: ~/.asdf/installs/python/{version}/lib/python{version}/site-packages/

First, let’s ensure we are using the latest version of Pip and Setuptools:

python -m pip install -U pip setuptools

In the next section, we’ll use Pip to install our first globally-available Python package.


Python packages installed via Pip are global in the sense that they are available across all of your projects. That can be convenient at times, but it can also create problems. For example, sometimes one project needs the latest version of Django, while another project needs an older Django version to retain compatibility with a critical third-party extension. This is one of many use cases that Virtualenv was designed to solve. On my systems, only a handful of general-purpose Python packages (including Virtualenv) are globally available — every other package is confined to virtual environments.

With that explanation behind us, let’s install Virtualenv:

python -m pip install virtualenv
asdf reshim python

Create some directories to store our projects, virtual environments, and Pip configuration file, respectively:

mkdir -p ~/Projects ~/Virtualenvs ~/Library/Application\ Support/pip

We’ll then open Pip’s configuration file (which may be created if it doesn’t exist yet)…

vim ~/Library/Application\ Support/pip/pip.conf

… and add some lines to it:

[install]
require-virtualenv = true

[uninstall]
require-virtualenv = true

Now we have Virtualenv installed and ready to create new virtual environments, which we will store in ~/Virtualenvs. New virtual environments can be created via:

cd ~/Virtualenvs
virtualenv foobar

If you have both Python 3.7.x and 3.8.x installed and want to create a Python 3.7.8 virtual environment:

virtualenv -p ~/.asdf/installs/python/3.7.8/bin/python foobar-py3.7.8

Restricting Pip to virtual environments

What happens if we think we are working in an active virtual environment, but there actually is no virtual environment active, and we install something via python -m pip install foobar? Well, in that case the foobar package gets installed into our global site-packages, defeating the purpose of our virtual environment isolation.

Thankfully, Pip has an undocumented setting (source) that tells it to bail out if there is no active virtual environment, which is exactly what we want. In fact, we’ve already set that above, via the require-virtualenv = true directive in Pip’s configuration file. For example, let’s see what happens when we try to install a package in the absence of an activated virtual environment:

python -m pip install markdown
Could not find an activated virtualenv (required).

Perfect! But once that option is set, how do we install or upgrade a global package? We can temporarily turn off this restriction by defining a new function in ~/.bash_profile:

gpip() {
    PIP_REQUIRE_VIRTUALENV="0" python -m pip "$@"
}

(As usual, after adding the above you must run source ~/.bash_profile for the change to take effect.)

If in the future we want to upgrade our global packages, the above function enables us to do so via:

gpip install --upgrade pip setuptools virtualenv

You could achieve the same effect via PIP_REQUIRE_VIRTUALENV="0" python -m pip install --upgrade […], but that’s much more cumbersome to type every time.

Creating virtual environments

Let’s create a virtual environment for Pelican, a Python-based static site generator:

cd ~/Virtualenvs
virtualenv pelican

Change to the new environment and activate it via:

cd pelican
source bin/activate

To install Pelican into the virtual environment, we’ll use Pip:

python -m pip install pelican markdown

For more information about virtual environments, read the Virtualenv docs.


These are obviously just the basic steps to getting a Python development environment configured. Feel free to also check out my dotfiles.

If you found this article to be useful, feel free to follow me on Twitter.

August 14, 2020 06:00 AM UTC

PSF GSoC students blogs

Week 6 Blog Post

What I have done this week

1. Modified the PR on splitting multistage Dockerfiles. The function works fine so far.

2. Filed a draft PR on building and analyzing multistage Dockerfiles. This PR is used to test feasibility and needs modifications. So far we can get a report on each stage; more tests will be run on this function.

Plan on next week

1. Run more tests on the draft PR and send feedback to mentors.

2. Try dockerfile lock.


I am not sure if this is the best way to implement analysis of multistage Dockerfiles, but at least this should work.

August 14, 2020 01:39 AM UTC

August 13, 2020

Deploying Django to AWS ECS with Terraform

In this tutorial, we'll look at how to deploy a Django app to AWS ECS with Terraform.

August 13, 2020 10:28 PM UTC

Davy Wybiral

DIY Solar Powered LoRa Repeater (with Arduino)

In today's video I built a solar-powered LoRa signal repeater to extend the range of my LoRa network. This can easily be used as the basis for a LoRa mesh network with a bit of extra code and additional repeaters.

Even if you're not into LoRa networks all of the solar power hardware in this video can be used for any off-the-grid electronics projects or IoT nodes!


August 13, 2020 08:06 PM UTC

Python Engineering at Microsoft

Python in Visual Studio Code – August 2020 Release

We are pleased to announce that the August release of the Python Extension for Visual Studio Code is now available. You can download the Python extension from the Marketplace, or install it directly from the extension gallery in Visual Studio Code. If you already have the Python extension installed, you can also get the latest update by restarting Visual Studio Code. You can learn more about  Python support in Visual Studio Code in the documentation.  

In this release we addressed a total of 38 issues; the highlights are covered below.

If you’re interested, you can check the full list of improvements in our changelog.

Support for multiple Python interactive windows 

We’re excited to announce that you can now start multiple Python interactive windows with the Python extension! This was one of the most requested features on the Python in VS Code GitHub repo.

By default, every time you run the “Python: Create Python Interactive Window” command in the command palette (View > Command Palette…), it will create a new interactive window in VS Code:

Multiple interactive windows being created when running the “Python: Create Python Interactive Window” command.

Code cells from Python scripts will by default still be executed in the same interactive window. However, you can now configure the Python extension to run separate files in separate interactive windows. Just open the settings page (File > Preferences > Settings), search for “interactive window mode”, and change the setting value to “perFile”.

Interactive Window Mode settings options (perFile, single and multiple).

Now when you run cells from different files, they will each run on their own separate window: 

Running cells from each file into separate Python interactive windows.

If you would like to remain with the single interactive window behavior, you can set the value of the interactive window mode to “single”.  

Pylance as an officially supported language server value

This release includes support to officially add Pylance as a supported value in our python.languageServer setting. You can now set Pylance via the settings editor UI in Visual Studio Code. If you haven’t already installed Pylance, you can download it from the marketplace or simply set this value and we will prompt you to install it! If you missed the announcement about our new Pylance language server, you can read more about it here. 

Configuration options for Language Server setting (Jedi, Pylance, Microsoft, None)  

Improved signature help for overloaded functions in Pylance

Pylance has also improved how it displays signature help when you are invoking a function with multiple overrides. You can now navigate between signatures easily while Pylance bolds the appropriate active parameter.  

Smart signature help with pylance.

Other Changes and Enhancements 

We have also added enhancements and fixed issues requested by users that should improve your experience working with Python in Visual Studio Code. Some notable changes include: 

We’re constantly A/B testing new features. If you see something different that was not announced by the team, you may be part of an experiment! To see if you are part of an experiment, you can check the first lines in the Python extension output channel. If you wish to opt out of A/B testing, you can open the user settings.json file (View > Command Palette… and run Preferences: Open Settings (JSON)) and set the “python.experiments.enabled” setting to false.
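For reference, the opt-out described above corresponds to an entry like this in your user settings.json:

```json
{
    "python.experiments.enabled": false
}
```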

Be sure to download the Python extension for Visual Studio Code now to try out the above improvements. If you run into any problems or have suggestions, please file an issue on the Python VS Code GitHub page.


August 13, 2020 05:07 PM UTC


Learn Any Programming Language with This Learning Plan

In the next couple of minutes, I will be showing you how to create the perfect learning plan that will help you learn just about any programming language you need to acquire skills in. I know this may sound too simplistic, but it has been tried and tested over a couple of years.

August 13, 2020 04:16 PM UTC


EuroPython 2020: Live Stream Recordings available

We’re happy to announce the public availability of the live stream recordings from EuroPython 2020. They had already been available to all conference attendees since the sprint days.


EuroPython YouTube Channel

We have collected the videos in a EuroPython 2020 Live Stream playlist.

Unedited Videos

What we are releasing today are unedited videos recorded for the main track rooms and days. The poster track recordings will be added today or tomorrow.

You can use the schedule to navigate the videos. Linking to a specific time in the videos can be done by right-clicking in the video to create a URL which points to the current position:


Feel free to share interesting links on social media.

Edited Videos

Our video editing company is already busy creating the cut videos. Those should be ready in a month or two.


EuroPython 2020 Team 

August 13, 2020 08:34 AM UTC

Python Circle

How to use AJAX with Django

How to use AJAX in Django projects?, Checking username availability without submitting form, Making AJAX calls from Django code, loading data without refreshig page in django templates, AJAX and Django,

August 13, 2020 04:44 AM UTC

How to create management commands in Django

creating custom management commands in Django application, Background tasks in Django App, Scheduled tasks in Django, How to schedule a task in Django application, How to create and schedule a cron in Django

August 13, 2020 04:44 AM UTC

PSF GSoC students blogs

Week 11 Check in!

Hi everyone

Now we are into the second-last week of the official coding period for GSoC '20! I spent the previous week completing the sample code for navigation.
We selected a new environment for the sample with a better license. The previous sample code was almost completely removed, and fresh code was written demonstrating many more features of the navigation and navmeshgen libraries.
Raycasting has been implemented to find the destination based on the position of a mouse click. Now when the mouse is clicked, the two-dimensional position of the cursor is used to find the corresponding three-dimensional position in Panda3D world coordinates, which is then set as the new destination. The user can change the destination and find a new path even while an actor is already traveling on the previous path; the actor's current position in the middle of the path becomes the new starting point.

The GIF above shows the dynamic path-finding ability of the sample program. Yellow lines are the output of find_straight_path(), while the green ones are the output of find_path(). The panda has been programmed to move along the yellow line.

This was all for the previous week. This week I am focusing on the PR reviews so that we can have the PR merged as soon as possible.

Thank you. Stay Safe!

August 13, 2020 01:37 AM UTC

Weekly Check In - 10

What did I do till now?

I started implementing the CONNECT method for tunneling via HTTP/2. After a lot of testing, I realized the approach I was taking was not really feasible. Next, I plan to work on an approach that initially uses an HTTP/1.1 CONNECT to establish a connection with the proxy and then shifts to HTTP/2 for all the requests made via the proxy.

What's coming up next? 

Next week, I plan to

Did I get stuck anywhere?

Yes, this week I had many problems while adding tunneling support for proxies. I have planned a completely different approach for next week using HTTP/1.1 and HTTP/2. Let's see how it goes :)

August 13, 2020 12:38 AM UTC

Matt Layman

Rendering Calendars - Building SaaS #68

In this episode, I worked on rendering a calendar of important events in a school year. We built out the appropriate data structures, and I wrote some new model methods and added tests. On the last stream, I created a new model to track breaks in the school year. The app now shows the calendar for the school year, and I want to display the breaks on the calendar. Before digging too far into the code, I provided my thoughts about using Docker for development from a question that came from the chat.

August 13, 2020 12:00 AM UTC

August 12, 2020

PyPy Development

A new chapter for PyPy

PyPy winds down its membership in the Software Freedom Conservancy

Conservancy and PyPy's great work together

PyPy joined Conservancy in the second half of 2010, shortly after the release of PyPy 1.2, the first version to contain a fully functional JIT. In 2013, PyPy started supporting ARM, bringing its just-in-time speediness to many more devices and began working toward supporting NumPy to help scientists crunch their numbers faster. Together, PyPy and Conservancy ran successful fundraising drives and facilitated payment and oversight for contractors and code sprints.

Conservancy supported PyPy's impressive growth as it expanded support for different hardware platforms, greatly improved the performance of C extensions, and added support for Python 3 as the language itself evolved.

The road ahead

Conservancy provides a fiscal and organizational home for projects that find the freedoms and guardrails that come along with a charitable home advantageous for their community goals. While this framework was a great fit for the early PyPy community, times change and all good things must come to an end.

PyPy will remain a free and open source project, but the community's structure and organizational underpinnings will be changing and the PyPy community will be exploring options outside of the charitable realm for its next phase of growth ("charitable" in the legal sense -- PyPy will remain a community project).

During the last year PyPy and Conservancy have worked together to properly utilise the generous donations made by stalwart PyPy enthusiasts over the years and to wrap up PyPy's remaining charitable obligations. PyPy is grateful for the Conservancy's help in shepherding the project toward its next chapter.

Thank yous

From Conservancy:

"We are happy that Conservancy was able to help PyPy bring important software for the public good during a critical time in its history. We wish the community well and look forward to seeing it develop and succeed in new ways."
— Karen Sandler, Conservancy's Executive Director

From PyPy:

"PyPy would like to thank Conservancy for their decade-long support in building the community and wishes Conservancy continued success in their journey promoting, improving, developing and defending free and open source software."

— Simon Cross & Carl Friedrich Bolz-Tereick, on behalf of PyPy.


PyPy is a multi-layer Python interpreter with a built-in JIT compiler that runs Python quickly across different computing environments. Software Freedom Conservancy (Conservancy) is a charity that provides a home to over forty free and open source software projects.

August 12, 2020 08:00 PM UTC

PSF GSoC students blogs

Week 9

Fixed #392 and submitted PR. Moving on to #363 and #392

August 12, 2020 05:46 PM UTC

Doug Hellmann

sphinxcontrib-spelling 5.2.1

sphinxcontrib-spelling is a spelling checker for Sphinx-based documentation. It uses PyEnchant to produce a report showing misspelled words. Bug Fixes: updated to only create .spelling output files for inputs that generate spelling warnings (fixes #63). Details: update documentation with example output log files as they are created; add separate spelling target in tox; only create …

August 12, 2020 03:49 PM UTC

Real Python

Python Community Interview With Bruno Oliveira

Today I’m joined by Bruno Oliveira, who is perhaps most well known for being a pytest core developer. In this interview, we cover migrating a large codebase from C++ to Python, how to get started with pytest, and his love of Dark Souls.

Ricky: Welcome to Real Python, Bruno. I’m glad you could join us. Let’s start in the same manner we do with all our guests: How’d you get into programming, and when did you start using Python?

Bruno: Hi, Ricky. Thanks for having me.

I started programming twenty-three years ago or so. I was just getting into computers when a friend of mine showed me this book about Visual Basic, and I was absolutely amazed that you could write your own calculator.

Sometime later, my dad bought me the Delphi 3 Bible. I devoured that, and after a while I started to program in DirectX and make very simple games.

After that, I went into college, and in the second semester I managed to get an internship to work on a Delphi application for image processing. After the internship, I joined ESSS, which is the company I work at to this day.

My role at ESSS as a technical leader is to manage the technical aspects of the projects I work on, including design and day-to-day code reviews. I’m involved in four projects currently.

Along with the other technical leaders, I also develop and oversee our high-level efforts, such as migrating all the code from one Python version to the next, solving problems in our CI, implementing development workflows, and many others.

At ESSS, we develop engineering applications specifically for the oil and gas industry. When I joined, everything was coded in C++, and the team developed everything themselves, including a multiplatform GUI library. A year or so after I was hired, we started looking into Python (version 2.4 at the time) and how we could use it together with PyQt for quickly developing our applications.

Read the full article at »

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

August 12, 2020 02:00 PM UTC

Stack Abuse

Deep Learning in Keras - Building a Deep Learning Model


Deep learning is one of the most interesting and promising areas of artificial intelligence (AI) and machine learning currently. With great advances in technology and algorithms in recent years, deep learning has opened the door to a new era of AI applications.

In many of these applications, deep learning algorithms have performed on par with human experts, and sometimes surpassed them.

Python has become the go-to language for machine learning, and many of the most popular and powerful deep learning libraries and frameworks, like TensorFlow, Keras, and PyTorch, provide Python as their primary interface.

In this series, we'll be using Keras to perform Exploratory Data Analysis (EDA), Data Preprocessing and finally, build a Deep Learning Model and evaluate it.

In this stage, we will build a deep neural-network model that we will train and then use to predict house prices.

Defining the Model

A deep learning neural network is just a neural network with many hidden layers.

Defining the model can be broken down into a few characteristics:

Deep Learning Layers

There are many types of layers for deep learning models. Convolutional and pooling layers are used in CNNs that classify images or do object detection, while recurrent layers are used in RNNs that are common in natural language processing and speech recognition.

We'll be using Dense and Dropout layers. Dense layers are the most common and popular type of layer - a dense layer is just a regular neural network layer where each of its neurons is connected to the neurons of the previous and next layers.

Each dense layer has an activation function that determines the output of its neurons based on the inputs and the weights of the synapses.

Dropout layers are just regularization layers that randomly drop some of the input units to 0. This helps in reducing the chance of overfitting the neural network.
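To get an intuition for what dropout does, here's a rough NumPy sketch of the idea. This illustrates inverted dropout at training time, not Keras' actual implementation; the rate and seed mirror the Dropout layer we'll use in the model below:

```python
import numpy as np

# Inverted dropout, training-time behavior (rate=0.3, seed=2)
rng = np.random.default_rng(2)           # fixed seed for reproducibility
x = np.ones(10)                          # pretend activations from the previous layer
rate = 0.3
mask = rng.random(10) >= rate            # each unit survives with probability 0.7
out = np.where(mask, x / (1 - rate), 0)  # survivors are scaled so the expected sum is unchanged
print(out)
```

At inference time, dropout is switched off and the inputs pass through unchanged.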

Activation Functions

There are also many types of activation functions that can be applied to layers. Each of them links the neuron's input and weights in a different way and makes the network behave differently.

Really common functions are ReLU (Rectified Linear Unit), the Sigmoid function and the Linear function. We'll be mixing a couple of different functions.
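For intuition, here's how a few of these functions transform the same inputs, sketched in plain NumPy rather than Keras. Swish, which also appears in our model below, is simply x times sigmoid(x):

```python
import numpy as np

x = np.array([-2.0, 0.0, 2.0])

relu    = np.maximum(0.0, x)        # negatives clipped to 0
sigmoid = 1.0 / (1.0 + np.exp(-x))  # squashed into (0, 1)
linear  = x                         # passed through unchanged
swish   = x * sigmoid               # smooth, ReLU-like curve

print(relu)  # [0. 0. 2.]
```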

Input and Output layers

In addition to hidden layers, models have an input layer and an output layer:

deep learning neural network architecture

The number of neurons in the input layer is the same as the number of features in our data. We want to teach the network to react to these features. We have 67 features in the train_df and test_df dataframes - thus, our input layer will have 67 neurons. These will be the entry point of our data.

For the output layer - the number of neurons depends on your goal. Since we're just predicting the price - a single value, we'll use only one neuron. A classification model would have as many output neurons as there are classes.

Since the output of the model will be a continuous number, we'll be using the linear activation function so none of the values get clipped.

Defining the Model Code

With these parameters in mind, let's define the model using Keras:

model = keras.Sequential([
    layers.Dense(64, activation='relu', input_shape=[train_df.shape[1]]),
    layers.Dropout(0.3, seed=2),
    layers.Dense(64, activation='swish'),
    layers.Dense(64, activation='relu'),
    layers.Dense(64, activation='swish'),
    layers.Dense(64, activation='relu'),
    layers.Dense(64, activation='swish'),
    layers.Dense(1)
])
Here, we've used Keras' Sequential() to instantiate a model. It takes a group of sequential layers and stacks them together into a single model. Into the Sequential() constructor, we pass a list that contains the layers we want to use in our model.

We've made several Dense layers and a single Dropout layer in this model. We've set input_shape to the number of features in our data; defining it on the first layer tells Keras what input shape the whole model expects.

There are 64 neurons in each hidden layer. This is typically settled by testing - more neurons per layer help extract more features, but too many can also work against you. After some testing, 64 neurons per layer in this example produced a fairly accurate result. It's highly encouraged to play around with the numbers!

The Dropout layer randomly drops 30% of its input units to help avoid overfitting. The seed is set to 2 so the results are reproducible - if the units were dropped completely at random, each trained model would turn out different.

Finally, we have a Dense layer with a single neuron as the output layer. By default, it has the linear activation function so we haven't set anything.

Compiling the Model

After defining our model, the next step is to compile it. Compiling a Keras model means configuring it for training.

To compile the model, we need to choose:

- An optimizer - the algorithm that updates the weights during training.
- A loss function - the quantity the optimizer tries to minimize.
- The metrics - any additional values we want to track to evaluate the model.

With those in mind, let's compile the model:

optimizer = tf.keras.optimizers.RMSprop(learning_rate=0.001)

model.compile(loss='mse',
              optimizer=optimizer,
              metrics=['mae'])
Here, we've created an RMSprop optimizer, with a learning rate of 0.001. Feel free to experiment with other optimizers such as the Adam optimizer.

Note: You can either declare an optimizer and use that object or pass a string representation of it in the compile() method.

We've set the loss function to be Mean Squared Error. Again, feel free to experiment with other loss functions and evaluate the results. Since we have MSE as the loss function, we've opted for Mean Absolute Error as the metric to evaluate the model with.
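To make the difference concrete, here's both measures computed by hand on a few made-up prices - MSE squares each error, so large misses dominate it, while MAE stays in the target's units (dollars):

```python
# Illustrative prices, not from our dataset - each prediction is off by $10,000
y_true = [200_000.0, 150_000.0, 320_000.0]
y_pred = [210_000.0, 140_000.0, 330_000.0]

n = len(y_true)
mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n  # what the optimizer minimizes
mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n    # what we report, in dollars

print(mse)  # 100000000.0
print(mae)  # 10000.0
```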

Training the Model

After compiling the model, we can train it using our train_df dataset. This is done by fitting it via the fit() function:

history = model.fit(
    train_df, train_labels,
    epochs=70, validation_split=0.2
)
Here, we've passed the training data (train_df) and the train labels (train_labels).

Also, learning is an iterative process. We've told the network to go through this training dataset 70 times to learn as much as it can from it. The model's results in the last epoch will be better than in the first epoch.

Finally, validation_split=0.2 tells Keras to set aside 20% of the training data for validation. Don't confuse this with the test_df dataset we'll be using to evaluate the model later.

That 20% is not used for training, but only for validation - to make sure the model is actually making progress.
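If I recall Keras' behavior correctly, validation_split simply carves off the last fraction of the training rows (before any shuffling). A quick sketch of the bookkeeping, with a made-up row count:

```python
# Sketch of validation_split=0.2 with a hypothetical count of 2930 training rows
n_rows = 2930
split = int(n_rows * (1 - 0.2))

train_rows = range(0, split)       # used to update the weights
val_rows   = range(split, n_rows)  # used only to compute val_loss / val_mae

print(len(train_rows), len(val_rows))  # 2344 586
```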

This function will print the results of each epoch - the value of the loss function and the metric we've chosen to keep track of.

Once finished, we can take a look at how the model did through each epoch:

Epoch 65/70
59/59 [==============================] - 0s 2ms/step - loss: 983458944.0000 - mae: 19101.9668 - val_loss: 672429632.0000 - val_mae: 18233.3066
Epoch 66/70
59/59 [==============================] - 0s 2ms/step - loss: 925556032.0000 - mae: 18587.1133 - val_loss: 589675840.0000 - val_mae: 16720.8945
Epoch 67/70
59/59 [==============================] - 0s 2ms/step - loss: 1052588800.0000 - mae: 18792.9805 - val_loss: 608930944.0000 - val_mae: 16897.8262
Epoch 68/70
59/59 [==============================] - 0s 2ms/step - loss: 849525312.0000 - mae: 18392.6055 - val_loss: 613655296.0000 - val_mae: 16914.1777
Epoch 69/70
59/59 [==============================] - 0s 2ms/step - loss: 826159680.0000 - mae: 18177.8945 - val_loss: 588994816.0000 - val_mae: 16520.2832
Epoch 70/70
59/59 [==============================] - 0s 2ms/step - loss: 920209344.0000 - mae: 18098.7070 - val_loss: 571053952.0000 - val_mae: 16419.8359

After training, the model (stored in the model variable) will have learned what it can and is ready to make predictions. fit() also returns a History object whose history attribute is a dictionary containing the loss function and mae values after each epoch, so we can make use of that too. We've put it in the history variable.
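history.history is a plain dictionary of per-epoch lists, keyed by the loss and metric names. With made-up numbers, picking out the best validation epoch looks like this:

```python
# Made-up values shaped like history.history after a 3-epoch run
history_dict = {
    'loss':     [2.1e9, 1.5e9, 9.2e8],
    'mae':      [25000.0, 21000.0, 18500.0],
    'val_loss': [1.8e9, 1.1e9, 5.7e8],
    'val_mae':  [22000.0, 19000.0, 16400.0],
}

# Index of the epoch with the lowest validation MAE (zero-based)
best_epoch = min(range(len(history_dict['val_mae'])),
                 key=history_dict['val_mae'].__getitem__)
print(best_epoch)  # 2
```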

Before making predictions, let's visualize how the loss value and mae changed over time:

model_history = pd.DataFrame(history.history)
model_history['epoch'] = history.epoch

fig, ax = plt.subplots(figsize=(14,8))
num_epochs = model_history.shape[0]
ax.plot(np.arange(0, num_epochs), model_history["mae"],
        label="Training MAE", lw=3, color='#f4b400')
ax.plot(np.arange(0, num_epochs), model_history["val_mae"],
        label="Validation MAE", lw=3, color='#0f9d58')
ax.legend()
plt.show()

mae and loss function over time and training

We can clearly see both the mae and loss values go down over time. This is exactly what we want - the model got more accurate with the predictions over time.

Making Predictions with the Model

Now that our model is trained, let's use it to make some predictions. We take an item from the test data (in test_df):

test_unit = test_df.iloc[[0]]

This item stored in test_unit has the following values, cropped at only 7 entries for brevity:

    Lot Frontage   Lot Area   Overall Qual   Overall Cond   Year Built   Total Bsmt SF   1st Flr SF
14     0.0157117  -0.446066        1.36581       -0.50805     0.465714         1.01855      0.91085

These are the feature values of the unit, and we'll use the model to predict its sale price:

test_pred = model.predict(test_unit).squeeze()

We used the predict() function of our model, and passed the test_unit into it to make a prediction of the target variable - the sale price.

Note: predict() returns a NumPy array so we used squeeze(), which is a NumPy function to "squeeze" this array and get the prediction value out of it as a number, not an array.
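Here's that step in isolation, using the predicted value from above as an example:

```python
import numpy as np

pred = np.array([[225694.92]])  # shape (1, 1), like predict() on a single row
value = pred.squeeze()          # drops the size-1 axes, leaving a 0-d scalar

print(pred.shape)    # (1, 1)
print(float(value))  # 225694.92
```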

Now, let's get the actual price of the unit from test_labels:

test_lbl = test_labels.iloc[0]

And now, let's compare the predicted price and the actual price:

print("Model prediction = {:.2f}".format(test_pred))
print("Actual value = {:.2f}".format(test_lbl))
Model prediction = 225694.92
Actual value = 212000.00

So the actual sale price for this unit is $212,000 and our model predicted it to be $225,694.92. That's fairly close, though the model overshot the price by about 6.5%.

Let's try another unit from test_df:

test_unit = test_df.iloc[[100]]

And we'll repeat the same process to compare the prices:

test_pred = model.predict(test_unit).squeeze()
test_lbl = test_labels.iloc[100]
print("Model prediction = {:.2f}".format(test_pred))
print("Actual value = {:.2f}".format(test_lbl))
Model prediction = 330350.47
Actual value = 340000.00

So for this unit, the actual price is $340,000 and the predicted price is $330,350.47. Again, not quite on point, but it's an error of just ~3%. That's very accurate.

Evaluating the Model

This is the final stage in our journey of building a Keras deep learning model. In this stage we will use the model to generate predictions on all the units in our testing data (test_df) and then calculate the mean absolute error of these predictions by comparing them to the actual true values (test_labels).

Keras provides the evaluate() function which we can use with our model to evaluate it. evaluate() calculates the loss value and the values of all metrics we chose when we compiled the model.

We chose MAE as our metric because it can be easily interpreted: the MAE value is the average absolute difference between the model's predictions and the actual values:
\text{MAE}(y, \hat{y}) = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|.

For our convenience, the evaluate() function takes care of this for us:

loss, mae = model.evaluate(test_df, test_labels, verbose=0)

To this method, we pass the test data for our model (to be evaluated upon) and the actual data (to be compared to). Furthermore, we've used the verbose argument to avoid printing any additional data that's not really needed.

Let's run the code and see how it does:

print('MAE = {:.2f}'.format(mae))
MAE = 17239.13

The mean absolute error is 17239.13. That is to say, averaged over all units, the model predicted about $17,239 above or below the actual price.

Interpretation of Model Performance

How good is that result? If we look back at the EDA we performed on SalePrice, we can see that the average sale price for the units in our original data is $180,796. Considering that, an MAE of 17,239 is fairly good.
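To put the number in perspective, we can express it as a fraction of that average sale price:

```python
mae = 17239.13      # mean absolute error from evaluate()
avg_price = 180796  # average SalePrice from the EDA stage

relative_error = 100 * mae / avg_price
print(round(relative_error, 1))  # 9.5
```

So the model is off by roughly 9.5% of the average price, on average.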

To interpret these results in another way, let's plot the predictions against the actual prices:

test_predictions_ = model.predict(test_df).flatten()
test_labels_ = test_labels.to_numpy().flatten()
fig, ax = plt.subplots(figsize=(14,8))
plt.scatter(test_labels_, test_predictions_, alpha=0.6, 
            color='#ff7043', lw=1, ec='black')
lims = [0, max(test_predictions_.max(), test_labels_.max())]
plt.plot(lims, lims, lw=1, color='#00acc1')

actual vs predicted values

If our model were 100% accurate with an MAE of 0, all points would appear exactly on the diagonal cyan line. However, no model is 100% accurate, and we can see that most points are close to the diagonal line, which means the predictions are close to the actual values.

There are a few outliers, some of which are off by a lot. These outliers drive the model's average MAE up drastically. In reality, for most of these points, the absolute error is much lower than 17,239.

We can inspect these points and find out if we can perform some more data preprocessing and feature engineering to make the model predict them more accurately.


Conclusion

In this tutorial, we built a deep learning model using Keras, compiled it, fitted it with the clean data we prepared and finally performed predictions based on what it learned.

While not 100% accurate, we managed to get some very decent results with a small number of outliers.

August 12, 2020 01:41 PM UTC