
Planet Python

Last update: September 28, 2016 01:49 AM

September 27, 2016


Python Engineering at Microsoft

Microsoft’s participation in the 2016 Python core sprint

From September 5th to the 9th a group of Python core developers gathered for a sprint hosted at Instagram and sponsored by Instagram, Microsoft, and the Python Software Foundation. The goal was to spend a week working towards the Python 3.6.0b1 release, just in time for the Python 3.6 feature freeze on Monday, September 12, 2016. The inspiration for this sprint was the Need for Speed sprint held in Iceland a decade ago, where many performance improvements were made to Python 2.5. How time flies!

That’s the opening paragraph from the Python Insider blog post discussing the 2016 Python core sprint that recently took place. In the case of Microsoft’s participation in the sprint, both Steve Dower and I (Brett Cannon) were invited to participate (which meant Microsoft had one of the largest company representations at the sprint). Between the two of us we spent the week completing work on four of our own PEPs for Python 3.6:

  1. Adding a file system path protocol (PEP 519)
  2. Adding a frame evaluation API to CPython (PEP 523)
  3. Change Windows console encoding to UTF-8 (PEP 528)
  4. Change Windows filesystem encoding to UTF-8 (PEP 529)

I also helped review the patches implementing PEP 515 and PEP 526 (“Underscores in Numeric Literals” and “Syntax for Variable Annotations”, respectively). Both Steve and I also participated in many technical discussions on various topics and we cleared out our backlog of bug reports.
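For a flavor of what these PEPs enable, here is a minimal sketch of the new 3.6 syntax and the path protocol (the Workspace class is a made-up example, not code from the sprint):

import os

# PEP 515: underscores in numeric literals for readability
budget = 10_000_000

# PEP 526: syntax for variable annotations
names: list = []

# PEP 519: any object with a __fspath__ method can be used where a path is expected
class Workspace:
    def __init__(self, root):
        self.root = root

    def __fspath__(self):
        return self.root

print(os.fspath(Workspace('/tmp/project')))  # prints /tmp/project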

If you’re curious as to what else has made it into Python 3.6 (so far), the rough draft of the “What’s New” document for Python 3.6 is a good place to start (feature freeze has been reached, so no new features will be added to Python 3.6 but bugs are constantly being fixed). We also strongly encourage everyone to download Python 3.6.0b1 and try it with their code. If you find bugs, please file a bug report. There are also various features which will stay or be removed based on community feedback, so please do give this beta release a try!

Overall the week was very productive not only for the two of us but for everyone at the sprints and Python as a project. We hope that the success of this sprint will help lead to it becoming an annual event so that Python can benefit from such a huge burst of productivity every year. And if you or your company want to help with these sorts of sprints in the future or Python’s development in general, then please consider helping with Python’s development and/or sponsoring the Python Software Foundation.

 

September 27, 2016 07:11 PM


Weekly Python Chat

Decorators: The Function's Function

Decorators are one of those features in Python that people like to talk about.

Why? Because they're different. Because they're a little weird. Because they're a little mind-bending.

Let's talk about decorators: how do you make them and when should you use them?
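If you want something to chew on before the chat, here is a minimal sketch of the pattern (the names are just for illustration): a decorator is a function that takes a function and returns a replacement for it.

import functools

def log_calls(func):
    # functools.wraps preserves the wrapped function's name and docstring
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        print('calling', func.__name__)
        return func(*args, **kwargs)
    return wrapper

@log_calls
def greet(name):
    return 'Hello, {}!'.format(name)

print(greet('world'))  # prints "calling greet", then "Hello, world!"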

September 27, 2016 06:30 PM


Mike Driscoll

wxPython Cookbook Available for Pre-Order

I am excited to announce that the wxPython Cookbook is now available for Pre-Order. You can get your digital copy on Gumroad or Leanpub now. You can get a sample of the book on Leanpub if you’d like to “try before you buy”.

There will be over 50 recipes in this book. The examples in my book will work with both wxPython 3.0.2 Classic and wxPython Phoenix, which is the bleeding edge of wxPython that supports Python 3. If I discover any recipes that do not work with Phoenix, they will be clearly marked, or an alternative example will be given that does work.


Here is a partial listing of the current set of recipes in no particular order:

  • Adding / Removing Widgets Dynamically
  • How to put a background image on a panel
  • Binding Multiple Widgets to the Same Handler
  • Catching Exceptions from Anywhere
  • wxPython’s Context Managers
  • Converting wx.DateTime to Python datetime
  • Creating an About Box
  • How to Create a Login Dialog
  • How to Create a “Dark Mode”
  • Generating a Dialog from a Config File
  • How to Disable a Wizard’s Next Button
  • How to Use Drag and Drop
  • How to Drag and Drop a File From Your App to the OS
  • How to Edit Your GUI Interactively Using reload()
  • How to Embed an Image in the Title Bar
  • Extracting XML from the RichTextCtrl
  • How to Fade-in a Frame / Dialog
  • How to Fire Multiple Event Handlers
  • Making your Frame Maximize or Full Screen
  • Using wx.Frame Styles
  • Get the Event Name Instead of an Integer
  • How to Get Children Widgets from a Sizer
  • How to Use the Clipboard
  • Catching Key and Char Events
  • Learning How Focus Works in wxPython
  • Making Your Text Flash
  • Minimizing to System Tray
  • Using ObjectListView instead of ListCtrl

You can read more about the project in my Kickstarter announcement article. Please note that the Kickstarter campaign is over.


September 27, 2016 05:15 PM


Continuum Analytics News

Continuum Analytics Joins Forces with IBM to Bring Open Data Science to the Enterprise

Tuesday, September 27, 2016

Optimized Python experience empowers data scientists to develop advanced open source analytics on Spark   
 
AUSTIN, TEXAS—September 27, 2016—Continuum Analytics, the creator and driving force behind Anaconda, the leading Open Data Science platform powered by Python, today announced an alliance with IBM to advance open source analytics for the enterprise. Data scientists and data engineers in open source communities can now embrace Python and R to develop analytic and machine learning models in the Spark environment through its integration with IBM's Project DataWorks. 

Combining the power of IBM's Project DataWorks with Anaconda enables organizations to build high-performance Python and R data science models and visualization applications required to compete in today’s data-driven economy. The companies will collaborate on several open source initiatives including enhancements to Apache Spark that fully leverage Jupyter Notebooks with Apache Spark – benefiting the entire data science community.

“Our strategic relationship with Continuum Analytics empowers Project DataWorks users with full access to the Anaconda platform to streamline and help accelerate the development of advanced machine learning models and next-generation analytics apps,” said Ritika Gunnar, vice president, IBM Analytics. “This allows data science professionals to utilize the tools they are most comfortable with in an environment that reinforces collaboration with colleagues of different skillsets.”

By collaborating to bring about the best Spark experience for Open Data Science in IBM's Project DataWorks, enterprises are able to easily connect their data, analytics and compute with innovative machine learning to accelerate and deploy their data science solutions. 

“We welcome IBM to the growing family of industry titans that recognize Anaconda as the de facto Open Data Science platform for enterprises,” said Michele Chambers, EVP of Anaconda Business & CMO at Continuum Analytics. “As the next generation moves from machine learning to artificial intelligence, cloud-based solutions are key to helping companies adopt and develop agile solutions––IBM recognizes that. We’re thrilled to be one of the driving forces powering the future of machine learning and artificial intelligence in the Spark environment.”

IBM's Project DataWorks is the industry’s first cloud-based data and analytics platform that integrates all types of data to enable AI-powered decision making. With this, companies are able to realize the full promise of data by enabling data professionals to collaborate and build cognitive solutions by combining IBM data and analytics services and a growing ecosystem of data and analytics partners, all delivered on Apache Spark. Project DataWorks is designed to allow for faster development and deployment of data and analytics solutions with self-service user experiences to help accelerate business value. 

To learn more, join Bob Picciano, SVP of IBM Analytics, and Travis Oliphant, CEO of Continuum Analytics, at the IBM DataFirst Launch Event on Sept 27, 2016, at the Hudson Mercantile Building in NYC. The event is also available on livestream.

About Continuum Analytics
Continuum Analytics is the creator and driving force behind Anaconda, the leading Open Data Science platform powered by Python. We put superpowers into the hands of people who are changing the world. 

With more than 3M downloads and growing, Anaconda is trusted by the world’s leading businesses across industries––financial services, government, health & life sciences, technology, retail & CPG, oil & gas––to solve the world’s most challenging problems. Anaconda does this by helping everyone in the data science team discover, analyze and collaborate by connecting their curiosity and experience with data. With Anaconda, teams manage their Open Data Science environments without any hassles to harness the power of the latest open source analytic and technology innovations. 

Our community loves Anaconda because it empowers the entire data science team––data scientists, developers, DevOps, architects and business analysts––to connect the dots in their data and accelerate the time-to-value that is required in today’s world. To ensure our customers are successful, we offer comprehensive support, training and professional services. 

Continuum Analytics' founders and developers have created and contributed to some of the most popular Open Data Science technologies, including NumPy, SciPy, Matplotlib, Pandas, Jupyter/IPython, Bokeh, Numba and many others. Continuum Analytics is venture-backed by General Catalyst and BuildGroup. 

To learn more, visit http://www.continuum.io.

###
 
Media Contact:
Jill Rosenthal
InkHouse
continuumanalytics@inkhouse.com 

September 27, 2016 12:18 PM


Python Insider

Python Core Development Sprint 2016: 3.6 and beyond!

From September 5th to the 9th a group of Python core developers gathered for a sprint hosted at Instagram and sponsored by Instagram, Microsoft, and the Python Software Foundation. The goal was to spend a week working towards the Python 3.6b1 release, just in time for the Python 3.6 feature freeze on Monday, September 12, 2016. The inspiration for this sprint was the Need for Speed sprint held in Iceland a decade ago, where many performance improvements were made to Python 2.5. How time flies!


By any measurement, the sprint was extremely successful. All participants left feeling accomplished both in the work they did and in the discussions they held. Being in the same room encouraged many discussions related to various PEPs, and many design decisions were made. There was also the camaraderie of working in the same room together — typically most of us only see each other at the annual PyCon US, where there are other distractions that prevent getting to simply spend time together. (This includes the Python development sprints at PyCon, where the focus is more on helping newcomers contribute to Python — that’s why this sprint was not public.)


From a quantitative perspective, the sprint was the most productive week for Python ever! According to the graphs from the GitHub mirror of CPython, the week of September 4th saw more commits than the preceding 7 weeks combined! And in terms of issues, the number of open issues dropped by 62 with a total of 166 issues closed.


A large portion of the work performed during the sprint week revolved around various PEPs that had either not yet been accepted or were not yet fully implemented. In the end, 12 PEPs were either implemented from scratch or had their work completed during the week, out of Python 3.6’s total of 16 PEPs:


  1. Preserving the order of **kwargs in a function (PEP 468)
  2. Add a private version to dict (PEP 509)
  3. Underscores in Numeric Literals (PEP 515)
  4. Adding a file system path protocol (PEP 519)
  5. Preserving Class Attribute Definition Order (PEP 520)
  6. Adding a frame evaluation API to CPython (PEP 523)
  7. Make os.urandom() blocking on Linux, add os.getrandom() (PEP 524)
  8. Asynchronous Generators (PEP 525)
  9. Syntax for Variable Annotations (PEP 526)
  10. Change Windows console encoding to UTF-8 (PEP 528)
  11. Change Windows filesystem encoding to UTF-8 (PEP 529)
  12. Asynchronous Comprehensions (PEP 530)

Some large projects were also worked on that are not represented as PEPs. For instance, Python 3.6 now contains support for DTrace and SystemTap. This will give people more tools to introspect and monitor Python. See the HOWTO for usage instructions and examples showing some of the new possibilities.


CPython also gained a more memory efficient dictionary implementation at the sprint. The new implementation shrinks memory usage of dictionaries by about 25% and also preserves insertion order, without speed penalties. Based on a proposal by Raymond Hettinger, the patch was written by INADA Naoki prior to the sprint but it was reviewed and heavily discussed at the sprint, as changing the underlying implementation of dictionaries can have profound implications on Python itself. In the end, the patch was accepted, directly allowing for PEP 468 to be accepted and simplifying PEP 520.
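The effect is easy to observe from a quick snippet (keep in mind that in 3.6 the ordering is a CPython implementation detail, not yet a language-level guarantee):

# On CPython 3.6 the new compact dict preserves insertion order
d = {}
d['first'] = 1
d['second'] = 2
d['third'] = 3
print(list(d))  # ['first', 'second', 'third']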


Work was also done on the Gilectomy (see the presentation on the topic from PyCon US for more background info on the project). Progress was made such that Python would run without any reference counting turned on (i.e. Python turned into a huge memory leak). Work then was started on trying the latest design on how to turn reference counting back on in a way that would allow Python to scale with the number of threads and CPU cores used. There’s still a long road ahead before the Gilectomy will be ready to merge though, and we even jokingly considered branding the result as Python 4.


Much of the work done during the sprint led not only to improvements in the language and library, but to better performance as well. A quick performance comparison between Python 3.5.2+ and 3.6b1+ under OS X shows that 3.6 is generally faster, with double-digit speed improvements not uncommon. Similar benchmarking under Windows 10 has been reported to show similar performance gains.


A huge thanks goes out to the participants of the sprint! They are listed below in alphabetical order, along with thanks to the organizations that helped finance their attendance. Many of them traveled to attend and gave up the US Labor Day holiday with their families to participate. In the end, we had participants from 3 countries on 2 continents. (We actually invited more people from other countries and continents, but not everybody invited could attend.)



Special thanks to Łukasz for making the event happen and to Larry for designing the logo.


September 27, 2016 11:03 AM


The Digital Cat

Python Mocks: a gentle introduction - Part 2

In the first post I introduced you to Python mocks, objects that can imitate other objects and work as placeholders, replacing external systems during unit testing. I described the basic behaviour of mock objects, the return_value and side_effect attributes, and the assert_called_with() method.

In this post I will briefly review the remaining assert_* methods and some interesting attributes that allow you to check the calls received by the mock object. Then I will introduce and exemplify patching, which is a very important topic in testing.

Other assertions and attributes

The official documentation of the mock library lists many other assertions, namely assert_called_once_with(), assert_any_call(), assert_has_calls(), and assert_not_called(). If you grasped how assert_called_with() works, you will have no trouble understanding how the others behave. Be sure to check the documentation to get a full description of what mock objects can assert about their history after being used by your code.

Together with those methods, mock objects also provide some useful attributes, two of which have already been reviewed in the first post. The remaining attributes are, as expected, mostly related to calls: called, call_count, call_args, call_args_list, method_calls, and mock_calls. While these are also very well described in the official documentation, I want to point out the method_calls and mock_calls attributes, which store the detailed list of methods called on the mock, and the call_args_list attribute, which lists the parameters of every call.

Do not forget that methods called on a mock object are mocks themselves, so you may first access the main mock object to get information about the called methods, and then access those methods to get the arguments they received.
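As a quick sketch of those attributes in action (the process method name here is arbitrary):

from unittest import mock

m = mock.Mock()
m.process('data.txt', retries=3)
m.process('other.txt', retries=1)

print(m.process.called)          # True
print(m.process.call_count)      # 2
print(m.process.call_args)       # call('other.txt', retries=1), i.e. the last call
print(m.process.call_args_list)  # both calls, in the order they were made
print(m.mock_calls)              # every call made on the mock and its children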

Patching

Mocks are very simple to introduce in your tests whenever your objects accept classes or instances from outside. In that case, as described, you just have to instantiate the Mock class and pass the resulting object to your system. However, when the external classes instantiated by your library are hardcoded, this simple trick does not work: you have no way to pass in a mock object instead of the real one.

This is exactly the case addressed by patching. Patching, in a testing framework, means replacing a globally reachable object with a mock, thus achieving the goal of having the code run unmodified while part of it has been hot swapped, that is, replaced at run time.

A warm-up example

Let us start with a very simple example. Patching can be complex to grasp at the beginning, so it is better to learn it with trivial code. If you do not have it yet, create the testing environment mockplayground with the instructions given in the previous post.

I want to develop a simple class that returns information about a given file. The class shall be instantiated with the filename, which can be a relative path.

For the sake of brevity I will not show you every step of the TDD development of the class. Remember that TDD requires you to write a test and then implement the code, but sometimes this can be too fine-grained, so do not apply the TDD rules without thinking.

The tests for the initialization of the class are

from fileinfo import FileInfo

def test_init():
    filename = 'somefile.ext'
    fi = FileInfo(filename)
    assert fi.filename == filename

def test_init_relative():
    filename = 'somefile.ext'
    relative_path = '../{}'.format(filename)
    fi = FileInfo(relative_path)
    assert fi.filename == filename

You can put them into the tests/test_fileinfo.py file. The code that makes the tests pass could be something like

import os


class FileInfo:
    def __init__(self, path):
        self.original_path = path
        self.filename = os.path.basename(path)

Up to now I haven't introduced any new features. Now I want the get_info() method to return a tuple with the file name, the original path the class was instantiated with, and the absolute path of the file.

You immediately realise that you have an issue in writing the test. There is no way to easily test something like "the absolute path", since the outcome of the function called in the test will vary with the location of the test itself. Let us write part of the test

def test_get_info():
    filename = 'somefile.ext'
    original_path = '../{}'.format(filename)
    fi = FileInfo(original_path)
    assert fi.get_info() == (filename, original_path, '???')

where the '???' string highlights that I cannot put something sensible to test the absolute path of the file.

Patching is the way to solve this problem. You know that the function will use some code to get the absolute path of the file. So, in the scope of the test only, you can replace that code with different code and perform the test. Since the replacement code has a known outcome, writing the test is now possible.

Patching, thus, means to inform Python that in some scope you want a globally accessible module/object replaced by a mock. Let's see how we can use it in our example

from unittest.mock import patch

[...]

def test_get_info():
    filename = 'somefile.ext'
    original_path = '../{}'.format(filename)

    with patch('os.path.abspath') as abspath_mock:
        test_abspath = 'some/abs/path'
        abspath_mock.return_value = test_abspath
        fi = FileInfo(original_path)
        assert fi.get_info() == (filename, original_path, test_abspath)

Remember that if you are using Python 2 you installed the mock module with pip, so your import statement becomes from mock import patch.

You clearly see the context in which the patching happens, as it is enclosed in a with statement. Inside this statement the function os.path.abspath will be replaced by a mock created by patch and named abspath_mock. We can then give the mock a return_value, as we did with standard mocks in the first post, and run the test.

The code that makes the test pass is

class FileInfo:
    [...]

    def get_info(self):
        return self.filename, self.original_path, os.path.abspath(self.filename)

Obviously to write the test you have to know that you are going to use the os.path.abspath function, so patching is somehow a "less pure" practice in TDD. In pure OOP/TDD you are only concerned with the external behaviour of the object, and not with its internal structure. This example, however, shows that you have to cope with some real world issues, and patching is a clean way to do it.

The patching decorator

The patch function we imported from the unittest.mock module is very powerful, and can be used as a function decorator as well. When used in this fashion you need to change the decorated function to accept a mock as an additional argument.

@patch('os.path.abspath')
def test_get_info(abspath_mock):
    filename = 'somefile.ext'
    original_path = '../{}'.format(filename)

    test_abspath = 'some/abs/path'
    abspath_mock.return_value = test_abspath
    fi = FileInfo(original_path)
    assert fi.get_info() == (filename, original_path, test_abspath)

As you can see, the patch decorator works like a big with statement around the whole function. Obviously in this way you replace the target function os.path.abspath in the scope of the whole function. It is then up to you to decide whether you need to use patch as a decorator or in a with block.

Multiple patches

We can also patch more than one object. Say, for example, that we want to change the above test to check that the outcome of the FileInfo.get_info() method also contains the size of the file. To get the size of a file in Python we may use the os.path.getsize() function, which returns the size of the file in bytes.

So now we have to patch os.path.getsize as well, and this can be done with another patch decorator.

@patch('os.path.getsize')
@patch('os.path.abspath')
def test_get_info(abspath_mock, getsize_mock):
    filename = 'somefile.ext'
    original_path = '../{}'.format(filename)

    test_abspath = 'some/abs/path'
    abspath_mock.return_value = test_abspath

    test_size = 1234
    getsize_mock.return_value = test_size

    fi = FileInfo(original_path)
    assert fi.get_info() == (filename, original_path, test_abspath, test_size)

Please notice that the decorator which is nearest to the function is applied first. Always remember that the decorator syntax with @ is a shortcut to replace the function with the output of the decorator, so two decorators result in

@decorator1
@decorator2
def myfunction():
    pass

which is a shortcut for

def myfunction():
    pass
myfunction = decorator1(decorator2(myfunction))

This explains why, in the test code, the function receives first abspath_mock and then getsize_mock. The first decorator applied to the function is the patch of os.path.abspath, which appends the mock that we call abspath_mock. Then the patch of os.path.getsize is applied and this appends its own mock.

The code that makes the test pass is

class FileInfo:
    [...]

    def get_info(self):
        return self.filename, self.original_path, os.path.abspath(self.filename), os.path.getsize(self.filename)

We can write the above test using two with statements as well

def test_get_info():
    filename = 'somefile.ext'
    original_path = '../{}'.format(filename)

    with patch('os.path.abspath') as abspath_mock:
        test_abspath = 'some/abs/path'
        abspath_mock.return_value = test_abspath

        with patch('os.path.getsize') as getsize_mock:
            test_size = 1234
            getsize_mock.return_value = test_size

            fi = FileInfo(original_path)
            assert fi.get_info() == (filename, original_path, test_abspath, test_size)

Using more than one with statement, however, makes the code difficult to read, in my opinion, so in general I prefer to avoid complex with trees unless I need the patching to apply to a limited scope.

Patching immutable objects

The most widespread version of Python is CPython, which is written, as the name suggests, in C. Part of the standard library is also written in C, while the rest is written in Python itself.

The objects (classes, modules, functions, etc.) that are implemented in C are shared between interpreters (which is something you can do by embedding the Python interpreter in a C program, for example). This requires those objects to be immutable, so that you cannot alter them at runtime from a single interpreter.

For an example of this immutability just check the following code

>>> a = 1
>>> a.conjugate = 5
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'int' object attribute 'conjugate' is read-only

Here I'm trying to replace a method with an integer, which is pointless, but nevertheless shows the issue we are facing.

What does this immutability have to do with patching? What patch actually does is temporarily replace an attribute of an object (a method of a class, a class of a module, etc.), so if that object is immutable the patching action fails.

A typical example of this problem is the datetime module, which is also one of the best candidates for patching, since the output of time functions is by definition time-varying.

Let me show the problem with a simple class that logs operations. The class is the following (you can put it into a file called logger.py)

import datetime

class Logger:
    def __init__(self):
        self.messages = []

    def log(self, message):
        self.messages.append((datetime.datetime.now(), message))

This is pretty simple, but testing this code is problematic, because the log() method produces results that depend on the actual execution time.

If we try to write a test patching datetime.datetime.now we have a bitter surprise. This is the test code, that you can put in tests/test_logger.py

from unittest.mock import patch

from logger import Logger

def test_init():
    l = Logger()
    assert l.messages == []

@patch('datetime.datetime.now')
def test_log(mock_now):
    test_now = 123
    test_message = "A test message"
    mock_now.return_value = test_now

    l = Logger()
    l.log(test_message)
    assert l.messages == [(test_now, test_message)]

and the execution of pytest returns a TypeError: can't set attributes of built-in/extension type 'datetime.datetime', which is exactly a problem of immutability.

There are several ways to address this problem, but all of them leverage the fact that, when you import or subclass an immutable object, what you get is a "copy" of it that is now mutable.

The easiest example in this case is the datetime module itself. In the test_log function we try to patch the datetime.datetime.now object directly, affecting the builtin module datetime. The file logger.py, however, does import datetime, so the latter becomes a local symbol in the logger module. This is exactly the key for our patching. Let us change the code to

@patch('logger.datetime.datetime')
def test_log(mock_datetime):
    test_now = 123
    test_message = "A test message"
    mock_datetime.now.return_value = test_now

    l = Logger()
    l.log(test_message)
    assert l.messages == [(test_now, test_message)]

As you can see, running the test now, the patching works. What we did was to patch logger.datetime.datetime instead of datetime.datetime.now. Two things changed, thus, in our test. First, we are patching the module imported in the logger.py file and not the module provided globally by the Python interpreter. Second, we have to patch the whole module, because this is what is imported by the logger.py file. If you try to patch logger.datetime.datetime.now you will find that it is still immutable.

Another possible solution to this problem is to create a function that invokes the immutable object and returns its value. This wrapper function can be easily patched, because it is a plain function of your module and thus is not immutable. This solution, however, requires changing the source code to allow testing, which is far from desirable. Obviously it is better to introduce a small change in the code and have it tested than to leave it untested, but whenever possible I avoid solutions that introduce code which wouldn't be required without tests.
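For completeness, this is roughly what the wrapper approach would look like for the Logger class (a sketch; the get_now name is mine):

import datetime

def get_now():
    # Thin wrapper around the immutable builtin; being a plain module-level
    # function, it can be patched with patch('logger.get_now') without issues
    return datetime.datetime.now()

class Logger:
    def __init__(self):
        self.messages = []

    def log(self, message):
        self.messages.append((get_now(), message))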

Final words

In this second part of this short series on Python testing we reviewed the patching mechanism and ran through some of its subtleties. Patching is a really effective technique, and patch-based tests can be found in many different packages. Take your time to become confident with mocks and patching, since they will be among your main tools while working with Python and any other object-oriented language.

As always, I strongly recommend finding some time to read the official documentation of the mock library.

Feedback

Feel free to use the blog Google+ page to comment on the post. The GitHub issues page is the best place to submit corrections.

September 27, 2016 09:00 AM


Semaphore Community

Mocks and Monkeypatching in Python

This article is brought with ❤ to you by Semaphore.

Post originally published on http://krzysztofzuraw.com/. Republished with author's permission.

Introduction

In this post I will look into the essential part of testing — mocks.

First of all, what I want to accomplish here is to give you basic examples of how to mock data using two tools — mock and pytest monkeypatch.

Why bother mocking?

Some parts of our application may have dependencies on other libraries or objects. To isolate the behaviour of our parts, we need to substitute external dependencies. This is where mocking comes in: we mock an external API to check certain behaviours, such as proper return values, that we previously defined.

Mocking function

Let’s say we have a module called function.py:

def square(value):
    return value ** 2

def cube(value): 
    return value ** 3

def main(value): 
    return square(value) + cube(value)

Then let’s see how these functions are mocked using the mock library:

try:
    import mock
except ImportError:
    from unittest import mock

import unittest

from function import square, main


class TestNotMockedFunction(unittest.TestCase):

    @mock.patch('__main__.square', return_value=1)
    def test_function(self, mocked_square):
        # you need to patch the function in the exact place where it is called
        self.assertEqual(square(5), 1)

    @mock.patch('function.square')
    @mock.patch('function.cube')
    def test_main_function(self, mocked_cube, mocked_square):
        # the underlying functions are mocks, so main(5) uses the mocked values
        mocked_square.return_value = 1
        mocked_cube.return_value = 0
        self.assertEqual(main(5), 1)
        mocked_square.assert_called_once_with(5)
        mocked_cube.assert_called_once_with(5)


if __name__ == '__main__':
    unittest.main()

What is happening here? Lines 1-4 make this code compatible between Python 2 and 3. In Python 3, mock is part of the standard library, whereas in Python 2 you need to install it with pip install mock.

In line 13, I patched the square function. You have to remember to patch it in the same place you use it. For instance, I’m calling square(5) in the test itself so I need to patch it in __main__. This is the case if I’m running this by using python tests/test_function.py. If I’m using pytest for that, I need to patch it as test_function.square.

In lines 18-19, I patch the square and cube functions in their module because they are used in the main function. Note that stacked patch decorators are applied bottom-up, so the decorator closest to the function supplies the first mock argument. The last two asserts come from the mock library, and are there to make sure that the mocks were called with proper values.

The same can be accomplished using monkeypatching for py.test:

import function
from function import square, main

def test_function(monkeypatch):
    monkeypatch.setattr('test_function_pytest.square', lambda x: 1)
    assert square(5) == 1

def test_main_function(monkeypatch):
    monkeypatch.setattr(function, 'square', lambda x: 1)
    monkeypatch.setattr(function, 'cube', lambda x: 0)
    assert main(5) == 1

As you can see, I'm using monkeypatch.setattr to set up a replacement (here a lambda with a fixed return value) for the given functions. I still need to monkeypatch them in the proper places — the test_function_pytest and function modules.

Mocking classes

I have a module called square:

import math

class Square(object):
    def __init__(self, radius):
        self.radius = radius

    def calculate_area(self):
        return math.sqrt(self.radius) * math.pi

and here are the mocks using the standard library:

try:
    import mock
except ImportError:
    from unittest import mock

import unittest

from square import Square


class TestClass(unittest.TestCase):

    @mock.patch('__main__.Square')  # depends on which file this is run from
    def test_mocking_instance(self, mocked_instance):
        mocked_instance = mocked_instance.return_value
        mocked_instance.calculate_area.return_value = 1
        sq = Square(100)
        self.assertEqual(sq.calculate_area(), 1)


    def test_mocking_classes(self):
        sq = Square
        sq.calculate_area = mock.MagicMock(return_value=1)
        self.assertEqual(sq.calculate_area(), 1)

    @mock.patch.object(Square, 'calculate_area')
    def test_mocking_class_methods(self, mocked_method):
        mocked_method.return_value = 20
        self.assertEqual(Square.calculate_area(), 20)

if __name__ == '__main__':
    unittest.main()

At line 13, I patch the class Square. Lines 15 and 16 present mocking an instance: mocked_instance is a mock object which returns another mock by default, and on that returned mock I set calculate_area's return_value to 1. In line 23, I'm using MagicMock, which is a normal mock class, except that it also supports magic methods. Lastly, I use patch.object to mock the method in the Square class.

The same using pytest:

try: 
    from mock import MagicMock 
except ImportError: 
    from unittest.mock import MagicMock

from square import Square

def test_mocking_class_methods(monkeypatch):
    monkeypatch.setattr('test_class_pytest.Square.calculate_area', lambda: 1)
    assert Square.calculate_area() ==  1


def test_mocking_classes(monkeypatch):
    monkeypatch.setattr('test_class_pytest.Square', MagicMock(Square))
    sq = Square
    sq.calculate_area.return_value = 1
    assert sq.calculate_area() ==  1

The issue here is with test_mocking_class_methods, which works well in Python 3, but not in Python 2.

All examples can be found in this repo.

If you have any questions and comments, feel free to leave them in the section below.

References:

  1. What is Mocking?
  2. Mocking With Kung Fu Panda

This article is brought with ❤ to you by Semaphore.

September 27, 2016 07:31 AM


Experienced Django

Shallow Dive into Django ORM

A Closer Look at the Django ORM and Many-To-Many Relationships

In the last post I worked some on the data model for the KidsTasks app and discovered that a many-to-many relationship would not allow multiple copies of the same task to exist in a given schedule. Further reading showed me, without much explanation, that using a “through” parameter on the relationship definition fixed that. In this post I want to take a closer look at what’s going on in that django model magic.

Django Shell

As part of my research for this topic, I was led to a quick description of the Django shell, which is great for testing out ideas and playing with the models you're developing. I found a good description here (which also gives a look at filters and QuerySets).

Additionally, I'll note for anyone wanting to play along at home that the following sequence of commands was quite helpful to have handy when testing different models.

 $ rm tasks/migrations db.sqlite3 -rf
 $ ./manage.py makemigrations tasks
 $ ./manage.py migrate
 $ ./manage.py shell
 Python 3.4.3 (default, Oct 14 2015, 20:33:09)
 [GCC 4.8.4] on linux
 Type "help", "copyright", "credits" or "license" for more information.
 (InteractiveConsole)

Many To Many without an Intermediate Class

I’ll start by examining what happened with my original model design where a DayOfWeekSchedule had a ManyToMany relationship with Task.

Simple Solution Code

The simplified model I’ll use here looks like this.

class Task(models.Model):
    name = models.CharField(max_length=256)
    required = models.BooleanField()

    def __str__(self):
        return self.name


class DayOfWeekSchedule(models.Model):
    tasks = models.ManyToManyField(Task)
    name = models.CharField(max_length=20)

    def __str__(self):
        return self.name

Note that the ManyToMany field directly accesses the Task class. (Also note that I retained the __str__ methods to make the shell output more meaningful.)

Experiment

In the shell experiment shown in the listing below, I set up a few Tasks and a couple of DayOfWeekSchedules and then add "first task" and "second task" to one of the schedules. Once this is done, I attempt to add "first task" to the schedule again, and we see that it does not have the desired effect.

>>> # import our models
>>> from tasks.models import Task, DayOfWeekSchedule
>>>
>>> # populate our database with some simple tasks and schedules
>>> Task.objects.create(name="first task", required=False)
<Task: first task>
>>> Task.objects.create(name="second task", required=True)
<Task: second task>
>>> Task.objects.create(name="third task", required=False)
<Task: third task>
>>> DayOfWeekSchedule.objects.create(name="sched1")
<DayOfWeekSchedule: sched1>
>>> DayOfWeekSchedule.objects.create(name="sched2")
<DayOfWeekSchedule: sched2>
>>> Task.objects.all()
<QuerySet [<Task: first task>, <Task: second task>, <Task: third task>]>
>>> DayOfWeekSchedule.objects.all()
<QuerySet [<DayOfWeekSchedule: sched1>, <DayOfWeekSchedule: sched2>]>
>>>
>>> # add a task to a schedule
>>> s = DayOfWeekSchedule.objects.get(name='sched2')
>>> t = Task.objects.get(name='first task')
>>> s.tasks.add(t)
>>> s.tasks.all()
<QuerySet [<Task: first task>]>
>>>
>>> # add other task to that schedule
>>> t = Task.objects.get(name='second task')
>>> s.tasks.add(t)
>>> s.tasks.all()
<QuerySet [<Task: first task>, <Task: second task>]>
>>>
>>> # attempt to add the first task to the schedule again
>>> s = DayOfWeekSchedule.objects.get(name='sched2')
>>> t = Task.objects.get(name='first task')
>>> s.tasks.add(t)
>>> s.tasks.all()
<QuerySet [<Task: first task>, <Task: second task>]>

Note that at the end, we still only have a single copy of “first task” in the schedule.

Many To Many with an Intermediate Class

Now we’ll retry the experiment with the “through=” intermediate class specified in the ManyToMany relationship.

Not-Quite-As-Simple Solution Code

The model code for this is quite similar.  Note the addition of the “through=” option and of the DayTask class.

from django.db import models


class Task(models.Model):
    name = models.CharField(max_length=256)
    required = models.BooleanField()

    def __str__(self):
        return self.name


class DayOfWeekSchedule(models.Model):
    tasks = models.ManyToManyField(Task, through='DayTask')
    name = models.CharField(max_length=20)

    def __str__(self):
        return self.name


class DayTask(models.Model):
    task = models.ForeignKey(Task)
    schedule = models.ForeignKey(DayOfWeekSchedule)

Experiment #2

This script is as close as possible to the first one, the only difference being the extra steps we need to take to add the ManyToMany relationship. We need to manually create a DayTask object, initializing it with the Task and Schedule objects, and then save it. While this is slightly more cumbersome in the code, it does produce the desired result: two copies of "first task" are present in the schedule at the end.

>>> # import our models
>>> from tasks.models import Task, DayOfWeekSchedule, DayTask
>>>
>>> # populate our database with some simple tasks and schedules
>>> Task.objects.create(name="first task", required=False)
<Task: first task>
>>> Task.objects.create(name="second task", required=True)
<Task: second task>
>>> Task.objects.create(name="third task", required=False)
<Task: third task>
>>> DayOfWeekSchedule.objects.create(name="sched1")
<DayOfWeekSchedule: sched1>
>>> DayOfWeekSchedule.objects.create(name="sched2")
<DayOfWeekSchedule: sched2>
>>> Task.objects.all()
<QuerySet [<Task: first task>, <Task: second task>, <Task: third task>]>
>>> DayOfWeekSchedule.objects.all()
<QuerySet [<DayOfWeekSchedule: sched1>, <DayOfWeekSchedule: sched2>]>
>>>
>>> # add a task to a schedule
>>> s = DayOfWeekSchedule.objects.get(name='sched2')
>>> t = Task.objects.get(name='first task')
>>> # cannot simply add directly, must create intermediate object see
>>> # https://docs.djangoproject.com/en/1.9/topics/db/models/#extra-fields-on-many-to-many-relationships
>>> # s.tasks.add(t)
>>> d1 = DayTask(task=t, schedule=s)
>>> d1.save()
>>> s.tasks.all()
<QuerySet [<Task: first task>]>
>>>
>>> # add other task to that schedule
>>> t = Task.objects.get(name='second task')
>>> dt2 = DayTask(task=t, schedule=s)
>>> dt2.save()
>>> # s.tasks.add(t)
>>> s.tasks.all()
<QuerySet [<Task: first task>, <Task: second task>]>
>>>
>>> # attempt to add the first task to the schedule again
>>> s = DayOfWeekSchedule.objects.get(name='sched2')
>>> t = Task.objects.get(name='first task')
>>> dt3 = DayTask(task=t, schedule=s)
>>> dt3.save()
>>> s.tasks.all()
<QuerySet [<Task: first task>, <Task: second task>, <Task: first task>]>

But…Why?

The short answer is that I’m not entirely sure why the intermediate class is needed to allow multiple instances.  It’s fairly clear that it is tied to how the Django code manages those relationships.  Evidence confirming that can be seen in the migration script generated for each of the models.

The first model generates these operations:

operations = [
    migrations.CreateModel(
        name='DayOfWeekSchedule',
        fields=[
            ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
            ('name', models.CharField(max_length=20)),
        ],
    ),
    migrations.CreateModel(
        name='Task',
        fields=[
            ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
            ('name', models.CharField(max_length=256)),
            ('required', models.BooleanField()),
        ],
    ),
    migrations.AddField(
        model_name='dayofweekschedule',
        name='tasks',
        field=models.ManyToManyField(to='tasks.Task'),
    ),
]

Notice the final AddField call which adds “tasks” to the “dayofweekschedule” model directly.

The second model (shown above) generates a slightly different set of migration operations:

operations = [
    migrations.CreateModel(
        name='DayOfWeekSchedule',
        fields=[
            ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
            ('name', models.CharField(max_length=20)),
        ],
    ),
    migrations.CreateModel(
        name='DayTask',
        fields=[
            ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
            ('schedule', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to='tasks.DayOfWeekSchedule')),
        ],
    ),
    migrations.CreateModel(
        name='Task',
        fields=[
            ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
            ('name', models.CharField(max_length=256)),
            ('required', models.BooleanField()),
        ],
    ),
    migrations.AddField(
        model_name='daytask',
        name='task',
        field=models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to='tasks.Task'),
    ),
    migrations.AddField(
        model_name='dayofweekschedule',
        name='tasks',
        field=models.ManyToManyField(through='tasks.DayTask', to='tasks.Task'),
    ),
]

This time it adds the task field to the daytask model and the tasks field to the dayofweekschedule model.  I have to admit here that I really wanted this to show the DayTask object being used in the DayOfWeekSchedule class as a proxy, but that's not the case.

Examining the databases generated by these two models showed no significant differences there, either.

A Quick Look at the Source

One of the beauties of working with open source software is the ability to dive in and see for yourself what’s going on.  Looking at the Django source, you can find the code that adds a relationship in django/db/models/fields/related_descriptors.py (at line 918 in the version I checked out).

        def add(self, *objs):
            ... stuff deleted ...
            self._add_items(self.source_field_name, 
                            self.target_field_name, *objs)

(Actually, _add_items can be called twice: once for a forward and once for a reverse relationship.)  Looking at _add_items (line 1041 in my copy), we see, after building the list of new_ids to insert, this chunk of code:

                db = router.db_for_write(self.through, 
                                         instance=self.instance)
                vals = (self.through._default_manager.using(db)
                        .values_list(target_field_name, flat=True)
                        .filter(**{
                            source_field_name: self.related_val[0],
                            '%s__in' % target_field_name: new_ids,
                        }))
                new_ids = new_ids - set(vals)

which I suspect provides the difference.  This code gets the list of current values in the relation table and removes that set from the set of new_ids.  I believe that the filter here will respond differently if we have an intermediate class defined.  NOTE: I did not run this code live to test this theory, so if I'm wrong, feel free to point out how and where in the comments.

Even if this is not quite correct, after walking through some code, I’m satisfied that the intermediate class definitely causes some different behavior internally in Django.

Next time I’ll jump back into the KidsTasks code.

Thanks for reading!

September 27, 2016 01:57 AM


Daniel Bader

Python Code Review: Unplugged – Episode 2


This is the second episode of my video code review series where I record myself giving feedback and refactoring a reader’s Python code.

The response to the first Code Review: Unplugged video was super positive. I got a ton of emails and comments on YouTube saying that the video worked well as a teaching tool and that I should do more of them.

And so I did just that 😃. Milton sent me a link to his Python 3 project on GitHub and I recorded another code review based on his code. You can watch it below:

Milton is on the right track with his Python journey. I liked how he split up his web scraper into functions that each handle a different phase, like fetching the HTML, parsing it, and generating the output file.

The main thing that this code base could benefit from would be consistent formatting. Making the formatting as regular and consistent as possible really helps with keeping the “mental overhead” low when you’re working on the code or handing it off to someone else.

And the beautiful thing is that there’s an easy fix for this, too. I demo a tool called Flake8 in the video. Flake8 is a code linter and code style checker – and it’s great for making sure your code has consistent formatting and avoids common pitfalls or anti-patterns.

You can even integrate Flake8 into your editing environment so that it checks your code as you write it.

(Shameless plug: The book I’m working on has a whole chapter on integrating Flake8 into the Sublime Text editor. Check it out if you’d like to learn how to set up a Python development environment just like the one I’m using in the video).

Besides formatting, the video also covers things like writing a great GitHub README, how to name functions and modules, and the use of constants to simplify your Python code. So be sure to watch the whole thing when you get the chance.

Again, I left the video completely unedited. That’s why I’m calling this series Code Review: Unplugged. It’s definitely not a polished tutorial or course. But based on the feedback I got so far that seems to be part of the appeal.


One more quick tip for you: You can turn these videos into a fun Python exercise for yourself. Just pause the video before I dig into the code and do your own code review first. Spend 10 to 20 minutes taking notes and refactoring the code and then continue with the video to compare your solution with mine. Let me know how this worked out! 😊

September 27, 2016 12:00 AM

September 26, 2016


Django Weblog

Django security releases issued: 1.9.10 and 1.8.15

In accordance with our security release policy, the Django team is issuing Django 1.9.10 and 1.8.15. These releases address a security issue detailed below. We encourage all users of Django to upgrade as soon as possible.

CVE-2016-7401: CSRF protection bypass on a site with Google Analytics

An interaction between Google Analytics and Django's cookie parsing could allow an attacker to set arbitrary cookies leading to a bypass of CSRF protection.

Thanks to Sergey Bobrov for reporting the issue.

Affected supported versions

  • Django 1.9
  • Django 1.8

Django 1.10 and the master development branch are not affected.

Per our supported versions policy, Django 1.7 and older are no longer receiving security updates.

Resolution

Patches to resolve the issue have been applied to Django's 1.9 and 1.8 release branches. The patches may be obtained from the following changesets:

The following new releases have been issued:

The PGP key ID used for these releases is Tim Graham: 1E8ABDC773EDE252.

General notes regarding security reporting

As always, we ask that potential security issues be reported via private email to security@djangoproject.com, and not via Django's Trac instance or the django-developers list. Please see our security policies for further information.

September 26, 2016 06:41 PM


Gocept Weblog

Last minute information for remote Sprinters for the Zope Resurrection Sprint

As the Zope Resurrection sprint is approaching, it seems useful to share some information on the schedule for the three days in Halle. Since some sprinters cannot join on site but might want to participate remotely, a few key facts might come in handy.

Etherpad

There is the Etherpad, where the current list of topics is collected. Most of the stories we are going to tackle will be based on this list. In case you have something to add, or are particularly interested in a specific topic, it is a good idea to add your thoughts before the start of the sprint.

IRC Channel

During the sprint we will communicate via the #sprint channel on irc.freenode.net, so additional information and questions can be placed there.

Google Hangout

As the sprint is also intended to foster discussions about the future of Zope, we want to encourage as many people as possible to join. Therefore, we have a hangout where we will meet.

Scheduled Discussions

At the moment we plan the following session:

So in case you want to contribute remotely to the sprint, please join us via one of the three channels above.

 


September 26, 2016 04:17 PM


Curtis Miller

An Introduction to Stock Market Data Analysis with Python (Part 2)

This post is the second in a two-part series on stock data analysis using Python, based on a lecture I gave on the subject for MATH 3900 (Data Mining) at the University of Utah (read part 1 here). In these posts, I will discuss basics such as obtaining the data from Yahoo! Finance using pandas… Read more: An Introduction to Stock Market Data Analysis with Python (Part 2)

September 26, 2016 03:00 PM


Doug Hellmann

copy — Duplicate Objects — PyMOTW 3

The copy module includes two functions, copy() and deepcopy() , for duplicating existing objects. Read more… This post is part of the Python Module of the Week series for Python 3. See PyMOTW.com for more articles from the series.
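A quick illustration of the difference between the two functions:

import copy

original = {'numbers': [1, 2, 3]}

shallow = copy.copy(original)    # new dict, but it shares the inner list
deep = copy.deepcopy(original)   # new dict with its own copy of the list

original['numbers'].append(4)
print(shallow['numbers'])  # [1, 2, 3, 4] - shared with the original
print(deep['numbers'])     # [1, 2, 3]    - fully independent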

September 26, 2016 01:00 PM


Mike Driscoll

PyDev of the Week: Katie McLaughlin

This week we welcome Katie McLaughlin (@glasnt) as our PyDev of the Week! She is a core developer of the BeeWare project. You should take a moment and check out her Github profile to see what fun projects she’s a part of. Katie also has a fun little website and was a speaker at PyCon 2016. Let’s take a few moments to get to know her better!


Can you tell us a little about yourself (hobbies, education, etc):

G’day! I’m Australian, originally from Brisbane, but now living in Sydney. I’ve got a Bachelor of Information Technology, and I’ve been in the tech industry for going on ten years now. I’ve been in a bunch of different roles and technologies, but mostly in web hosting and cloud stuff. When I’m not on a computer or attending conferences, I enjoy cooking and making tapestries.

Why did you start using Python?

To fix a bug in a bit of in-house code! There was a bug in an old script, and I saw the “#!/usr/bin/env python” and learnt from there. I didn’t go back to Python for a few years, but just after I was accepted to PyCon Australia 2015, I thought I should brush up on what little I knew. That’s about a year ago now, and it’s now my go-to language for scripting. I had previously used Ruby for years, and I only occasionally still automatically type “puts” instead of “print”.

What other programming languages do you know and which is your favorite?

Well! Based on just languages that I’ve been paid to do, I know JavaScript, Haskell, Scala, C, Python, Ruby, Perl, Bash/Shell, Powerscript, Powershell, PL/SQL, and probably a few others in there as well. Add a dozen or so other languages from high school and university (mostly Pascal, Lisp, Poplog, Assembly, ActionScript, C#, Java), and there’s a lot.

But what languages do I know? That’s a tough one. Personally I’d define knowing a language as having a working knowledge of it. Put any language in front of me and I could probably work it out, but writing is completely different.

Given that, I’d say I know JavaScript, Haskell, Python, Ruby & Bash. #polyglotLife

But as for a favourite, I know I adored Poplog back in the day, but I really don’t play favourites with languages. I use a programming language in an environment to articulate a solution specific to that environment. Using a favourite language in an environment where it doesn’t belong doesn’t do anyone any favours. The right tool for the job, etc 🙂

What projects are you working on now?

My major open source project right now is BeeWare, of which I’m a core developer. You may have heard of it as that project with the shiny coins. BeeWare is a set of tools and libraries that allow you to write an application in Python and deploy it anywhere. Not just on the web, but on Android and iOS.

I’m really excited by what Russell Keith-Magee, the founding Apiarist of the BeeWare project, has been able to achieve so far. It’s very much a work in progress, but it has a bright future.

I’m also the core dev on two other projects: octohatrack, an application that shows the total number of contributors to a GitHub project, not just those who contribute code to master; and emojificate, a module and Django template helper that helps to make emoji more accessible on the web.

Which Python libraries are your favorite (core or 3rd party)?

I do enjoy the usability of requests, and unicodedata is also fun. I’ve been spending far too much time with boto3 recently, though.

Where do you see Python going as a programming language?

Python, now 25 years old, has generally been a server-only language. With the advent of Django about 10 years ago, it moved into the web. And there are also foundations of Python in other spaces, such as data science and education.

However, Python runs the risk of being left behind as more development happens away from the server. A recent IEEE survey of languages puts Python at the top of the list, but without a solution for embedded systems or mobile.

BeeWare can solve the problem of Python on mobile, and Micropython has coverage over the embedded space. Both these projects should be given more attention and work so that Python can continue to be around for many years to come.

Is there anything else you’d like to say?

The Python community is wonderful. I’m only relatively new here, but the community has embraced me with open arms — the django community especially — and I feel at home here more than I have in any other community. It’s wonderful.

Thanks so much for doing the interview!

September 26, 2016 12:30 PM

September 25, 2016


François Dion

Something for your mind: Polymath Podcast Episode 001

Two topics will be covered:

Chipmusic, limitations and creativity

Numfocus (Open code = better science)


The Numfocus interview was recorded at PyData Carolinas 2016. There will be a future episode covering the keynotes, tutorials, talks and lightning talks later this year. This interview was really more about open source and less about PyData.

The episode concludes with Learn more, on Claude Shannon and Harry Nyquist.

Something for your mind is available on

art·chiv.es
/'ärt,kīv/

at artchiv.es/s4ym/


Francois Dion
@f_dion

September 25, 2016 10:32 PM


Obey the Testing Goat

Plans for the second edition

The second edition was mostly prompted by the announcement by Mozilla that they were shutting down Persona in November 2016. Given that it would affect almost all the chapters from 15 through 21, it seemed a good excuse to do a full second edition rather than just an update.

Here, in brief, is an outline of the plan:

Chapter rewrites:

Minor updates + changes:

That's it, in very brief. You can read more on the google group, and feel free to join in the discussion there too, or here. Let me know what you think!

September 25, 2016 05:52 PM


Abu Ashraf Masnun

Introduction to Django Channels

Django is a brilliant web framework. In fact it is my favourite one, for various reasons. A year and a half ago, I switched to Python and Django for all my web development. I am a big fan of the ecosystem and the many third party packages. In particular I use Django REST Framework whenever I need to create APIs. Having said that, while Django was more than good enough for basic HTTP requests, the web has changed. We now have HTTP/2 and web sockets. Django could not support them well in the past. For the web socket part, I usually had to rely on Tornado or NodeJS (with the excellent Socket.IO library). They are good technologies, but with most of my web apps being in Django, I really wished there were something that could work with Django itself. And then we had Channels. The project is meant to allow Django to support HTTP/2, websockets and other protocols with ease.

Concepts

The underlying concept is really simple - there are channels and there are messages, there are producers and there are consumers - the whole system is based on passing messages on to channels and consuming/responding to those messages.

Let’s look at the core components of Django Channels first:

How does it work?

An http request first comes to the Interface Server, which knows how to deal with a specific type of request. For example, for websockets and http, Daphne is a popular interface server. When a new http/websocket request comes to the interface server (daphne in our case), it accepts the request and transforms it into a message. Then it passes the message to the appropriate channel. There are predefined channels for specific types. For example, all http requests are passed to the http.request channel. For incoming websocket messages, there is websocket.receive. So these channels receive the messages when the corresponding types of requests come into the interface server.

Now that we have channels getting filled with messages, we need a way to process these messages and take action (if necessary), right? Yes! For that we write some consumer functions and register them to the channels we want. When messages come to these channels, the consumers are called with the message. They can read the message and act on it.

So far, we have seen how we can read an incoming request. But like all web applications, we should write something back too, no? How do we do that? As it happens, the interface server is quite clever. While transforming the incoming request into a message, it creates a reply channel for that particular client request and registers itself to that channel. Then it passes the reply channel along with the message. When our consumer function reads the incoming message, it can send a response to the reply channel attached to the message. Our interface server is listening on that reply channel, remember? So when a response is sent back to the reply channel, the interface server grabs the message, transforms it into an http response and sends it back to the client. Simple, no?

Writing a Websocket Echo Server

Enough with the theories, let’s get our hands dirty and build a simple echo server. The concept is simple. The server accepts websocket connections, the client writes something to us, we just echo it back. Plain and simple example.

Install Django & Channels
pip install channels

That should do the trick and install Django + Channels. Channels has Django as a dependency, so when you install channels, Django comes with it.

Create An App

Next we create a new django project and app -

django-admin.py startproject djchan
cd djchan
python manage.py startapp realtime
Configure INSTALLED_APPS

We have our Django app ready. We need to add channels and our django app (realtime) to the INSTALLED_APPS list under settings.py. Let’s do that:

INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',

    "channels",
    "realtime"
]
Write our Consumer

After that, we need to start writing a consumer function that will process the incoming websocket messages and send back the response:

# consumers.py 
def websocket_receive(message):
    text = message.content.get('text')
    if text:
        message.reply_channel.send({"text": "You said: {}".format(text)})

The code is simple enough. We receive a message, get its text content (we’re expecting that the websocket connection will send only text data for this example) and then push it back to the reply_channel - just like we planned.

Channels Routing

We have our consumer function ready, now we need to tell Django how to route messages to it. Just like URL routing, we need to define our channel routings.

# routing.py
from channels.routing import route
from .consumers import websocket_receive
 
channel_routing = [
    route("websocket.receive", websocket_receive, path=r"^/chat/"),
]

The code should be self-explanatory. We have a list of route objects. Here we select the channel name (websocket.receive => for receiving websocket messages), pass the consumer function and then configure the optional path. The path is an interesting bit. If we didn’t pass a value for it, the consumer would get all the messages in the websocket.receive channel on any URL. So if someone created a websocket connection to / or /private or /user/1234 - regardless of the url path, we would get all incoming messages. But that’s not our intention, right? So we restrict the path to /chat so only connections made to that url are handled by the consumer. Please note the leading /: unlike url routing, in channels we have to include it.

Configuring The Channel Layers

We have defined a consumer and added it to a routing table. We’re more or less ready. There’s just a final bit of configuration we need to do. We need to tell channels two things - which backend we want to use and where it can find our channel routing.

Let’s briefly talk about the backend. The messages and the channels - Django needs some sort of data store or message queue to back this system. By default Django can use an in-memory backend which keeps these things in memory, but if you are building a distributed app that needs to scale, you need something else. Redis is a popular and proven piece of technology for these kinds of scenarios. In our case we will use the Redis backend.

So let’s install that:

pip install asgi_redis

And now we put this in our settings.py:

CHANNEL_LAYERS = {
    "default": {
        "BACKEND": "asgi_redis.RedisChannelLayer",
        "CONFIG": {
            "hosts": [("localhost", 6379)],
        },
        "ROUTING": "realtime.routing.channel_routing",
    },
}
Running The Servers

Make sure that Redis is running (usually running redis-server will start it). Now run the Django app:

python manage.py runserver

In a local environment, when you do runserver - Django launches both the interface server and the necessary background workers (to run the consumer functions in the background). But in production, we should run the workers separately. We will get to that soon.

Trying it Out!

Once our dev server starts up, let’s open up the web app. If you haven’t added any django views, no worries, you should still see the “It Worked!” welcome page of Django and that should be fine for now. We need to test our websocket and we are smart enough to do that from the dev console. Open up your Chrome Devtools (or Firefox | Safari | any other browser’s dev tools) and navigate to the JS console. Paste the following JS code:


socket = new WebSocket("ws://" + window.location.host + "/chat/");
socket.onmessage = function(e) {
    alert(e.data);
}
socket.onopen = function() {
    socket.send("hello world");
}

If everything worked, you should get an alert with the message we sent. Since we defined a path, the websocket connection works only on /chat/. Try modifying the JS code and send a message to some other url to see how they don’t work. Also remove the path from our route and see how you can catch all websocket messages from all the websocket connections regardless of which url they were connected to. Cool, no?

Our Custom Channels

We have seen that certain protocols have predefined channels for various purposes. But we are not limited to those. We can create our own channels. We don’t need to do anything fancy to initialize a new channel. We just need to mention a name and send some messages to it. Django will create the channel for us.

from channels import Channel

Channel("thumbnailer").send({
    "image_id": image.id
})

Of course we need corresponding workers to be listening to those channels. Otherwise nothing will happen. Please note that besides working with new protocols, Channels also allows us to create some sort of message based task queue. We create channels for certain tasks and our workers listen to those channels. Then we pass the data to those channels and the workers process them. So for simpler tasks, this could be a nice solution.
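To make this concrete, here is a minimal sketch of the worker side for the thumbnailer channel above. The generate_thumbnail consumer is made up for illustration; the route helper is the same one we used for the websocket consumer:

# consumers.py - a hypothetical consumer for our custom channel
def generate_thumbnail(message):
    image_id = message.content['image_id']
    # ... fetch the image and build the thumbnail here ...
    print("Generating thumbnail for image {}".format(image_id))

# routing.py - register it just like the websocket consumer
channel_routing = [
    route("thumbnailer", generate_thumbnail),
]

A worker started with python manage.py runworker would then pick up these messages and run the consumer in the background.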

Scaling Production Systems

Running Workers Separately

In a production environment, we would want to run the workers separately (since we would not run runserver in production anyway). To run the background workers, we have to run this command:

python manage.py runworker
ASGI & Daphne

In our local environment, the runserver command took care of launching the interface server and background workers. But now we have to run the interface server ourselves. We mentioned Daphne already. It works with the ASGI standard (which is commonly used for HTTP/2 and websockets). Just like wsgi.py, we now need to create an asgi.py module and configure it.

import os
from channels.asgi import get_channel_layer

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "djchan.settings")

channel_layer = get_channel_layer()

Now we can run the server:

daphne djchan.asgi:channel_layer

If everything goes right, the interface server should start running!

ASGI or WSGI

ASGI is still new, while WSGI is battle tested. So you might still want to keep using wsgi for the http-only parts of your app and asgi for the parts where you need channels specific features.

The popular recommendation is to put nginx or any other reverse proxy in front and route the urls to asgi or uwsgi depending on the url or the Upgrade: WebSocket header.

Retries and Celery

The Channels system does not guarantee delivery. If there are tasks which need that certainty, it is highly recommended to use a system like Celery for those parts. Or we can roll our own checks and retry logic if we feel like it.
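As a rough illustration, here is what delegating such a task to Celery might look like. This is only a sketch, assuming a Celery app is already configured for the project; the task name and retry settings are made up for the example:

# tasks.py - a hypothetical Celery task for work that must not be lost
from celery import shared_task

@shared_task(bind=True, max_retries=3, default_retry_delay=5)
def generate_thumbnail(self, image_id):
    try:
        pass  # do the actual thumbnailing work here
    except Exception as exc:
        # Celery re-queues the task until max_retries is exhausted
        raise self.retry(exc=exc)

Unlike a plain channel message, the task is persisted by Celery's broker and retried on failure, which is exactly the guarantee Channels does not give us.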

September 25, 2016 03:27 PM


Krzysztof Żuraw

Archives from memory - libarchive

This blog post is about a Python wrapper around libarchive and how to use it to generate an archive from memory.

Libarchive & python-libarchive-c

If you have ever wanted to create archives in various formats like tar, iso or zip, I bet you have heard about libarchive. It is a widely used archive library written in C.

To use it from Python you can choose from a few libraries, but the one that is currently maintained is called python-libarchive-c. When I was asked at work to implement the feature of adding entries to an archive from memory, I decided to use the existing module and give something back to the community in the form of an open source contribution.

Add entry from memory

To build this feature I had to carefully reread the code examples in libarchive itself. I also got familiar with a few archive formats and their limitations. But enough talking, let’s jump to the code:

import requests
import libarchive

def create_archive_from_memory_file():
    response = requests.get('link', stream=True)

    with libarchive.file_writer('archive.zip', 'zip') as archive:
        archive.add_file_from_memory(
            entry_path='filename',
            entry_size=int(response.headers['Content-Length']),
            entry_data=response.iter_content(chunk_size=1024)
        )

if __name__ == '__main__':
    create_archive_from_memory_file()

My changes have not been released yet, so make sure that you install python-libarchive-c from GitHub like this (to run this script you also need the requests library):

$ pip install git+https://github.com/Changaco/python-libarchive-c

In this snippet, I use a requests feature that doesn't require loading the whole content of the response into memory: I pass the argument stream=True and then use response.iter_content(chunk_size=1024). The rest of the code simply calls add_file_from_memory with a path (entry_path) and the size of the entry in the archive (entry_size).

Under the hood, python-libarchive-c uses ctypes (a foreign function interface) to call libarchive functions. First it sets the path of the entry, then its size, filetype and the permissions with which the file will be saved in the archive. Then it writes the header and iterates through entry_data in chunks, writing each one. At the end the header is finalized and the archive is ready for the user.

To see it in action, save the snippet above as example.py and run it:

$ python example.py
$ ls -la
-rw-r--r--. 1 kzuraw kzuraw 11M 09-24 13:04 archive.zip
-rw-rw-r--. 1 kzuraw kzuraw 511 09-24 12:59 example.py
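If you would rather verify the archive from Python than from the shell, python-libarchive-c also provides a file_reader helper. A quick sanity check might look like this (a sketch; pathname and size are the entry attributes in the version I used):

import libarchive

# list the entries of the archive we just created
with libarchive.file_reader('archive.zip') as archive:
    for entry in archive:
        print(entry.pathname, entry.size)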

That's all for this week. Feel free to comment and if you have any questions don't hesitate to ask them.

Special thanks to Kasia for being the editor of this post. Thank you.

Cover image by Archivo-FSP under CC BY-SA 3.0.

September 25, 2016 07:00 AM


Python 4 Kids

Python for Kids: Python 3 – Project 10

Using Python 3 in Project 10 of Python For Kids For Dummies

In this post I talk about the changes that need to be made to the code of Project 10 of my book Python for Kids for Dummies in order for it to work with Python 3. The main difference between the Python 2.7 and Python 3 code for this project is that Python 2.7 uses raw_input, which has been renamed to input in Python 3. Most of the code in Project 10 will work with this one change. However, in a lot of cases what Python outputs in Python 3 is different from the output in Python 2.7. This project has a lot of code. In order to shorten the length of this post I am only showing the Python 3 versions of the longer pieces (rather than both Python 2.7 (from the book) and Python 3). Look at the book to see the Python 2.7 code (it’s very similar).

Disclaimer

Some people want to use my book Python for Kids for Dummies to learn Python 3. I am working through the code in the existing book, highlighting changes from Python 2 to Python 3 and providing code that will work in Python 3. If you are using Python 2.7 you can ignore this post. This post is only for people who want to take the code in my book Python for Kids for Dummies and run it in Python 3.

######## Page 283

The code on this page uses raw_input, which has been renamed to input in Python 3. You can either replace all occurrences of raw_input with input or add a line:

raw_input = input 

at the start of the relevant code. In order to reduce the amount of code being repeated, I am adding raw_input = input to the Constants section of the code. You will need to remember that all of the later code assumes that this line has been added.

  
"""
Python 2.7
math_trainer.py
Train your times tables.
Initial Features:
* Print out times table for a given number.
* Limit tables to a lower number (default is 1)
and an upper number (default is 12).
* Pose test questions to the user
* Check whether the user is right or wrong
* Track the user's score.
Brendan Scott
February 2015
"""
#### Constants Section
TEST_QUESTION = (4, 6)
QUESTION_TEMPLATE = "What is %sx%s? "
#### Function Section
#### Testing Section
question = TEST_QUESTION
prompt = QUESTION_TEMPLATE%question
correct_answer = question[0]*question[1] # indexes start from 0
answer = raw_input(prompt)
if int(answer)== correct_answer:
    print("Correct!")
else:
    print("Incorrect")

>>> ================================ RESTART ================================
>>>
What is 4x6? 24
Correct!
>>> ================================ RESTART ================================
>>>
What is 4x6? 25
Incorrect

     

"""
Python 3
math_trainer.py
Train your times tables.
Initial Features:
* Print out times table for a given number.
* Limit tables to a lower number (default is 1)
and an upper number (default is 12).
* Pose test questions to the user
* Check whether the user is right or wrong
* Track the user's score.
Brendan Scott
February 2015
"""
#### Constants Section
raw_input = input # this line added
TEST_QUESTION = (4, 6)
QUESTION_TEMPLATE = "What is %sx%s? "

#### Function Section

#### Testing Section
question = TEST_QUESTION
prompt = QUESTION_TEMPLATE%question
correct_answer = question[0]*question[1] # indexes start from 0
answer = raw_input(prompt)
if int(answer)== correct_answer:
    print("Correct!")
else:
    print("Incorrect")

>>> ================================ RESTART ================================
>>>
What is 4x6? 24
Correct!
>>> ================================ RESTART ================================
>>>
What is 4x6? 25
Incorrect

######## Page 286-296

All code on these pages is the same, and all output from the code is the same in Python 3 as in Python 2.7.
Remember that for the code to work in Python 3 an additional line

raw_input = input

was added in the Constants section of the code.

######## Page 297

All code on this page is the same, and all output from the code is the same in Python 3 as in Python 2.7.

######## Page 298
The code in this section is different in Python 2.7 v Python 3.
The Python 2.7 code assumed that there was a list and that a while loop repeatedly removed things from that list. When everything had been removed, the loop stopped. This was achieved by a test

batch != []

that is, stop when the variable batch is an empty list.
Ultimately, what is in batch comes from a call to the range builtin:

tables_to_print = range(1, upper+1)

In Python 2.7 this is a list which is generated in full and stored in tables_to_print. In Python 3 it’s not. Rather, the range builtin generates the values that are needed at the time they are needed – not before. In Python 3 batch is a “range object”, not a list. And, while batch gets shorter and shorter, it’s never going to be an empty list (it would need to stop being a range and start being a list), no matter how long the program runs. To get this code working in Python 3 you can either:
(A) explicitly make batch a list by changing the line:

tables_to_print = range(1, upper+1)

to

tables_to_print = list(range(1, upper+1))

this changes all the relevant variables (and, in particular batch) into lists so the condition in the while loop will evaluate as you expect; or

(B) change the condition in the while loop to check the length of batch rather than whether or not it is an empty list. That is change:

 
    while batch != []: # stop when there's no more to print

to

 
    while len(batch) > 0: # stop when there's no more to print

That is, once the length is 0 (ie no more elements to display), stop the loop. I think this is the better of the two options because it makes the test independent of the type of variable used to keep track of batches.
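If you want to see why the original test can never succeed in Python 3, a quick session in the interactive interpreter shows that slicing a range gives back another range, which never compares equal to a list:

>>> batch = range(1, 13)[:5]
>>> batch
range(1, 6)
>>> batch != []   # always True, so the while loop would never stop
True
>>> len(batch)    # but len() works on both ranges and lists
5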

Remember that the Python 3 code has an additional line

 
raw_input = input

in the Constants section of the code.

 
#Python 2.7

TIMES_TABLE_ENTRY = "%2i x %2i = %3i "

def display_times_tables(upper=UPPER):
    """
    Display the times tables up to UPPER
    """
    tables_per_line = 5
    tables_to_print = range(1, upper+1)
    # get a batch of 5 to print
    batch = tables_to_print[:tables_per_line]
    # remove them from the list
    tables_to_print = tables_to_print[tables_per_line:]
    while batch != []: # stop when there's no more to print
        for x in range(1, upper+1):
            # this goes from 1 to 12 and is the rows
            accumulator = []
            for y in batch:
                # this covers only the tables in the batch
                # it builds the columns
                accumulator.append(TIMES_TABLE_ENTRY%(y, x, x*y))
            print("".join(accumulator)) # print one row
        print("\n") # vertical separation between blocks of tables.
        # now get another batch and repeat.
        batch = tables_to_print[:tables_per_line]
        tables_to_print = tables_to_print[tables_per_line:]

     

#Python 3                            
TIMES_TABLE_ENTRY = "%2i x %2i = %3i "

def display_times_tables(upper=UPPER):
    """
    Display the times tables up to UPPER
    """
    tables_per_line = 5
    tables_to_print = list(range(1, upper+1))
    # get a batch of 5 to print
    batch = tables_to_print[:tables_per_line]
    # remove them from the list
    tables_to_print = tables_to_print[tables_per_line:]
    while len(batch)>0: # stop when there's no more to print
        for x in range(1, upper+1):
            # this goes from 1 to 12 and is the rows
            accumulator = []
            for y in batch:
                # this covers only the tables in the batch
                # it builds the columns
                accumulator.append(TIMES_TABLE_ENTRY%(y, x, x*y))
            print("".join(accumulator)) # print one row
        print("\n") # vertical separation between blocks of tables.
        # now get another batch and repeat.
        batch = tables_to_print[:tables_per_line]
        tables_to_print = tables_to_print[tables_per_line:]
        

######## Page 302, 304
All code on these pages is the same, and all output from the code is the same in Python 3 as in Python 2.7.
Remember that the Python 3 code has an additional line

   
raw_input = input 

in the Constants section of the code.

######## Page 305-306

All code on these pages is the same, and all output from the code is the same in Python 3 as in Python 2.7.

######## Page 307
All code on this page is the same, and all output from the code is the same in Python 3 as in Python 2.7.
Remember that the Python 3 code has an additional line

   
raw_input = input

in the Constants section of the code.

#########################################
### Full Code:
#########################################

The code in this section is different in Python 2.7 v Python 3.
The Python 3 code has an additional line
raw_input = input
in the Constants section of the code and the line

   
    while batch != []: # stop when there's no more to print

has been changed to

   
    while len(batch) > 0: # stop when there's no more to print
     
"""
math_trainer.py (Python 2.7)
Train your times tables.
Initial Features:
* Print out times table for a given number.
* Limit tables to a lower number (default is 1) and
an upper number (default is 12).
* Pose test questions to the user
* Check whether the user is right or wrong
* Track the user's score.
Brendan Scott
February 2015
"""

#### Imports Section
import random
import sys
import time

#### Constants Section
TEST_QUESTION = (4, 6)
QUESTION_TEMPLATE = "What is %sx%s? "
LOWER = 1
UPPER = 12
MAX_QUESTIONS = 10 # for testing, you can increase it later
TIMES_TABLE_ENTRY = "%2i x %2i = %3i "

INSTRUCTIONS = """Welcome to Math Trainer
This application will train you on your times tables.
It can either print one or more of the tables for you
so that you can revise (training) or it can test
you on your times tables.
"""
CONFIRM_QUIT_MESSAGE = 'Are you sure you want to quit (Y/n)? '
SCORE_TEMPLATE = "You scored %s (%i%%) in %.1f seconds"

#### Function Section
def make_question_list(lower=LOWER, upper=UPPER, random_order=True):
    """ prepare a list of questions in the form (x,y)
    where x and y are in the range from LOWER to UPPER inclusive
    If random_order is true, rearrange the questions in a random
    order
    """
    spam = [(x+1, y+1) for x in range(lower-1, upper)
                       for y in range(lower-1, upper)]
    if random_order:
        random.shuffle(spam)
    return spam

def display_times_tables(upper=UPPER):
    """
    Display the times tables up to UPPER
    """
    tables_per_line = 5
    tables_to_print = range(1, upper+1)
    # get a batch of 5 to print
    batch = tables_to_print[:tables_per_line]
    # remove them from the list 
    tables_to_print = tables_to_print[tables_per_line:]
    while batch != []: # stop when there's no more to print
        for x in range(1, upper+1):
            # this goes from 1 to 12 and is the rows 
            accumulator = []
            for y in batch:
                # this covers only the tables in the batch
                # it builds the columns
                accumulator.append(TIMES_TABLE_ENTRY%(y, x, x*y))
            print("".join(accumulator)) # print one row
        print("\n") # vertical separation between blocks of tables.
        # now get another batch and repeat. 
        batch = tables_to_print[:tables_per_line]
        tables_to_print = tables_to_print[tables_per_line:]

    
def do_testing():
    """ conduct a round of testing """
    question_list = make_question_list()
    score = 0
    start_time = time.time()
    for i, question in enumerate(question_list):
        if i >= MAX_QUESTIONS:
            break
        prompt = QUESTION_TEMPLATE%question
        correct_answer = question[0]*question[1]
        # indexes start from 0
        answer = raw_input(prompt)

        if int(answer) == correct_answer:
            print("Correct!")
            score = score+1
        else:
            print("Incorrect, should have "+\
                  "been %s"%(correct_answer))

    end_time = time.time()
    time_taken = end_time-start_time
    percent_correct = int(score/float(MAX_QUESTIONS)*100)
    print(SCORE_TEMPLATE%(score, percent_correct, time_taken))

def do_quit():
    """ quit the application"""
    if confirm_quit():
        sys.exit()
    print("In quit (not quitting, returning)")

def confirm_quit():
    """Ask user to confirm that they want to quit
    default to yes 
    Return True (yes, quit) or False (no, don't quit) """
    spam = raw_input(CONFIRM_QUIT_MESSAGE)
    if spam == 'n':
        return False
    else:
        return True    


#### Testing Section

#do_testing()
##display_times_tables()

#### Main Section

if __name__ == "__main__":
    while True:
        print(INSTRUCTIONS)
        raw_input_prompt = "Press: 1 for training,"+\
                           " 2 for testing, 3 to quit.\n"
        selection = raw_input(raw_input_prompt)
        selection = selection.strip()
        while selection not in ["1", "2", "3"]:
            selection = raw_input("Please type either 1, 2, or 3: ")
            selection = selection.strip()

        if selection == "1":
            display_times_tables()
        elif selection == "2":
            do_testing()
        else:  # has to be 1, 2 or 3 so must be 3 (quit)
            do_quit()

     

"""
math_trainer.py (Python 3)
Train your times tables.
Initial Features:
* Print out times table for a given number.
* Limit tables to a lower number (default is 1) and
an upper number (default is 12).
* Pose test questions to the user
* Check whether the user is right or wrong
* Track the user's score.
Brendan Scott
February 2015
"""

#### Imports Section
import random
import sys
import time

#### Constants Section
raw_input = input
TEST_QUESTION = (4, 6)
QUESTION_TEMPLATE = "What is %sx%s? "
LOWER = 1
UPPER = 12
MAX_QUESTIONS = 10 # for testing, you can increase it later
TIMES_TABLE_ENTRY = "%2i x %2i = %3i "

INSTRUCTIONS = """Welcome to Math Trainer
This application will train you on your times tables.
It can either print one or more of the tables for you
so that you can revise (training) or it can test
you on your times tables.
"""
CONFIRM_QUIT_MESSAGE = 'Are you sure you want to quit (Y/n)? '
SCORE_TEMPLATE = "You scored %s (%i%%) in %.1f seconds"

#### Function Section
def make_question_list(lower=LOWER, upper=UPPER, random_order=True):
    """ prepare a list of questions in the form (x,y)
    where x and y are in the range from LOWER to UPPER inclusive
    If random_order is true, rearrange the questions in a random
    order
    """
    spam = [(x+1, y+1) for x in range(lower-1, upper)
                       for y in range(lower-1, upper)]
    if random_order:
        random.shuffle(spam)
    return spam

def display_times_tables(upper=UPPER):
    """
    Display the times tables up to UPPER
    """
    tables_per_line = 5
    tables_to_print = range(1, upper+1)
    # get a batch of 5 to print
    batch = tables_to_print[:tables_per_line]
    # remove them from the list 
    tables_to_print = tables_to_print[tables_per_line:]
    while len(batch) > 0: # stop when there's no more to print
        for x in range(1, upper+1):
            # this goes from 1 to 12 and is the rows 
            accumulator = []
            for y in batch:
                # this covers only the tables in the batch
                # it builds the columns
                accumulator.append(TIMES_TABLE_ENTRY%(y, x, x*y))
            print("".join(accumulator)) # print one row
        print("\n") # vertical separation between blocks of tables.
        # now get another batch and repeat. 
        batch = tables_to_print[:tables_per_line]
        tables_to_print = tables_to_print[tables_per_line:]

    
def do_testing():
    """ conduct a round of testing """
    question_list = make_question_list()
    score = 0
    start_time = time.time()
    for i, question in enumerate(question_list):
        if i >= MAX_QUESTIONS:
            break
        prompt = QUESTION_TEMPLATE%question
        correct_answer = question[0]*question[1]
        # indexes start from 0
        answer = raw_input(prompt)

        if int(answer) == correct_answer:
            print("Correct!")
            score = score+1
        else:
            print("Incorrect, should have "+\
                  "been %s"%(correct_answer))

    end_time = time.time()
    time_taken = end_time-start_time
    percent_correct = int(score/float(MAX_QUESTIONS)*100)
    print(SCORE_TEMPLATE%(score, percent_correct, time_taken))

def do_quit():
    """ quit the application"""
    if confirm_quit():
        sys.exit()
    print("In quit (not quitting, returning)")

def confirm_quit():
    """Ask user to confirm that they want to quit
    default to yes 
    Return True (yes, quit) or False (no, don't quit) """
    spam = raw_input(CONFIRM_QUIT_MESSAGE)
    if spam == 'n':
        return False
    else:
        return True    


#### Testing Section

#do_testing()
##display_times_tables()

#### Main Section

if __name__ == "__main__":
    while True:
        print(INSTRUCTIONS)
        raw_input_prompt = "Press: 1 for training,"+\
                           " 2 for testing, 3 to quit.\n"
        selection = raw_input(raw_input_prompt)
        selection = selection.strip()
        while selection not in ["1", "2", "3"]:
            selection = raw_input("Please type either 1, 2, or 3: ")
            selection = selection.strip()

        if selection == "1":
            display_times_tables()
        elif selection == "2":
            do_testing()
        else:  # has to be 1, 2 or 3 so must be 3 (quit)
            do_quit()


September 25, 2016 04:06 AM


Podcast.__init__

Episode 76 - PsychoPy with Jonathan Peirce

Summary

We’re delving into the complex workings of your mind this week on Podcast.__init__ with Jonathan Peirce. He tells us about how he started the PsychoPy project and how it has grown in utility and popularity over the years. We discussed the ways that it has been put to use in myriad psychological experiments, the inner workings of how to design and execute those experiments, and what is in store for its future.

Brief Introduction

Linode Sponsor Banner

Use the promo code podcastinit20 to get a $20 credit when you sign up!

Rollbar Logo

I’m excited to tell you about a new sponsor of the show, Rollbar.

One of the frustrating things about being a developer is dealing with errors… (sigh)

  • Relying on users to report errors
  • Digging thru log files trying to debug issues
  • A million alerts flooding your inbox ruining your day…

With Rollbar’s full-stack error monitoring, you get the context, insights and control you need to find and fix bugs faster. It’s easy to get started tracking the errors and exceptions in your stack. You can start tracking production errors and deployments in 8 minutes or less, and Rollbar works with all major languages and frameworks, including Ruby, Python, Javascript, PHP, Node, iOS, Android and more. You can integrate Rollbar into your existing workflow, such as sending error alerts to Slack or Hipchat, or automatically creating new issues in Github, JIRA, Pivotal Tracker etc.

We have a special offer for Podcast.__init__ listeners. Go to rollbar.com/podcastinit, sign up, and get the Bootstrap Plan free for 90 days. That’s 300,000 errors tracked for free. Loved by developers at awesome companies like Heroku, Twilio, Kayak, Instacart, Zendesk, Twitch and more. Help support Podcast.__init__ and give Rollbar a try today. Go to rollbar.com/podcastinit

Hired Logo

On Hired software engineers & designers can get 5+ interview requests in a week and each offer has salary and equity upfront. With full time and contract opportunities available, users can view the offers and accept or reject them before talking to any company. Work with over 2,500 companies from startups to large public companies hailing from 12 major tech hubs in North America and Europe. Hired is totally free for users and if you get a job you’ll get a $2,000 “thank you” bonus. If you use our special link to sign up, then that bonus will double to $4,000 when you accept a job. If you’re not looking for a job but know someone who is, you can refer them to Hired and get a $1,337 bonus when they accept a job.

Interview with Jonathan Peirce

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA


September 25, 2016 12:15 AM


Nick Craig-Wood

Snake Puzzle Solver

My family know I like puzzles so they gave me this one recently:

Boxed Snake Puzzle

When you take it out the box it looks like this:

Solved Snake Puzzle

And very soon after it looked like this (which explains why I've christened the puzzle "the snake puzzle"):

Flat Snake Puzzle

The way it works is that there is a piece of elastic running through each block. On the majority of the blocks the elastic runs straight through, but on some of them it goes through a 90 degree bend. The puzzle is to fold it back into a cube.

After playing with it a while, I realised that it really is quite hard so I decided to write a program to solve it.

The first thing to do is find a representation for the puzzle. Here is the one I chose:

# definition - number of straight bits, before 90 degree bend
snake = [3,2,2,2,1,1,1,2,2,1,1,2,1,2,1,1,2]
assert sum(snake) == 27

If you look at the picture of it above where it is flattened you can see where the numbers came from. Start from the right hand side.

That also gives us a way of calculating how many combinations there are. At each 90 degree joint, there are 4 possible rotations (ignoring the rotations of the 180 degree blocks) so there are:

4**len(snake)
17179869184

17 billion combinations. That will include some rotations and reflections, but either way it is a big number.

However it is very easy to know when you've gone wrong with this kind of puzzle - as soon as you place a piece outside of the boundary of the 3x3x3 block you know it is wrong and should try something different.

So how to represent the solution? The way I've chosen is to represent it as a 5x5x5 cube. This is larger than it needs to be but if we fill in the edges then we don't need to do any complicated comparisons to see if a piece is out of bounds. This is a simple trick but it saves a lot of code.

I've also chosen to represent the 3d structure not as a 3d array but as a 1D array (or list in python speak) of length 5 x 5 x 5 = 125.

To move in the x direction you add 1, to move in the y direction you add 5 and to move in the z direction you add 25. This simplifies the logic of the solver considerably - we don't need to deal with vectors. For example, the cell at (x=2, y=3, z=1) lives at index 2 + 3*5 + 1*25 = 42.

The basic definitions of the cube look like this:

N = 5
xstride=1    # number of pieces to move in the x direction
ystride=N    # number of pieces to move in the y direction
zstride=N*N  # number of pieces to move in the z direction

In our list we will represent empty space with 0 and space which can't be used with -1:

empty = 0

Now define the empty cube with the boundary round the edges:

# Define cube as 5 x 5 x 5 with filled in edges but empty middle for
# easy edge detection
top = [-1]*N*N
middle = [-1]*5 + [-1,0,0,0,-1]*3 + [-1]*5
cube = top + middle*3 + top

We're going to want a function to turn x, y, z co-ordinates into an index in the cube list:

def pos(x, y, z):
    """Convert x,y,z into position in cube list"""
    return x+y*ystride+z*zstride

So let's see what that cube looks like:

def print_cube(cube, margin=1):
    """Print the cube"""
    for z in range(margin,N-margin):
        for y in range(margin,N-margin):
            for x in range(margin,N-margin):
                v = cube[pos(x,y,z)]
                if v == 0:
                    s = " . "
                else:
                    s = "%02d " % v
                print(s, sep="", end="")
            print()
        print()

print_cube(cube, margin = 0)
-1 -1 -1 -1 -1
-1 -1 -1 -1 -1
-1 -1 -1 -1 -1
-1 -1 -1 -1 -1
-1 -1 -1 -1 -1

-1 -1 -1 -1 -1
-1  .  .  . -1
-1  .  .  . -1
-1  .  .  . -1
-1 -1 -1 -1 -1

-1 -1 -1 -1 -1
-1  .  .  . -1
-1  .  .  . -1
-1  .  .  . -1
-1 -1 -1 -1 -1

-1 -1 -1 -1 -1
-1  .  .  . -1
-1  .  .  . -1
-1  .  .  . -1
-1 -1 -1 -1 -1

-1 -1 -1 -1 -1
-1 -1 -1 -1 -1
-1 -1 -1 -1 -1
-1 -1 -1 -1 -1
-1 -1 -1 -1 -1

Normally we'll print it without the margin.

Now let's work out how to place a segment.

Assuming that the last piece was placed at position, we want to place a segment of the given length in the given direction. Note the assert to check we aren't placing stuff on top of previous pieces, or outside the edges:

def place(cube, position, direction, length, piece_number):
    """Place a segment in the cube"""
    for _ in range(length):
        position += direction
        assert cube[position] == empty
        cube[position] = piece_number
        piece_number += 1
    return position

Let's just try placing some segments and see what happens:

cube2 = cube[:] # copy the cube
place(cube2, pos(0,1,1), xstride, 3, 1)
print_cube(cube2)
01 02 03
 .  .  .
 .  .  .

 .  .  .
 .  .  .
 .  .  .

 .  .  .
 .  .  .
 .  .  .
place(cube2, pos(3,1,1), ystride, 2, 4)
print_cube(cube2)
01 02 03
 .  . 04
 .  . 05

 .  .  .
 .  .  .
 .  .  .

 .  .  .
 .  .  .
 .  .  .
place(cube2, pos(3,3,1), zstride, 2, 6)
print_cube(cube2)
01 02 03
 .  . 04
 .  . 05

 .  .  .
 .  .  .
 .  . 06

 .  .  .
 .  .  .
 .  . 07

The next thing we'll need is to undo a place. You'll see why in a moment.

def unplace(cube, position, direction, length):
    """Remove a segment from the cube"""
    for _ in range(length):
        position += direction
        cube[position] = empty
unplace(cube2, pos(3,3,1), zstride, 2)
print_cube(cube2)
01 02 03
 .  . 04
 .  . 05

 .  .  .
 .  .  .
 .  .  .

 .  .  .
 .  .  .
 .  .  .

Now let's write a function which returns whether a move is valid given a current position and a direction and a length of the segment we are trying to place.

def is_valid(cube, position, direction, length):
    """Returns True if a move is valid"""
    for _ in range(length):
        position += direction
        if cube[position] != empty:
            return False
    return True
is_valid(cube2, pos(3,3,1), zstride, 2)
True
is_valid(cube2, pos(3,3,1), zstride, 3)
False

Given is_valid it is now straightforward to work out what moves are possible at a given time, given a cube with a position, a direction and a length we are trying to place.

# directions next piece could go in
directions = [xstride, -xstride, ystride, -ystride, zstride, -zstride]

def moves(cube, position, direction, length):
    """Returns the valid moves for the current position"""
    valid_moves = []
    for new_direction in directions:
        # Can't carry on in same direction, or the reverse of the same direction
        if new_direction == direction or new_direction == -direction:
            continue
        if is_valid(cube, position, new_direction, length):
            valid_moves.append(new_direction)
    return valid_moves
moves(cube2, pos(3,3,1), ystride, 2)
[-1, 25]

So that is telling us that you can insert a segment of length 2 using a direction of -xstride or zstride. If you look at previous print_cube() output you'll see those are the only possible moves.

Now we have all the bits to build a recursive solver.

def solve(cube, position, direction, snake, piece_number):
    """Recursive cube solver"""
    if len(snake) == 0:
        print("Solution")
        print_cube(cube)
        return
    length, snake = snake[0], snake[1:]
    valid_moves = moves(cube, position, direction, length)
    for new_direction in valid_moves:
        new_position = place(cube, position, new_direction, length, piece_number)
        solve(cube, new_position, new_direction, snake, piece_number+length)
        unplace(cube, position, new_direction, length)

This works by being passed the snake of moves left. If there are no moves left then it must be solved, so we print the solution. Otherwise it takes the head off the snake with length, snake = snake[0], snake[1:] and makes the list of valid moves of that length.

Then we place each move, and try to solve that cube using a recursive call to solve. We unplace the move so we can try again.

This very quickly runs through all the possible solutions:

# Start just off the side
position = pos(0,1,1)
direction = xstride
length = snake[0]
# Place the first segment along one edge - that is the only possible place it can go
position = place(cube, position, direction, length, 1)
# Now solve!
solve(cube, position, direction, snake[1:], length+1)
Solution
01 02 03
20 21 04
07 06 05

16 15 14
19 22 13
08 11 12

17 24 25
18 23 26
09 10 27

Solution
01 02 03
16 15 14
17 24 25

20 21 04
19 22 13
18 23 26

07 06 05
08 11 12
09 10 27

Wow! It came up with 2 solutions! However they are the same solution just rotated and reflected.

But how do you use the solution? Starting from the correct end of the snake, place each piece into its corresponding number. Take the first layer of the solution as being the bottom (or top - whatever is easiest), the next layer as the middle and the one after as the top.

Flat Snake Puzzle Numbered

After a bit of fiddling around you'll get...

Solved Snake Puzzle

I hope you enjoyed that introduction to puzzle solving with a computer.

If you want to try one yourself, use the same technique to solve solitaire.

September 25, 2016 12:00 AM

September 24, 2016


Hynek Schlawack

Sharing Your Labor of Love: PyPI Quick and Dirty

A completely incomplete guide to packaging a Python module and sharing it with the world on PyPI.

September 24, 2016 12:00 PM


Brian Okken

22: Converting Manual Tests to Automated Tests

How do you convert manual tests to automated tests? This episode looks at the differences between manual and automated tests and presents two strategies for converting manual to automated. Support Special thanks to my wonderful Patreon supporters and those who have supported the show by purchasing Python Testing with unittest, nose, pytest

The post 22: Converting Manual Tests to Automated Tests appeared first on Python Testing.

September 24, 2016 08:00 AM


Weekly Python StackOverflow Report

(xxxviii) stackoverflow python report

These are the ten most rated questions at Stack Overflow last week.
Between brackets: [question score / answers count]
Build date: 2016-09-24 07:49:04 GMT


  1. Why does the floating-point value of 4*0.1 look nice in Python 3 but 3*0.1 doesn't? - [85/4]
  2. Remove the first N items that match a condition in a Python list - [46/6]
  3. Short-circuit evaluation like Python's "and" while storing results of checks - [21/14]
  4. Python Recursion Challenge - [13/5]
  5. How to maintain different country versions of same language in Django? - [8/1]
  6. Why does Python 3 exec() fail when specifying locals? - [8/1]
  7. Is there a difference between str function and percent operator in Python - [8/1]
  8. How to handle SQLAlchemy Connections in ProcessPool? - [8/0]
  9. Python vectorizing nested for loops - [7/2]
  10. Double loop takes time - [6/3]

September 24, 2016 07:49 AM

September 23, 2016


Simon Wittber

Python3 Asyncio PubSub Plaything


#!/usr/bin/env python
import io
import asyncio
import websockets
import logging
import collections

logger = logging.getLogger('websockets.server')
logger.setLevel(logging.ERROR)
logger.addHandler(logging.StreamHandler())
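# event name (bytes) -> set of websocket connections subscribed to it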
events = collections.defaultdict(lambda: set())
#-----------------------------------------------------------------------
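# Forward everything queued on this connection's outbox to the client.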
async def handle_outgoing_queue(websocket):
    while websocket.open:
        msg = await websocket.outbox.get()
        await websocket.send(msg)
#-----------------------------------------------------------------------
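# One coroutine per connection: the URL path becomes a prefix namespacing
# event names, each connection gets an outbox queue and a subscription set,
# and any remaining subscriptions are cleaned up when the socket closes.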
async def pubsub(websocket, path):
    websocket.prefix = path.encode()
    websocket.outbox = asyncio.Queue()
    websocket.subscriptions = set()
    sender_task = asyncio.ensure_future(handle_outgoing_queue(websocket))
    while True:
        msg = await websocket.recv()
        if msg is None: break
        if isinstance(msg, str): msg = msg.encode()
        stream = io.BytesIO(msg)
        await handle_message(websocket, stream)
    sender_task.cancel()
    for name in websocket.subscriptions:
        try:
            events[name].remove(websocket)
        except KeyError:
            pass
#-----------------------------------------------------------------------
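# The first line of each frame is the command (SUB/UNS/PUB), the second the
# event name. PUB rewinds the stream and relays the raw frame to subscribers.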
async def handle_message(websocket, stream):
    cmd = stream.readline().strip()
    name = websocket.prefix + stream.readline().strip()
    print(cmd, name)
    if cmd == b"SUB":
        events[name].add(websocket)
        websocket.subscriptions.add(name)
    elif cmd == b"UNS":
        subscribers = events[name]
        try:
            websocket.subscriptions.remove(name)
        except KeyError:
            pass
        try:
            subscribers.remove(websocket)
        except KeyError:
            pass
    elif cmd == b"PUB":
        stream.seek(0)
        msg = stream.read()
        for subscriber in events[name]:
            await subscriber.outbox.put(msg)
#-----------------------------------------------------------------------
start_server = websockets.serve(pubsub, '0.0.0.0', 8765)
asyncio.get_event_loop().run_until_complete(start_server)
asyncio.get_event_loop().run_forever()
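
There is no write-up with this one, but the wire format falls out of the code: each frame starts with a command line (SUB, UNS or PUB), followed by an event name, and the URL path of the connection becomes a namespace prefix on that name. For PUB, the whole raw frame is relayed to every subscriber. As a hypothetical usage sketch (not part of the original post, assuming the server above is running locally):

#!/usr/bin/env python
# Minimal client sketch exercising the pubsub server above.
import asyncio
import websockets

async def demo():
    # the "/room1" path namespaces our event names on the server side
    ws = await websockets.connect("ws://localhost:8765/room1")
    await ws.send("SUB\nnews\n")                   # subscribe to /room1 + news
    await ws.send("PUB\nnews\nhello subscribers")  # publish to the same event
    print(await ws.recv())  # the raw PUB frame comes back to us, a subscriber
    await ws.close()

asyncio.get_event_loop().run_until_complete(demo())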


September 23, 2016 09:08 PM