
Planet Python

Last update: January 18, 2018 01:49 AM

January 17, 2018


Stack Abuse

Levenshtein Distance and Text Similarity in Python

Introduction

Writing text is a creative process based on the thoughts and ideas that come to our mind. The way a text is written reflects our personality and is also very much influenced by our mood, the way we organize our thoughts, the topic itself, and the people we are addressing - our readers.

In the past it has happened that two or more authors had the same idea, wrote it down separately, published it under their own names, and created something very similar. Before electronic publication, ideas took a while to circulate, which led to conflicts over who the real inventor was and who should be honoured for it.

Today, every article becomes immediately available online in digital format. Online articles are properly indexed and linked to other documents, which makes them easy to find. On the one hand this simplifies the exchange of ideas and research on a topic, but on the other hand the accessibility opens the door to simply copying and pasting others' work without permission or even referencing it - called plagiarism.

At this point, methods come into play that deal with the similarity of two different texts. The main idea is to answer the questions of whether two texts (or datasets in general) are entirely or at least partly similar, whether they are related in terms of the same topic, and how many edits are needed to transform one text into the other.

As an example, this technology is used by information retrieval systems, search engines, automatic indexing systems, text summarizers, categorization systems, plagiarism checkers, speech recognition, rating systems, DNA analysis, and profiling algorithms (IR/AI programs to automatically link data between people and what they do).

Search and Comparison Methods

All of us are familiar with searching a text for a specified word or character sequence (pattern). The goal is either to find the exact occurrence (match) or an inexact match, using characters with a special meaning, for example with regular expressions or with fuzzy logic. In the latter case, one sequence of characters is merely similar to another.

Furthermore, similarity can be measured by the way words sound - do they sound similar but are written differently? Invented in the U.S. to find relatives based on different spellings of their surname, among other uses, the Soundex algorithm is still one of the most popular and widespread ones today.

Last but not least, how many changes (edits) are necessary to get from one word to the other? The fewer edits required, the higher the similarity. This category of comparison includes the Levenshtein distance, which we will focus on in more detail below.

Table 1 covers a selection of ways to search and compare text data. The right column of the table contains a selection of the corresponding Python modules to achieve these tasks.

Category            | Method or Algorithm                                                                                 | Python packages
exact search        | Boyer-Moore string search, Rabin-Karp string search, Knuth-Morris-Pratt (KMP), regular expressions | string, re, Advas
inexact search      | bigram search, trigram search, fuzzy logic                                                          | Fuzzy
phonetic algorithms | Soundex, Metaphone, Double Metaphone, Caverphone, NYSIIS, Kölner Phonetik, Match Rating codex       | Advas, Fuzzy, jellyfish, phonetics, kph
changes or edits    | Levenshtein distance, Hamming distance, Jaro distance, Jaro-Winkler distance                        | editdistance, python-Levenshtein, jellyfish

Table 1
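As a quick illustration of the last two categories, the jellyfish package listed above exposes both phonetic and edit-distance functions. A minimal sketch, assuming jellyfish is installed:

import jellyfish

# edit distance between two words (see the Levenshtein section below)
print(jellyfish.levenshtein_distance("test", "text"))  # 1

# phonetic Soundex code of a word
print(jellyfish.soundex("Levenshtein"))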

The Levenshtein Distance

This method was invented in 1965 by the Russian mathematician Vladimir Levenshtein (1935-2017). The distance value describes the minimal number of deletions, insertions, or substitutions required to transform one string (the source) into another (the target). Unlike the Hamming distance, the Levenshtein distance works on strings of unequal length.

The greater the Levenshtein distance, the greater the difference between the strings. For example, the Levenshtein distance from "test" to "test" is 0, because source and target are identical and no transformations are needed. In contrast, the distance from "test" to "team" is 2 - two substitutions are required to turn "test" into "team".


Implementing Levenshtein Distance in Python

For Python, quite a few different implementations are available online [9,10] as well as in various Python packages (see the table above). This includes versions following the dynamic programming approach as well as vectorized versions. The version shown here is an iterative one that uses the NumPy package and a single matrix for the calculation. As an example, we want to find the edit distance between "test" and "text".

It starts with a matrix of zeros whose dimensions are the lengths of the two strings plus one. Both the first row and the first column are filled with increasing indices, starting from zero:

         t   e   s   t
  [[ 0.  1.  2.  3.  4.]
 t [ 1.  0.  0.  0.  0.]
 e [ 2.  0.  0.  0.  0.]
 x [ 3.  0.  0.  0.  0.]
 t [ 4.  0.  0.  0.  0.]]

Next, two nested loops compare the strings letter by letter, row-wise and column-wise. If two letters are equal, the new value at position [x, y] is the minimum of the value at position [x-1, y] + 1, the value at position [x-1, y-1], and the value at position [x, y-1] + 1. This can be visualized as a two-by-two sub-matrix where the missing bottom-right value is computed:

[+0.] [+1.]
[+1.] [   ]

Otherwise, it is the minimum of the value at position [x-1, y] + 1, the value at position [x-1, y-1] + 1, and the value at position [x, y-1] + 1. Again, this can be visualized as a two-by-two sub-matrix where the missing bottom-right value is computed:

[+1.] [+1.]
[+1.] [   ]
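Both cases can be folded into the standard recurrence, with D the matrix and the strings indexed from 1:

D_{x,y} = \min\bigl(D_{x-1,y} + 1,\; D_{x,y-1} + 1,\; D_{x-1,y-1} + c\bigr),
\qquad c = \begin{cases} 0 & \text{if } \text{seq1}[x] = \text{seq2}[y] \\ 1 & \text{otherwise} \end{cases}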

Note that there are three possible types of change if the two characters are different: insertion, deletion, and substitution. Finally, the matrix looks as follows:

         t   e   s   t
  [[ 0.  1.  2.  3.  4.]
 t [ 1.  0.  1.  2.  3.]
 e [ 2.  1.  0.  1.  2.]
 x [ 3.  2.  1.  1.  2.]
 t [ 4.  3.  2.  1.  1.]]

The edit distance is the value at position [4, 4] - the lower right corner - which is 1. Note that this implementation runs in O(N*M) time, where N and M are the lengths of the two strings. Other implementations may run faster but are harder to understand.

Here is the corresponding code for the Levenshtein distance algorithm I just described:

import numpy as np

def levenshtein(seq1, seq2):
    """Return the Levenshtein distance between seq1 and seq2."""
    size_x = len(seq1) + 1
    size_y = len(seq2) + 1
    matrix = np.zeros((size_x, size_y))
    # index the first column and the first row, starting from zero
    for x in range(size_x):
        matrix[x, 0] = x
    for y in range(size_y):
        matrix[0, y] = y

    # fill in the matrix row by row, column by column
    for x in range(1, size_x):
        for y in range(1, size_y):
            if seq1[x-1] == seq2[y-1]:
                # characters match: no cost on the diagonal step
                matrix[x, y] = min(
                    matrix[x-1, y] + 1,
                    matrix[x-1, y-1],
                    matrix[x, y-1] + 1
                )
            else:
                # characters differ: insert, substitute, or delete
                matrix[x, y] = min(
                    matrix[x-1, y] + 1,
                    matrix[x-1, y-1] + 1,
                    matrix[x, y-1] + 1
                )
    print(matrix)
    return matrix[size_x - 1, size_y - 1]
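A quick sanity check with the example from above (the function prints the matrix and returns the distance as a NumPy float):

distance = levenshtein("test", "text")
print(distance)  # 1.0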


Acknowledgements

The author would like to thank Axel Beckert, Mandy Neumeyer, and Gerold Rupprecht for their support while preparing this article.

January 17, 2018 06:19 PM


Gocept Weblog

Catching and rendering exceptions

TL;DR: You have to write an exception view in file system code which is rendered when an exception occurs.

History

If an exception occurred in Zope 2 the standard_error_message (an object in the ZODB) was rendered. This way the error page could be customised through the web.

When using a WSGI server on Zope 2 the standard_error_message is no longer used. The exceptions have to be handled in a WSGI middleware. (This is a sub-optimal solution as the middleware is not run in the same execution context where the exception occurred.)

That's why error handling changed again in Zope 4: like Zope 3 (aka BlueBream), Zope 4 tries to look up an exception view when an exception occurs. If the lookup succeeds (i.e. there is an exception view registered for the current exception), this view is rendered as the response. This approach allows different views for different exceptions. The standard_error_message is even gone when installing Zope 4 from scratch.

Base solution

The exception view has to be created in the file system and registered via ZCML. If you do not yet have a file system package where this code can live, you can create a new one, e.g. using paster:

$ bin/pip install PasteScript
$ bin/paster create -t basic_package errorviews
$ bin/pip install -e errorviews

Here errorviews is the name of my example package.

In the existing errorviews/__init__.py enter the following code:

class SiteErrorView(object):
    """View rendered on SiteError."""

    def __call__(self):
        return "SiteError!"

This view returns the string SiteError! instead of the standard error message. It has to be registered via ZCML. Write a file named configure.zcml right beside __init__.py:

<configure
  xmlns="http://namespaces.zope.org/zope"
  xmlns:browser="http://namespaces.zope.org/browser">

  <browser:page
    for="Exception"
    name="index.html"
    class=".SiteErrorView"
    permission="zope.Public"
    />

</configure>

The view gets registered for Exception and all classes inheriting from it. (This might be a bit too general for actual code, I know.) Zope 4 expects the name to be index.html. class is a relative dotted path to the view class.

If you put these files into a Zope Product, they would be picked up automatically. If you created an errorviews package like me, you have to register it in etc/site.zcml. Put the following line near the end, before </configure>:

 <include package="errorviews" />

After re-starting Zope each exception renders as SiteError!

Using a PageTemplate

Writing HTML in a Python class is not very convenient. It is easier to use a PageTemplate to store the templating code. Create a file error.pt right beside __init__.py:

<html>
 <body>
   <h1>SiteError occurred</h1>
 </body>
</html>

In configure.zcml replace class=".SiteErrorView" with template="error.pt". (The view class is not used in this example, see the adjusted registration below.) After re-starting Zope, each exception renders the HTML page.
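The adjusted registration is the same browser:page directive as before, only with the class attribute swapped for a template attribute:

<browser:page
  for="Exception"
  name="index.html"
  template="error.pt"
  permission="zope.Public"
  />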

Back to standard_error_message

If you have an existing Data.fs and want to re-use standard_error_message, you might try the following hack: change configure.zcml back to the XML shown in the section "Base solution", and change __init__.py to the following content:

class SiteErrorView(object):
    """View rendered on SiteError."""

    def __call__(self):
        root = self.request['PARENTS'][-1]
        return root.standard_error_message(
            error_type=self.context.__class__.__name__,
            error_value=str(self.context))

The code in the example expects standard_error_message to be a DTMLMethod (as provided by Zope 2.13). The arguments error_type and error_value allow using &dtml-error_type; and &dtml-error_value; in the DTMLMethod as before.

Conclusion

Zope 4 has a nice and flexible concept for rendering error pages. It requires at least some Python code in the file system, though even the old standard_error_message can be brought back with a trick if needed. But I think this should only be used for old applications or as an interim solution.

January 17, 2018 09:23 AM


Programiz

Reading CSV files in Python

In this article, we will learn how to read data from CSV files of different formats in Python.

January 17, 2018 09:02 AM

Python List Comprehension

In this article, we will learn about Python list comprehensions and how to use them.

January 17, 2018 06:37 AM

Python Arrays

In this article, you'll learn about Python arrays: how to create an array, find the length of an array, and access, slice, modify, and remove elements of an array in Python.

January 17, 2018 05:55 AM


Kushal Das

How to configure Tor onion service on Fedora

You can set up a Tor onion service in a VM on your home desktop, or on a Raspberry Pi attached to your home network, and serve any website or SSH service through it. For example, in India, when an engineering student has to demo a web application, most of the time she has to demo it on her laptop or on a college lab machine. If you set up your web application project as an onion service, you can make it available to all of your friends. You don't need an external IP, a special kind of Internet connection, or a paid domain name. Of course, it may be slower than the fancy websites you are used to, but you don't have to spend any extra money for it.

In this post, I am going to talk about how you can set up your own service using a Fedora 26 VM. Similar steps apply to a Raspberry Pi or any other Linux distribution.

Install the required packages

I will be using Nginx as my web server. The first step is to get the required packages installed.

$ sudo dnf install nginx tor
Fedora 26 - x86_64 - Updates                     10 MB/s |  20 MB     00:01
google-chrome                                    17 kB/s | 3.7 kB     00:00
Qubes OS Repository for VM (updates)             98 kB/s |  48 kB     00:00
Last metadata expiration check: 0:00:00 ago on Wed Jan 17 08:30:23 2018.
Dependencies resolved.
================================================================================
 Package                Arch         Version                Repository     Size
================================================================================
Installing:
 nginx                  x86_64       1:1.12.1-1.fc26        updates       535 k
 tor                    x86_64       0.3.1.9-1.fc26         updates       2.6 M
Installing dependencies:
 gperftools-libs        x86_64       2.6.1-5.fc26           updates       281 k
 nginx-filesystem       noarch       1:1.12.1-1.fc26        updates        20 k
 nginx-mimetypes        noarch       2.1.48-1.fc26          fedora         26 k
 torsocks               x86_64       2.1.0-4.fc26           fedora         64 k

Transaction Summary
================================================================================
Install  6 Packages

Total download size: 3.6 M
Installed size: 15 M
Is this ok [y/N]:

Configuring Nginx

After installing the packages, the next step is to set up the web server. For a quick example, we will just serve the default Nginx index page over the onion service. We have to change the web server port to a different one in the /etc/nginx/nginx.conf file. Please read the Nginx documentation to learn more about configuring it with your web application.

listen 8090 default_server;

Here we have the web server running on port 8090.

Configuring Tor

Next, we will set up the Tor onion service. The configuration file is located at /etc/tor/torrc. We will add the following two lines.

HiddenServiceDir /var/lib/tor/hidden_service/
HiddenServicePort 80 127.0.0.1:8090

We are redirecting port 80 of the onion service to port 8090 on the same system.

Starting the services

Remember to open up port 80 in the firewall before starting the services. I am going to keep it an exercise for the reader to find out how :)

Next we will start the nginx and tor services; you can also watch the system logs to check Tor's status.

$ sudo systemctl start nginx
$ sudo systemctl start tor
$ sudo journalctl -f -u tor
-- Logs begin at Thu 2017-12-07 07:13:58 IST. --
Jan 17 08:33:43 tortest Tor[2734]: Bootstrapped 0%: Starting
Jan 17 08:33:43 tortest Tor[2734]: Signaled readiness to systemd
Jan 17 08:33:43 tortest systemd[1]: Started Anonymizing overlay network for TCP.
Jan 17 08:33:43 tortest Tor[2734]: Starting with guard context "default"
Jan 17 08:33:43 tortest Tor[2734]: Opening Control listener on /run/tor/control
Jan 17 08:33:43 tortest Tor[2734]: Bootstrapped 5%: Connecting to directory server
Jan 17 08:33:44 tortest Tor[2734]: Bootstrapped 10%: Finishing handshake with directory server
Jan 17 08:33:44 tortest Tor[2734]: Bootstrapped 15%: Establishing an encrypted directory connection
Jan 17 08:33:45 tortest Tor[2734]: Bootstrapped 20%: Asking for networkstatus consensus
Jan 17 08:33:45 tortest Tor[2734]: Bootstrapped 25%: Loading networkstatus consensus
Jan 17 08:33:55 tortest Tor[2734]: I learned some more directory information, but not enough to build a circuit: We have no usable consensus.
Jan 17 08:33:55 tortest Tor[2734]: Bootstrapped 40%: Loading authority key certs
Jan 17 08:33:55 tortest Tor[2734]: Bootstrapped 45%: Asking for relay descriptors
Jan 17 08:33:55 tortest Tor[2734]: I learned some more directory information, but not enough to build a circuit: We need more microdescriptors: we have 0/6009, and can only build 0% of likely paths. (We have 0% of guards bw, 0% of midpoint bw, and 0% of exit bw = 0% of path bw.)
Jan 17 08:33:56 tortest Tor[2734]: Bootstrapped 50%: Loading relay descriptors
Jan 17 08:33:57 tortest Tor[2734]: Bootstrapped 56%: Loading relay descriptors
Jan 17 08:33:59 tortest Tor[2734]: Bootstrapped 65%: Loading relay descriptors
Jan 17 08:34:06 tortest Tor[2734]: Bootstrapped 72%: Loading relay descriptors
Jan 17 08:34:06 tortest Tor[2734]: Bootstrapped 80%: Connecting to the Tor network
Jan 17 08:34:07 tortest Tor[2734]: Bootstrapped 85%: Finishing handshake with first hop
Jan 17 08:34:07 tortest Tor[2734]: Bootstrapped 90%: Establishing a Tor circuit
Jan 17 08:34:08 tortest Tor[2734]: Tor has successfully opened a circuit. Looks like client functionality is working.
Jan 17 08:34:08 tortest Tor[2734]: Bootstrapped 100%: Done

There will be a private key and a hostname file for the onion service in the /var/lib/tor/hidden_service/ directory. Open up Tor Browser and visit the onion address; you should see the default Nginx page.
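To look up the address, read the hostname file (it contains your .onion address):

$ sudo cat /var/lib/tor/hidden_service/hostname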

Remember to backup the private key file if you want to keep using the same onion address for a longer time.

What all things can we do with this onion service?

That actually depends on your imagination. Feel free to research about what all different services can be provided over Tor. You can start with writing a small Python Flask web application, and create an onion service for the same. Share the address with your friends.

Ask your friends to use Tor Browser for daily web browsing. The more Tor traffic we generate, the more difficult it becomes for nation-state actors to monitor traffic, which in turn helps the whole community.

WARNING on security and anonymous service

Remember that this tutorial is only for quick demo purposes. It will not hide your web server details, IP, or operating system details. You will have to follow proper operational security practices along with solid system administration skills. Riseup has a page describing best practices. Please make sure that you do enough study and research before you start providing long-term services over Tor.

Also please remember that Tor is developed and run by people all over the world and the project needs donation. Every little bit of help counts.

January 17, 2018 05:53 AM


Montreal Python User Group

Montréal-Python 69 - Call For Proposals

First of all, the Montreal-Python team would like to wish you a Happy New Year!

With every new year come resolutions. If presenting at a tech event is on your list, here's your chance to cross it off early: Montreal Python is opening the call for presentations for our 2018 events. (Feel free to submit your proposal whether you have resolutions or not ;))

Send us your proposal at team@montrealpython.org. We have spots for lightning talks (5 min) and regular talks (15 to 30 min).

When

February 5th, 2018 at 6:00PM

Where

TBD

January 17, 2018 05:00 AM

January 16, 2018


Tarek Ziade

My trip in Cuba

I spent 3 weeks in Cuba with my family after the Mozilla All-Hands we had in December in Austin, and had the opportunity to meet a local user group (the Cuban Tech Group) and run a hackathon with them.

The #cubantech group during the December hackathon on #micropython

A post shared by Tarek Ziadé (@tarek.ziade) on Jan 17, 2018 at 12:13 PST

Olemis Lang is one of the founders and is very active in promoting open source in Cuba. We've had similar experiences running user groups (I founded the French Python one a decade ago), and were excited to share our experiences.

Olemis in Havana Lighthouse

A post shared by Tarek Ziadé (@tarek.ziade) on Jan 17, 2018 at 12:10 PST

One annoying thing in Cuba is the lack of internet access. There's basically no ISP in the country besides the state-owned company (Etecsa), which runs wifi hotspots in parks. If you want to access the internet, you have to buy 1h or 2h cards for a few euros and sit down in a park as close as possible to the hotspot. It's easy to find those hotspots: you will see groups of teens with their smartphones. If you recall the craziness around Pokemon Go last summer, with people gathering around Pokemon arenas, that's basically it.

Stunning sight from El Mirador near Soroa, Cuba

A post shared by Tarek Ziadé (@tarek.ziade) on Jan 8, 2018 at 12:40 PST

In most countries I've visited, we take the internet for granted, and that's been true for many years. We've dumped a big chunk of our memory into google/duckduckgo/qwant/xxx because it's easier to look a piece of information up than to remember it. We all have stories of finding our own blog posts on the internet while looking for an answer.

Case in point: when we decided to run a hackathon around MicroPython, Olemis and I took some time to make sure we had all the required material on local disks. But we still spent most of the evening trying to make things work. One person was missing a tiny Debian package, or we were missing one GitHub repo with a specific file, etc.

NodeMCU boards flashed with #Micropython #cubantech

A post shared by Tarek Ziadé (@tarek.ziade) on Jan 17, 2018 at 12:11 PST

I was quite impressed, though, by Olemis' setup to mitigate the lack of internet: a few Raspberry Pis here and there, a SAN, local mirrors of Stack Overflow and of Ubuntu packages, a local git server with clones, and a proxy cache for a few websites.

We had a lot of fun (the "YEAAAAAH" when the MicroPython board finally lights the LED is always cool), but everyone was frustrated by the lack of connectivity. Even with my crappy DSL back home in France, I felt like the rich spoiled kid with all the toys compared to the other kids.

Olemis introduces Micropython #cubantech

A post shared by Tarek Ziadé (@tarek.ziade) on Jan 17, 2018 at 12:12 PST

With the current state of the internet in Cuba (I've heard it should improve), how can we expect Cuban devs to participate in open source projects as much as they would like?

A patch for Firefox? Forget it: look at the size of the Mercurial repo; it would take them weeks to get it and days to update it (and $$$). A PR in a GitHub repo? That's possible, but with all the back and forth before it's pushed, it's going to cost a few internet cards.

Would you pay 5 euros to fix a bug in a project?

There's no way for them to fix this; they will have to wait for the Cuban government to allow private internet access, or to drastically improve the public one, maybe by creating open spaces with better access.

But in the interim (i.e. the next 5 years), I think there's something we can do to help: we can send them data.

It's easy for people with a good connection to fill a USB disk with that kind of data: mirrors of package repositories and documentation, git clones of OSS projects, and so on.

I want to try to do something about this - not only for Cuba, but for any other place with the same issue.

I am working on defining how we could build an online service that sends out USB sticks of public content on a regular basis, containing the data developers need to play with OSS projects.

Maybe this could fit Mozilla's MOSS (https://www.mozilla.org/en-US/moss/), or maybe something like that already exists. Digging…

January 16, 2018 11:00 PM


Weekly Python Chat

Pythonic For Loops

Python's for loops are fairly different from for loops in languages like C, Java, and JavaScript. In this chat we'll discuss how Python's for loops force you to think differently about your code and we'll review some of the tools you can use to keep your for loops readable and Pythonic.
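For a taste of the difference, compare index-based iteration carried over from C-style languages with the direct iteration Python encourages (a small illustrative example):

colors = ['red', 'green', 'blue']

# C-style thinking: loop over indexes, then look each item up
for i in range(len(colors)):
    print(i, colors[i])

# Pythonic: iterate over the items directly; enumerate() supplies the index
for i, color in enumerate(colors):
    print(i, color)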

January 16, 2018 09:30 PM


py.CheckIO

Variations of Data Classes

Hi Planet Python!

This is the first article in our blog this year and the last one in the Python Data Types series (earlier we talked about arrays and dicts). We went through the data classes available in Python, such as collections.namedtuple, typing.NamedTuple, types.SimpleNamespace, and finally the @dataclass decorator that was added in Python 3.7 (which py.CheckiO already supports).
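As a small taste of the last of those, the @dataclass decorator generates methods such as __init__ and __repr__ from the class attributes. A minimal illustration, on Python 3.7+:

from dataclasses import dataclass

@dataclass
class Point:
    x: int
    y: int

p = Point(3, 4)
print(p)  # Point(x=3, y=4) - __init__ and __repr__ were generated for us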

PS: As always, the article comes with code examples from CheckiO players.

January 16, 2018 02:29 PM


Python Bytes

#61 On Being a Senior Engineer

January 16, 2018 08:00 AM


Codementor

Code Python on ArchLinux

A tutorial about how to code in Python on the Arch Linux platform.

January 16, 2018 06:41 AM


Kushal Das

Do not limit yourself

This post is all about my personal experience in life. I've talked about the random things I am going to write here in many 1x1 talks or chats, but as many people asked for my views or suggestions on the related topics, I feel I can just write them all down in one single place. If you already get the feeling that this post will be a boring one, please feel free to skip it. There is no tl;dr version of it from me.

Why the title?

To explain the title of the post, I will go back a few years in my life. I grew up in a coal mine area of West Bengal and studied in the village's Bengali-medium school. During school days, I was very interested in learning about science, and kept doing random experiments in real life to learn things. They were fun, and I learned life lessons from them. Most of my friends, school teachers, and folks I knew kept telling me that those experiments were impossible or beyond my reach. I was never a class topper, but once I wanted to participate in a science exam, and the school teacher in charge told me that I was not good enough for it. After I kept asking for hours, he finally said he would allow me, but that I would have to get the fees within the next hour. Both of my parents were working, so there was no chance of getting any money from them at that moment. An uncle who used to run one of the local book stores then lent me the money so that I could pay the fees. The amount was very small, but the teacher knew that I didn't get any pocket money, so asking for even that much money within an hour was a difficult task. I didn't get a high score in that examination, but I really enjoyed the process of going to a school far away and taking the exam (I generally don't like taking written exams).

College days

During college days I spent most of my time in front of my computer at the hostel or in the college computer labs. People kept laughing at me for it: batchmates, juniors, seniors, sometimes even professors. But at the same time I found a few seniors, friends, and professors who kept encouraging whatever I did. The number of people laughing was always higher; thanks to the experience from my school days, I managed to ignore them.

Coming to the recent years

The trend continued throughout my working life. There were always more people laughing at everything I did, telling me that the things I try to do have no value and are beyond my limits. I don't see myself as one of those bright developers I meet out in the world. I kept trying to do things I love, and tried to help the community in whichever way possible. Whatever I know, I learned because someone else took the time to teach me, took the time to explain it to me. Now I keep hearing similar stories from many young contributors, my friends, from India. Many times I saw people laughing at my friends in the same way they do at me, telling them that the things they are trying to achieve are beyond their limits. I somehow managed to meet many positive forces in my life, and I keep meeting new ones. This helped me understand that we generally bind ourselves within artificial limits. Most of the folks laughing at us never tried anything in life. It is okay if we cannot write or speak perfect English like them; English is not our primary language anyway, and we can communicate as required. The community out there welcomes everyone as they are. We don't have to invent the next best programming language or be the super rich startup person to have good friends in life. One can always push at a personal level to learn new things, to do things which make sense to us, even if they seem totally crazy in other people's lives.

Once upon a time, during a 1x1 with my then manager (and lifelong mentor) Sankarshan Mukhopadhyay, he told me something which has stayed with me very strongly to this day. We were talking about things I can do, or rather try to do. Taking the example of one of my good friends from Red Hat, he explained that I may think my level is nowhere near this friend's, but if I try to learn and do things like him, I may reach 70% of his level, or 5%, or 50%; who knows, unless I try doing those new things. While talking about hiring for the team, he also told me that we should always try to get people who are better than us; that way, we will always be in a position to learn from each other. I guess those words together changed many things in my life. The world is too large, and we all can do things in our lives at a certain level. But what we can do depends on where we draw those non-existent limits.

The Python community is one such example. When I went to PyCon US for the first time in 2013, the community welcomed me the way I am. Even though almost no one knew me, I never felt it while meeting and talking to my lifetime heroes. Funnily, at the same conference, a certain senior person from India tried to explain that I should start behaving like a senior software engineer: stand in a corner with all the world's ego and not talk to everyone the way I do. Later in life, the same person tried to convince me that I should stop doing anything related to community, as that would not help me make any money.

Sorry, but they are wrong on that point. I never saw any of my favorite human beings behave that way. No matter how senior people are, age- or experience-wise, they always listen to others and talk nicely with everyone. Money is not everything in life. I kept jumping around at PyCon every year, clicking photos and talking with complete strangers about their favorite subjects. Those little conversations later became much stronger bonds; I made new friends whom I generally meet only once a year. But the community is still welcoming, and no one cared to judge me based on how much money I make.

We try to follow the same in dgplug. The IRC channel #dgplug on Freenode is always filled with folks from all across the world. Some are very experienced contributors, some are just starting out, but it is a friendly place and we try to help each other. The motto of "Learn yourself, teach others" is still very strong among us. We try to break any such stupid limits others try to force on our lives. We dream, we enjoy talking about the book someone just finished, and we discuss our favorite food.

I will end this post by saying one thing again: do not bind yourself within non-existent limits. Always remember what a great teacher failure is (I hope I quoted Master Yoda properly). Not everything we try in life will be super successful, but we can always try to learn from those incidents. You don't have to bow down in front of anyone, and you can do the things you love in your life without asking for others' permission.

January 16, 2018 04:17 AM


Techiediaries - Django

Adding the Django CSRF Protection to React Forms

In this tutorial you'll see how you can handle the Django CSRF token in React when using the Axios client or the fetch API. We'll also see how you can add CSRF to forms rendered dynamically with React.

More often than not, when you are building React/Redux apps with a Django backend, you'll need to send POST, PUT, PATCH, and DELETE requests (which require a valid CSRF token included in each request) to an API endpoint using an HTTP client library such as Axios or the browser's standard fetch API.

CSRF stands for Cross-Site Request Forgery; it's a type of attack in which malicious requests are sent from another site through a visitor's browser to your server.

Django has built-in protection against CSRF attacks via the CSRF middleware, which is included by default in each new project. Here is what the Django docs say about the CSRF middleware:

The CSRF middleware and template tag provides easy-to-use protection against Cross Site Request Forgeries. This type of attack occurs when a malicious website contains a link, a form button or some JavaScript that is intended to perform some action on your website, using the credentials of a logged-in user who visits the malicious site in their browser. A related type of attack, ‘login CSRF’, where an attacking site tricks a user’s browser into logging into a site with someone else’s credentials, is also covered.--Django docs

Django also provides the {% csrf_token %} tag that you need to include in your templates' forms that use a POST request, to protect your application from being exploited via CSRF. Here is how you can use it:

<form action="" method="post">
{% csrf_token %}
</form>

You don't need to include it explicitly if you are using Django forms.

Handling CSRF when Using React

When using a JavaScript library like React, you need to find a way to handle CSRF tokens if you don't want to disable CSRF protection.

There are many methods you can use depending on the HTTP client you are using but generally you need to read the CSRF token from a Django cookie and send it with any requests to the Django back-end.

Handling CSRF Tokens in React/Axios

For the Axios client there are a few options. The simplest is to tell Axios which cookie to read and which header to send, so the CSRF token is attached without any further configuration:

import axios from 'axios';

axios.defaults.xsrfCookieName = 'csrftoken'
axios.defaults.xsrfHeaderName = 'X-CSRFToken'
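With those two defaults set, Axios reads the csrftoken cookie and attaches it as the X-CSRFToken header on each mutating same-origin request; for example (the endpoint URL here is just a placeholder):

axios.post('/api/endpoint/', { title: 'Hello' })
  .then(response => console.log(response.data))   // the CSRF header was added automatically
  .catch(error => console.error(error));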

Handling CSRF Tokens in React/Fetch

Now let's see how you can do it using the fetch API.

The first step is to get the CSRF token, which can be retrieved from the Django csrftoken cookie (set only if you have enabled CSRF protection in Django).

The Django docs show how to get the CSRF token from the cookie using this simple JavaScript function:

function getCookie(name) {
    var cookieValue = null;
    if (document.cookie && document.cookie !== '') {
        var cookies = document.cookie.split(';');
        for (var i = 0; i < cookies.length; i++) {
            // trim whitespace (the Django docs version uses jQuery.trim;
            // String.prototype.trim avoids the jQuery dependency)
            var cookie = cookies[i].trim();
            if (cookie.substring(0, name.length + 1) === (name + '=')) {
                cookieValue = decodeURIComponent(cookie.substring(name.length + 1));
                break;
            }
        }
    }
    return cookieValue;
}

You can also find other implementations of this function on GitHub.

Now you can retrieve the CSRF token by calling the getCookie('csrftoken') function

var csrftoken = getCookie('csrftoken');

Next you can use this CSRF token when sending a request with fetch() by assigning the retrieved token to the X-CSRFToken header:

fetch(url, {
  credentials: 'include',
  method: 'POST',
  mode: 'same-origin',
  headers: {
    'Accept': 'application/json',
    'Content-Type': 'application/json',
    'X-CSRFToken': csrftoken
  },
  // the body must be a string when sending JSON
  body: JSON.stringify({})
})

Rendering the CSRF Token in React Forms

If you are using React to render forms instead of Django templates, you also need to render the CSRF token yourself, because the Django tag {% csrf_token %} is not available on the client side. You can create a higher-order component that retrieves the token using the getCookie() function and renders it in any form.

So start by creating a HOC in csrftoken.js:

import React from 'react';

// assumes the getCookie() helper shown earlier is defined in (or imported into) this module
var csrftoken = getCookie('csrftoken');

const CSRFToken = () => {
    return (
        <input type="hidden" name="csrfmiddlewaretoken" value={csrftoken} />
    );
};
export default CSRFToken;

Then you can simply import it and use it inside your form:

import React, { Component } from 'react';

import CSRFToken from './csrftoken';


class aForm extends Component {
    render() {

        return (
                 <form action="/endpoint" method="post">
                        <CSRFToken />
                        <button type="submit">Send</button>
                 </form>
        );
    }
}

export default aForm;

The Django CSRF Cookie

React renders components dynamically, which is why Django might not be able to set the CSRF token cookie if you are rendering your form with React. Here is what the Django docs say about that:

If your view is not rendering a template containing the csrf_token template tag, Django might not set the CSRF token cookie. This is common in cases where forms are dynamically added to the page. To address this case, Django provides a view decorator which forces setting of the cookie: ensure_csrf_cookie().

To solve this issue, Django provides the ensure_csrf_cookie decorator, which you add to your view function. For example:

from django.views.decorators.csrf import ensure_csrf_cookie

@ensure_csrf_cookie
def myview(request):
    #...

Handling CSRF Using React/Redux

If you are using Redux to manage your application state you can use this module to handle CSRF token in Redux.

You can use it by first installing it from npm:

npm install redux-csrf --save

Then you can use the setCsrfToken(token) API to set the CSRF token in the Redux store.

Conclusion

The built-in CSRF protection provided by Django is very useful for protecting your server against malicious websites that exploit your visitors' browsers to attack you, but when using modern JavaScript libraries you need to handle CSRF differently. In this article we have seen several ways to handle CSRF in React apps instead of disabling it.

January 16, 2018 12:00 AM

January 15, 2018


PyCharm

Announcing the MicroPython Plugin for PyCharm

Today we’ve released the MicroPython plugin 1.0 for PyCharm. This plugin lets you edit your MicroPython code and interact with your MicroPython-powered microcontrollers using PyCharm. It supports ESP8266, Pyboard, and BBC Micro:bit devices. The plugin is being developed as a team project by the PyCharm Community lead Andrey Vlasovskikh. The source code for the project can be found on GitHub.

MicroPython is a relatively new member of the Python interpreter family. It's basically a Python 3.5 implementation designed for microcontrollers — small computing devices that are used everywhere from smart watches to cars. People usually program microcontrollers in C or assembly language due to performance and memory limits. Thanks to the clever optimization techniques implemented in MicroPython, you can now use (almost) standard Python for microcontrollers. For example, you can create your own Internet of Things device and program it in MicroPython.

The MicroPython plugin is compatible with both PyCharm Community and Professional editions. We’re going to make it available for IntelliJ IDEA soon as well. Let me walk you through the setup process and the features of the plugin using PyCharm:

We’ll be using an ESP8266-based device called WEMOS D1 mini. Basically, it’s a Wi-Fi chip with a couple of digital and analog I/O pins to connect external sensors and actuators. But for our simple demo, we won’t need anything besides the LED light that is already located on the device and is connected to the digital output pin 2.

This is our demo program which toggles the LED every second:

import utime
from machine import Pin


def main():
    led = Pin(2, Pin.OUT)
    enabled = False
    while True:
        if enabled:
            led.off()
        else:
            led.on()
        utime.sleep_ms(1000)
        enabled = not enabled


if __name__ == '__main__':
    main()

Let’s get started setting up our device!

First of all, make sure your OS can see your device via USB. This step is device-dependent. For WEMOS D1 mini on Windows and macOS you’ll need a serial port driver provided by the device vendor.

Next, we’ll setup PyCharm to work with your device. First, you need to install the MicroPython plugin in PyCharm “File | Settings | Plugins”. Then you need to create a new Python project in “File | New Project…”. In PyCharm 2017.3 the new project dialog with the correct settings will look like this:


Make sure you’ve configured a Python 3.5 or a newer interpreter for it (preferably a virtual environment), since the MicroPython plugin will later ask you to install a few Python packages to communicate with your device.

After that add a new file to your new project with the contents of our program above. Finally, enable MicroPython support for your project in “File | Settings | Languages & Frameworks | MicroPython” and specify your device there:

MicroPython Configurable

Now let’s see what the plugin has to offer.

Code Completion and Documentation

The MicroPython plugin provides code completion and documentation for MicroPython-specific modules:

MicroPython Code Completion

Notice that code completion is context-aware: in this screenshot PyCharm shows only the members of the utime module.

The quick documentation window contains the docs for the selected name. Use Ctrl+Q (F1 on macOS) to show this pop-up window. You can also dock it and enable “Auto-update from Source” to keep it permanently.

Syntax Checking and Type Checking

The plugin searches for syntax errors and other problems in your code, such as a potential AttributeError or ImportError, using static code analysis. It comes with Python stub files for device-specific binary modules. These stubs contain Python type hints that make it possible to type-check your MicroPython code:

MicroPython Type Checking

In the screenshot above you can see several Python syntax errors, where the user tried to write some C code in the middle of a Python file. There is also a type error in utime.sleep_ms(3.14), since this function only accepts integers.
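How does the IDE know? The bundled stubs declare typed signatures for the binary modules. A stub entry for utime.sleep_ms might look like this (a hypothetical excerpt for illustration, not the plugin's actual stub file):

# the int annotation is what lets the type checker reject utime.sleep_ms(3.14)
def sleep_ms(ms: int) -> None: ...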

Flash Files to Devices

The MicroPython plugin helps you upload your files to your MicroPython device via USB. Use "MicroPython" run configurations to flash files or folders to your device via the "Run | Edit Configurations…" menu. To quickly upload a single file, select "Run 'Flash <your-file-name>.py'" from the context menu of your Python file:

MicroPython Run Configuration

MicroPython REPL

Interactive experiments play an important role in Python development, but they are even more important with microcontrollers which usually don’t have any screens to show possible errors. The MicroPython plugin allows you to quickly run an interactive Python REPL console. Use “Tools | MicroPython | MicroPython REPL” menu to run a MicroPython shell on your device.

MicroPython REPL

I hope you enjoy this plugin. I’ll be glad to hear your feedback and how you’re using it. Tell me about your experience with it in the comments below, or on twitter: @vlasovskikh or @pycharm. Star or fork the intellij-micropython repository on GitHub, send your issues and pull requests!

January 15, 2018 04:11 PM


Doug Hellmann

timeit — Time the execution of small bits of Python code. — PyMOTW 3

The timeit module provides a simple interface for determining the execution time of small bits of Python code. It uses a platform-specific time function to provide the most accurate time calculation possible and reduces the impact of start-up or shutdown costs on the time calculation by executing the code repeatedly. Read more… This post is …

January 15, 2018 02:00 PM


Import Python

#158: VS Code and Jupyter Notebook, K-Means Clustering, Markov Chain and more

Worthy Read

Do you use PyCharm as your Python IDE? Then this course might be of interest to you. Taught by Michael of TalkPython podcast fame.
pycharm

I love VS Code and I love Jupyter Notebooks. Both excel in their own worlds, but to improve my workflow I had to create a bridge between them.
jupyter
,
visualstudio

GoCD is a continuous delivery tool supporting modern infrastructure with elastic on-demand agents and cloud deployments. With GoCD, you can easily model, orchestrate and visualize complex workflows from end to end. It’s open source, free to use and download.
advert

The O’Reilly Programming Podcast: A look at some of Python’s valuable, but often overlooked, features.
podcast

It provides a management GUI, a slew of scientifically oriented work environments, and tools to simplify the process of using Python for data crunching
anaconda

Ensembles have rapidly become one of the hottest and most popular methods in applied machine learning. Virtually every winning Kaggle solution features them, and many data science pipelines have ensembles in them. Put simply, ensembles combine predictions from different models to generate a final prediction, and the more models we include the better it performs. Better still, because ensembles combine baseline predictions, they perform at least as well as the best baseline model. Ensembles give us a performance boost almost for free!
machine learning

Embed docs directly on your website with a few lines of code. Test the API for free.
advert

We present Skan (Skeleton analysis), a Python library for the analysis of the skeleton structures of objects. It was inspired by the “analyse skeletons” plugin for the Fiji image analysis software, but its extensive Application Programming Interface (API) allows users to examine and manipulate any intermediate data structures produced during the analysis. Further, its use of common Python data structures such as SciPy sparse matrices and pandas data frames opens the results to analysis within the extensive ecosystem of scientific libraries available in Python.
scipy

K-means clustering is a simple yet very effective unsupervised machine learning algorithm for data clustering. It clusters data based on the Euclidean distance between data points. K-means clustering algorithm has many uses for grouping text documents, images, videos, and much more.
machine learning
,
scikit

This is an early developer preview of Python 3.7.
new release

Renko charts are time independent and are efficient to trade as they eliminate noise. In this article we see how to plot renko charts of any instrument with OHLC data using Python.
renko

This tutorial will walk through using Google Cloud Speech API to transcribe a large audio file.
audio

Essential codes for jump-starting machine learning/data science with Python.
data science
,
machine learning

Additive models for time series modeling
numpy
,
time series

When it comes to natural language generation, people normally think of advanced AI systems using advanced mathematics; however, that is not always true. In this post, I will be using the idea of Markov chains and a small dataset of quotes to generate new quotes.
markov chain

In this lesson, you will be introduced to Python generators. You will see how a generator can replace a common function and learn the benefits of doing so. You will learn what role the yield keyword provides in functions and how it differs from a return. Building on that knowledge, you will learn how to build a generator to recursively crawl an API (swapi.co) and return Star Wars characters from "The Force Awakens".
generators



Projects

like-me - 87 Stars, 0 Fork
When no one will follow you, you can do it yourself.

pipenvlib - 67 Stars, 7 Fork
A library for manipulating Pipenv projects.

ftfy-web - 45 Stars, 2 Fork
Paste in some broken unicode text and FTFY will tell you how to fix it!

crypto_lamp - 29 Stars, 5 Fork
A python script for smart lightbulbs to indicate how badly you're losing money

ketchum - 18 Stars, 1 Fork
Use word vectors to interactively generate lists of similar words

pipenv-sublime - 12 Stars, 0 Fork
A Sublime plugin for Pipenv.

moments_models - 12 Stars, 2 Fork
The pretrained models trained on Moments in Time Dataset

pyorphan - 5 Stars, 1 Fork
PyOrphan show suggestion of unused code in your python project.

python-deduckt - 4 Stars, 0 Fork
Runtime type inference for Python

watch-plz - 4 Stars, 1 Fork
Ensure all of your repositories are watched.

January 15, 2018 01:56 PM


Mike Driscoll

PyDev of the Week: Christy Heaton

This week we welcome Christy Heaton (@christytoes) as our PyDev of the Week! Christy is a blogger for the Python Software Foundation. You can see what she’s up to via her Github page or by checking out her website. Let’s take some time to get to know her better!

Can you tell us a little about yourself (hobbies, education, etc):

I studied Anthropology and later Geographic Information Systems (GIS). GIS was the perfect field for me because it brought together my interest in people, technology, and mapping. I now work as a GIS project manager and am a GIS and Python instructor at the University of Washington. In terms of hobbies, I love bringing people together with common interests which is why I help to organize PyLadies Seattle and Maptime Seattle. I’m also a blogger for the Python Software Foundation.

Why did you start using Python?

While starting my career as a GIS Analyst, I was interested in making my work easier, faster, and more accurate, and in extending the functionality of my mapping software. My GIS application had a Python API built right in, so it was really the only choice of programming language. As I started to incorporate scripting into my workflows (and impressing my boss), I began to realize the full potential of Python, and now I use it for all kinds of things, from GIS workflows to testing web services to building web applications.

What other programming languages do you know and which is your favorite?

I use SQL a lot and have dabbled in R and JavaScript. But Python is my favorite by far.

What projects are you working on now?

I use Python at work for testing and automation. I am also working on curriculum development for the Certificate in Python Programming for the University of Washington.

Which Python libraries are your favorite (core or 3rd party)?

I really like requests. It makes the things I want to do related to checking services or web scraping so easy and clean. Jupyter notebooks are another of my favorites. I have found them extremely useful for testing out open source mapping libraries since they allow you to create maps with just a few lines of code, right there in the notebook! GeoPandas is my go-to open source GIS library.

Is there anything else you’d like to say?

Thanks!

Thank you for doing the interview!

January 15, 2018 01:30 PM


Amjith Ramanujam

Python Profiling

I did a presentation at our local Python User Group meeting tonight. It was well received, but shorter than I had expected. I should've added a lot more code examples. 

We talked about the usage of cProfile, pstats, runsnakerun, and timeit.
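A minimal sketch of cProfile and pstats in action (not from the talk, just for illustration):

import cProfile
import pstats

def work():
    return sum(i * i for i in range(100000))

cProfile.run('work()', 'stats.out')        # profile the call and save raw stats
p = pstats.Stats('stats.out')
p.sort_stats('cumulative').print_stats(5)  # show the 5 most expensive calls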

Here are the slides from the presentation:

The slides were done using latex-beamer, but I wrote them in reStructuredText and used rst2beamer to create the tex file, which was then converted to PDF using pdflatex.

The source code for the slides is available on GitHub.

January 15, 2018 11:32 AM

Memoization Decorator

Recently I had the opportunity to give a short 10-minute presentation on the memoization decorator at our local UtahPython Users Group meeting.

Memoization: 

  • Every time a function is called, save the result in a cache (a map).
  • The next time the function is called with the exact same arguments, return the value from the cache instead of running the function.

The code for a memoization decorator for Python is here: http://wiki.python.org/moin/PythonDecoratorLibrary#Memoize
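A minimal sketch of such a decorator (following the same idea as the recipe linked above, not a verbatim copy):

import functools

def memoized(func):
    cache = {}
    @functools.wraps(func)
    def wrapper(*args):
        if args not in cache:          # compute only on a cache miss
            cache[args] = func(*args)
        return cache[args]             # otherwise return the saved result
    return wrapper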

Example:

The typical recursive implementation of Fibonacci calculation is pretty inefficient: O(2^n).

def fibonacci(num):
    print('fibonacci(%d)' % num)
    if num in (0, 1):
        return num
    return fibonacci(num-1) + fibonacci(num-2)

>>> fibonacci(4)  # 9 function calls
fibonacci(4)
fibonacci(3)
fibonacci(2)
fibonacci(1)
fibonacci(0)
fibonacci(1)
fibonacci(2)
fibonacci(1)
fibonacci(0)
3

But the memoized version makes it ridiculously efficient O(n) with very little effort.

from memoized import memoized  # the decorator from the recipe linked above, saved as memoized.py

@memoized
def fibonacci(num):
    print('fibonacci(%d)' % num)
    if num in (0, 1):
        return num
    return fibonacci(num-1) + fibonacci(num-2)
    
>>> fibonacci(4)  # 5 function calls
fibonacci(4)
fibonacci(3)
fibonacci(2)
fibonacci(1)
fibonacci(0)
3

We just converted an algorithm from exponential to linear complexity by simply adding the memoization decorator.

Slides:

Presentation:

I generated the slides using LaTeX Beamer. But instead of writing raw LaTeX code, I wrote the slides in reStructuredText (rst) and used the rst2beamer script to generate the .tex file.

Source:

The rst and tex files are available on GitHub.

https://github.com/amjith/User-Group-Presentations/tree/master/memoization_de...

 

January 15, 2018 11:31 AM


Semaphore Community

Writing, Testing, and Deploying a Django API to Heroku with Semaphore

This article is brought with ❤ to you by Semaphore.

Introduction

In this tutorial, you will learn how to write and deploy a Django API to Heroku using Semaphore. You'll also learn how to run Django tests on Semaphore, and how to use Heroku pipelines and Heroku review apps with Semaphore. If you'd like to use Semaphore to deploy to other platforms, you can find guides to setting up automatic and manual deployment in Semaphore documentation.

The API we will build is a very simple movie API, which will have CRUD routes for operations on movies.

A movie object will have this very simple representation:

{
  "name": "A movie",
  "year_of_release": 2012
}

The routes we will be implementing are:

/movies - GET & POST
/movies/<pk> - GET, PUT & DELETE

To keep it simple, we won't implement any authentication in our API.

Prerequisites

To follow this tutorial, you need to have the following installed on your machine:

You'll also need to have Github, Semaphore, and Heroku accounts.

Note: we won't cover how to use Git or Github in this tutorial. For more information on that, this is a good place to start.

Setting Up the Environment

Create a GitHub repo. We'll name it movies-api. Make sure to add a Python .gitignore before clicking Create Repository.

After that, clone it to your local machine and cd into it.

Once inside the movies-api directory, we are going to create a few branches.

git branch staging && git push --set-upstream origin staging
git branch develop && git push --set-upstream origin develop
git checkout -b ft-api && git push --set-upstream origin ft-api

The last command, which created the ft-api branch, also moved us to it. We should now be on the ft-api branch, ready to start.

Now let's create a Python virtual environment and install the dependencies we need.

python3 -m venv venv

That command creates an environment called venv which is already ignored in our .gitignore.

Next, start the environment.

source venv/bin/activate

After that, we'll install the libraries we will be using, Django and Django Rest Framework for the API.

pip install django djangorestframework gunicorn

Create the requirements.txt file.

pip freeze > requirements.txt

Next, let's create our Django project, and simply name it movies.

django-admin startproject movies

cd into movies and create an application called api.

./manage.py startapp api

That's it for the setup. For your reference, we'll be using Django v1.11 and Django Rest Framework v3.6.3.

Writing tests

If you inspect the directory structure of movies-api, you should see something resembling this:

movies-api/
├── movies/
│   ├── api/
│   │   ├── migrations/
│   │   ├── __init__.py
│   │   ├── admin.py
│   │   ├── apps.py
│   │   ├── models.py
│   │   ├── tests.py
│   │   └── views.py
│   ├── movies/
│   │   ├── __init__.py
│   │   ├── settings.py
│   │   ├── urls.py
│   │   └── wsgi.py
│   └── manage.py
├── venv/
├── .gitignore
└── LICENSE

We shall be working mostly in the inner movies folder, where manage.py is located. If you are not in it, cd into it now.

First, we'll register all the applications we introduced under INSTALLED_APPS in settings.py.

# movies-api/movies/movies/settings.py

INSTALLED_APPS = [
    ...
    'rest_framework', # add this
    'api' # add this
]

In the api application folder, create the files urls.py and serializers.py.

Also, delete the tests.py file and create a tests folder. Inside the tests folder, create test_models.py and test_views.py. Make sure to add an __init__.py file as well.

Once done, your api folder should have the following structure:

api/
├── migrations/
├── tests/
│   ├── __init__.py
│   ├── test_models.py
│   └── test_views.py
├── __init__.py
├── admin.py
├── apps.py
├── models.py
├── serializers.py
├── urls.py
└── views.py

Let's add the tests for the movie model we'll write inside test_models.py.

# movies-api/movies/api/tests/test_models.py

from django.test import TestCase

from api.models import Movie


class TestMovieModel(TestCase):
    def setUp(self):
        self.movie = Movie(name="Split", year_of_release=2016)
        self.movie.save()

    def test_movie_creation(self):
        self.assertEqual(Movie.objects.count(), 1)

    def test_movie_representation(self):
        self.assertEqual(self.movie.name, str(self.movie))

The model tests simply create a Movie record in the setUp method. We then test that the movie was saved successfully to the database.

We also test that the string representation of the movie is its name.

We shall add the tests for the views which will be handling our API requests inside of test_views.py.

# movies-api/movies/api/tests/test_views.py

from django.shortcuts import reverse

from rest_framework.test import APITestCase

from api.models import Movie


class TestMovieApi(APITestCase):
    def setUp(self):
        # create movie
        self.movie = Movie(name="The Space Between Us", year_of_release=2017)
        self.movie.save()

    def test_movie_creation(self):
        response = self.client.post(reverse('movies'), {
            'name': 'Bee Movie',
            'year_of_release': 2007
        })

        # assert new movie was added
        self.assertEqual(Movie.objects.count(), 2)

        # assert a created status code was returned
        self.assertEqual(201, response.status_code)

    def test_getting_movies(self):
        response = self.client.get(reverse('movies'), format="json")
        self.assertEqual(len(response.data), 1)

    def test_updating_movie(self):
        response = self.client.put(reverse('detail', kwargs={'pk': 1}), {
            'name': 'The Space Between Us updated',
            'year_of_release': 2017
        }, format="json")

        # check info returned has the update
        self.assertEqual('The Space Between Us updated', response.data['name'])

    def test_deleting_movie(self):
        response = self.client.delete(reverse('detail', kwargs={'pk': 1}))

        self.assertEqual(204, response.status_code)

For the views, we have four main test cases. We test that a POST to movies/ creates a movie record successfully, and that a GET to movies/ returns the correct result. We also test that PUT and DELETE requests to movies/<pk> return the correct data and status codes.

You can run the tests using manage.py:

python manage.py test

You should see a lot of errors, 6 to be exact. Don't worry, we'll be fixing them in the following sections, in a TDD manner.

Defining the Routes

Let's define the URLs for the API.

We are going to start by editing movies-api/movies/movies/urls.py to look as follows:

# movies-api/movies/movies/urls.py

...
from django.conf.urls import url, include # add include as an import here
from django.contrib import admin

urlpatterns = [
    url(r'^admin/', admin.site.urls),
    url(r'^api/v1/', include('api.urls')) # add this line
]

The modifications tell Django that any request whose path starts with api/v1 should be routed to the api application and handled there.

Now let's go to the urls.py you created inside the api application folder and add this to it:

# movies-api/movies/api/urls.py

from django.conf.urls import url

from api.views import MovieCreateView, MovieDetailView

urlpatterns = [
    url(r'^movies/$', MovieCreateView.as_view(), name='movies'),
    url(r'^movies/(?P<pk>[0-9]+)$', MovieDetailView.as_view(), name='detail'),
]

Simply put, we have defined two URLs: api/v1/movies/, which will use the MovieCreateView view, and api/v1/movies/<pk>, which will use the MovieDetailView view. Note that the named group in the detail URL must be pk, since that is the keyword argument both the generic views and our tests' reverse('detail', kwargs={'pk': 1}) calls expect.

The next section will focus on building the movie models & views.

Building the Views

Let's start with the model definition in models.py.

We are going to be storing only the movie's name and year_of_release. Our very simple model should look something like this:

# movies-api/movies/api/models.py

from django.db import models

class Movie(models.Model):
    name = models.CharField(max_length=100)
    year_of_release = models.PositiveSmallIntegerField()

    def __str__(self):
        return self.name

Once you have created the model, go to your terminal and make new migrations:

./manage.py makemigrations

Then, run the migrations:

./manage.py migrate

Running the tests at this point using ./manage.py test should result in only 4 errors since the 2 tests we wrote for the model are now satisfied.

Let's now move to the views. We will first need to create the serializer for the model in serializers.py. Django Rest Framework will use that serializer when serializing Django querysets to JSON.

# movies-api/movies/api/serializers.py

from rest_framework.serializers import ModelSerializer

from api.models import Movie

class MovieSerializer(ModelSerializer):
    class Meta:
        model = Movie
        fields = ('id', 'name', 'year_of_release')
        extra_kwargs = {
            'id': {'read_only': True}
        }

We are using Rest Framework's ModelSerializer. We pass our Movie model to it and specify the fields we would like to be serialized.

We also specify that id will be read only because it is system generated, and not required when creating new records.
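
To get a feel for what the serializer produces, you can try it out in the Django shell (./manage.py shell). The session below is purely illustrative; the id you see will depend on what is already in your database:

>>> from api.models import Movie
>>> from api.serializers import MovieSerializer
>>> movie = Movie.objects.create(name="Split", year_of_release=2016)
>>> dict(MovieSerializer(movie).data)
{'id': 1, 'name': 'Split', 'year_of_release': 2016}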

Let's finish by defining the views inside views.py. We will be using Rest Framework's generic views.

# movies-api/movies/api/views.py

from rest_framework.generics import ListCreateAPIView, RetrieveUpdateDestroyAPIView

from api.models import Movie
from api.serializers import MovieSerializer


class MovieCreateView(ListCreateAPIView):
    queryset = Movie.objects.all()
    serializer_class = MovieSerializer

class MovieDetailView(RetrieveUpdateDestroyAPIView):
    queryset = Movie.objects.all()
    serializer_class = MovieSerializer

In short, we are using ListCreateAPIView to allow GET and POST and RetrieveUpdateDestroyAPIView to allow GET, PUT and DELETE.

The queryset defines how the view should access objects from the database. The serializer_class attribute defines which serializer the view should use.

At this point, our API is complete. If you run the test cases, you should see 6 successful test cases.
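
If you'd also like to poke at the endpoints by hand, here is a quick sketch using the requests package (not one of this project's dependencies, so you'd need to pip install requests first) against a local ./manage.py runserver:

import requests

BASE = 'http://localhost:8000/api/v1'

# create a movie (the API should answer 201 with the created record)
resp = requests.post(BASE + '/movies/', data={'name': 'Arrival', 'year_of_release': 2016})
print(resp.status_code, resp.json())

# list all movies
print(requests.get(BASE + '/movies/').json())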


You can also run ./manage.py runserver and point your browser to http://localhost:8000/api/v1/movies/ to play with Django Rest Framework's web browsable API.

Web Browsable API Interface

Lastly, we need to make sure that our code is deployable to Heroku.

Create a file called Procfile in the root of your application, i.e. in the movies-api folder. Inside it, add this:

web: gunicorn movies.wsgi --pythonpath=movies --log-file -

Make sure all your code is committed and pushed to GitHub on the ft-api branch.

Running Tests on Semaphore

First, sign up for a free Semaphore account if you don’t have one already.

Log in to your Semaphore account then click Add new project.

Add new project

We'll add the project from GitHub, but Semaphore supports Bitbucket as well.

After that, select the repository from the list presented, and then select the branch ft-api.

Once the analysis is complete, you will see an outline of the build plan. We'll customize it to look like this:

Build plan

Note that we're using Python v3.6 here, and our Job commands are cd movies && python manage.py test.

After that, scroll down and click Build with these settings.

Your tests should run and pass successfully.

After that, go to GitHub and merge ft-api into the develop branch. Delete the ft-api branch. Then merge develop into staging, and then staging into master.

At this point, you should have the develop, staging and master branches with the same up-to-date code, and no ft-api branch.

Go to your movies-api project page on Semaphore and click the + button to see a list of your available branches.

Build plan1

Then, select each and run builds for them.

Build plan2

You should now have 3 successful builds for those branches.

Build plan3

Deploying to Heroku

Semaphore makes deploying to Heroku very simple. You can read a shorter guide on deploying to Heroku from Semaphore here.

First of all, create two applications in your Heroku account, one for staging and one for production (i.e. movie-api-staging and movie-api-prod in our case).

Make sure to disable collectstatic by setting DISABLE_COLLECTSTATIC=1 in the config for both applications.

Disable Collectstatic

Things to note:

  1. You will have to choose different application names from the ones above.
  2. You will have to add the hostnames of your two applications to ALLOWED_HOSTS in settings.py, so that Django will accept requests to those hosts.

Edit movies-api/movies/movies/settings.py:

# movies-api/movies/movies/settings.py

...
ALLOWED_HOSTS = ['your-staging-app.herokuapp.com', 'your-production-app.herokuapp.com']

Then, push your changes to GitHub and update your branches accordingly.

From the movies-api page on your Semaphore account, click Set Up Deployment.

Deployment

Select Heroku on the next screen.

Pick Heroku

We will be going with the Automatic deployment option.

Select Automatic

Next, let's deploy the staging branch.

Select Staging

The next page needs your Heroku API key. You can find it under the account page in your Heroku account.

API Key

Once you have entered the API key, you will see a list of your available Heroku applications. We are deploying the staging version so select the application you created for staging.

After that, give your server a name and create it.

Server Name

On the next page, click the Edit Server button. Make sure to edit the server deploy commands to look like the following before deploying:

Deployment

Your staging application should now be deployed to your-staging-app-name.herokuapp.com.

On your movies-api project page, you can see the staging server was deployed.

Add Server

Click on the + to add a new server for production, and then follow the same procedure to deploy the master branch to your production application.

Working with Heroku Pipeline and Review App

Go to your Heroku account and create a new Pipeline. Then, attach the GitHub repo for movies-api to it.

Create Pipeline

Once you've attached the correct GitHub repo, click Create Pipeline.

In the next page, add the staging application you created to the staging section and the existing production application to the production section.

Add Apps

Next, enable Review Apps by clicking the Enable Review Apps... button.

Review Apps

We are going to use the staging application as the parent, i.e. config variables from the staging application will be used for Review Apps.

Review App Parent

The next page contains the configuration options for defining the app.json file that will specify how the review application is to be created.

You can leave it as is and click Commit to Repo to have it committed to Github.

Finally, you can enable review apps to create new apps automatically for every PR or destroy them automatically when they become stale.

Configure Review App

From now on, every PR to staging will spin up a review application automatically. The pipeline will easily enable promoting applications from review to staging, and to production.

Conclusion

In this tutorial, we covered how to write a Django and Django Rest Framework API, and how to test it.

We also covered how to use Semaphore to run Django tests and continuously deploy an application to Heroku.

You can find the code we wrote in this tutorial in this GitHub repository. Feel free to leave any comments or questions you may have in the comment section below.

This article is brought to you with ❤ by Semaphore.

January 15, 2018 09:39 AM


Techiediaries - Django

Building a Fake and JWT Protected REST API with json-server

More often than not, when you are building a front-end application with libraries like React, Vue or Angular, you'll need to work with a back-end API which may not be ready at that time, so you'll have to build a mock API to develop against, which can be time consuming. This is where json-server comes in: a simple Node.js server that allows you to create fully working REST APIs in a matter of minutes, without the hassle of installing and configuring a database system. You can even add JWT authentication to your endpoints using jsonwebtoken, with only a few lines of code.

In this tutorial we'll learn by example how to quickly create a REST API and add JWT authentication. We'll also see how to use faker.js to quickly generate fake data for our API.

Requirements

Before you can use json-server, you'll need a development machine with Node.js and NPM installed. You'll optionally want cURL or Postman installed so you can test your API.

You can install Node.js and NPM from the official website.

Installing json-server

Head over to your terminal then run the following command:

npm install -g json-server

Depending on your npm configuration you may need to add sudo before your install command to be able to install packages globally.

You can also install json-server locally by generating a new Node.js module using:

mkdir myproject
cd myproject
npm init

Enter the required details and hit OK to generate a new package.json file in your current folder.

You can then install json-server locally:

npm install json-server --save

Creating API Endpoints

To create your API endpoint(s), you only need to create a JSON file with your data. For example, let's create an API with a /products endpoint.

Create a file called db.json and add the following content:

{
  "products": [
    {
      "id": 1,
      "name": "Product001",
      "cost": 10.0,
      "quantity": 1000
    },
    {
      "id": 2,
      "name": "Product002",
      "cost": 20.0,
      "quantity": 2000
    },
    {
      "id": 3,
      "name": "Product003",
      "cost": 30.0,
      "quantity": 3000
    },
    {
      "id": 4,
      "name": "Product004",
      "cost": 40.0,
      "quantity": 4000
    }
  ]
}

This file acts as the database for your API.

Now run json-server with:

json-server --watch db.json

That's all you need to create your API based on the data you added to db.json. You can now create, read, update and delete products on this server, with advanced features such as pagination, sorting and filtering available out of the box, just as you would expect from a real API server.

Data pagination

You can query paginated data from your API endpoint by adding a _page parameter to your request. For example:

curl -X GET "http://localhost:3000/products?_page=1"

This will send a GET request to read the first page.

Filtering data

You can also add filters to get filtered data by simply appending the filters to your endpoint. For example:

curl -X GET "http://localhost:3000/products?name=Product004&cost=30"

& can be used to combine multiple filters.

Sorting data

You can return sorted data from your endpoint by using _sort and _order parameters. For example:

curl -X GET "http://localhost:3000/products?_sort=name&order=DESC"

You can find more features by visiting the documentation.

Generate Mock Data

You can either add data to your JSON file manually, which can be a tedious task, or, better, use a tool to automatically generate fake data for json-server, which is a more practical approach.

The tool we are going to use is faker.js.

Head over to your terminal and start by installing the package from npm using:

npm install faker

Then create a JavaScript file; you can name it whatever you want. Let's call it generateData.js:

var faker = require('faker');

var database = { products: [] };

for (var i=1; i<=1000; i++) {
  database.products.push({
    id: i,
    name: faker.random.words(),
    cost: Math.random()*100,
    quantity: Math.random()*1000
  });
}

console.log(JSON.stringify(database));

We're using a for loop to create 1000 fake products with fake names, costs and quantities.

Now all you need to do is run this script and redirect its output into your db.json file using:

node generateData.js > db.json

Adding JWT Authentication

json-server provides many real-world API features such as pagination and sorting. In most real-world scenarios, however, you'll also have JWT authentication, which json-server doesn't provide out of the box but which you can easily add with a few lines of code. So let's see how we can protect our fake API endpoint(s) using the jsonwebtoken package.

First, start by installing jsonwebtoken, along with body-parser, which we'll also be requiring:

npm install jsonwebtoken body-parser --save

Next, you need to create a server.js file inside your folder, then follow these steps.

First, require the modules you'll need, including jsonwebtoken and json-server:

const fs = require('fs')
const bodyParser = require('body-parser')
const jsonServer = require('json-server')
const jwt = require('jsonwebtoken')

Next use the create() method to return an Express server

const server = jsonServer.create()

Call the router() method to return an Express router

const router = jsonServer.router('./db.json')

Now you need to read and JSON-parse the users.json file, which you first need to create. This file acts as a table of registered users.

const userdb = JSON.parse(fs.readFileSync('./users.json', 'UTF-8'))

Make sure to create users.json, add some users, and save it:

{
    "users": [
      {
        "id": 1,
        "name": "bruno",
        "email": "bruno@email.com",
        "password": "bruno"
      },
      {
        "id": 2,
        "name": "nilson",
        "email": "nilson@email.com",
        "password": "nilson"
      }
    ]
  }

Next, set default middlewares (logger, static, cors and no-cache)

server.use(jsonServer.defaults());

You can also add your own middleware on top of the defaults:

server.use(bodyParser.urlencoded({extended: true}))
server.use(bodyParser.json())

Next, define two constants: SECRET_KEY, used to sign the payloads, and expiresIn, which sets the expiration time of the JWT access tokens.

const SECRET_KEY = '123456789'
const expiresIn = '1h'

Add the following functions:

// Create a token from a payload 
function createToken(payload){
  return jwt.sign(payload, SECRET_KEY, {expiresIn})
}

// Verify the token 
function verifyToken(token){
  return  jwt.verify(token, SECRET_KEY, (err, decode) => decode !== undefined ?  decode : err)
}

// Check if the user exists in database
function isAuthenticated({email, password}){
  return userdb.users.findIndex(user => user.email === email && user.password === password) !== -1
}

Now you need to create a POST /auth/login endpoint, which verifies that the user exists in the database and then creates and sends a JWT token back to the user:

server.post('/auth/login', (req, res) => {
  const {email, password} = req.body
  if (isAuthenticated({email, password}) === false) {
    const status = 401
    const message = 'Incorrect email or password'
    res.status(status).json({status, message})
    return
  }
  const access_token = createToken({email, password})
  res.status(200).json({access_token})
})

Next, add an Express middleware that checks that the authorization header uses the Bearer scheme and then verifies that the token is valid. It is applied to all routes except the previous one, since that is the route we use to log users in.

server.use(/^(?!\/auth).*$/,  (req, res, next) => {
  if (req.headers.authorization === undefined || req.headers.authorization.split(' ')[0] !== 'Bearer') {
    const status = 401
    const message = 'Bad authorization header'
    res.status(status).json({status, message})
    return
  }
  try {
     verifyToken(req.headers.authorization.split(' ')[1])
     next()
  } catch (err) {
    const status = 401
    const message = 'Error: access_token is not valid'
    res.status(status).json({status, message})
  }
})

Finally, mount json-server and run the server on port 3000 using:

server.use(router)

server.listen(3000, () => {
  console.log('Run Auth API Server')
})

You can also mount json-server on a specific endpoint (/api) using:

server.use('/api', router)

That's it! You now have a protected API. Let's add two npm scripts to run the server.

Open your package.json file, then add these two scripts:

  "scripts": {
    "start": "json-server --watch ./db.json",
    "start-auth": "node server.js"
  },

The start script runs json-server normally, without any authentication.

The start-auth script runs our server.js script.

Now head back to your terminal and run:

npm run start-auth
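
To confirm the protection works end to end, here is an illustrative Python snippet (using the requests package, which you'd need to install separately) that logs in as one of the users from users.json and then calls the API with and without the token:

import requests

# obtain a token from the login endpoint
resp = requests.post('http://localhost:3000/auth/login',
                     json={'email': 'bruno@email.com', 'password': 'bruno'})
token = resp.json()['access_token']

# a request with a Bearer token goes through (200)
headers = {'Authorization': 'Bearer ' + token}
print(requests.get('http://localhost:3000/products', headers=headers).status_code)

# a request without the authorization header is rejected (401)
print(requests.get('http://localhost:3000/products').status_code)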

You can find the source code for this example in this GitHub repository.

Conclusion

You are now ready to prototype your front-end web application without worrying about APIs or data. You can also add JWT authentication to your mock API endpoints to simulate more real world scenarios. Have fun!

January 15, 2018 12:00 AM

January 14, 2018


Catalin George Festila

The trinket website for learning.

This website comes with the following features:
Trinket lets you run and write code in any browser, on any device.
Trinkets work instantly, with no need to log in, download plugins, or install software.
Easily share or embed the code with your changes when you're done.

January 14, 2018 09:08 AM


Bhishan Bhandari

Automated chat using python – Automation using selenium and python

Putting two clever bots into conversation. Keeping the promise to come up with a nice article, I present to you two bots in conversation. This week I’ve bridged two clever bots for a nice conversation. Well, starting with the background for this article, I had an assignment to print out a conversation of mine with cleverbot. […]

The post Automated chat using python – Automation using selenium and python appeared first on The Tara Nights.

January 14, 2018 05:55 AM


Weekly Python StackOverflow Report

(cviii) stackoverflow python report

These are the ten most rated questions at Stack Overflow last week.
Between brackets: [question score / answers count]
Build date: 2018-01-14 05:01:04 GMT


  1. Printing without parentheses varying error message using Python 3 - [21/4]
  2. How do you generalise the creation of a list with many variables and conditions of `if`? - [12/4]
  3. isclose function in numpy is different from math - [12/3]
  4. Explode/stack a Series of strings - [8/6]
  5. Do sometimes python IDLE interactive shell make a mistake? - [8/0]
  6. Why is the dictionary key being converted to an inherited class type? - [7/2]
  7. Pandas: Knowing when an operation affects the original dataframe - [7/0]
  8. Inserting list values from a list to another in a specific order in Python - [6/6]
  9. How to sort a list and handle None values properly? - [6/3]
  10. Convert list of ordered dict to nested lists - [6/2]

January 14, 2018 05:02 AM