
Planet Python

Last update: January 15, 2021 04:46 PM UTC

January 15, 2021


Python Pool

Unboxing the Python Tempfile Module

Hello geeks, and welcome. In this article, we will cover the Python tempfile module. For a better overall understanding, we will look at its syntax and parameters, and then apply the theory through a couple of examples. tempfile is a standard library module used to create temporary files and directories. These files come in really handy when we don’t wish to store data permanently, for instance when working with massive intermediate data. They are created with unique names and stored at a default location that varies from OS to OS. For instance, on Windows, the temp folder resides in profile/AppData/Local/Temp, while other systems use different locations.

Creating a Temporary File

import tempfile 
  
file = tempfile.TemporaryFile() 
print(file) 
print(file.name)

Output:

<_io.BufferedRandom name=3>
3

Here, we can see how to create a temporary file using the tempfile module. First, we import the tempfile module, then we define a variable and call TemporaryFile() to create the file. After that, we use print twice: first to show the file object, and second to show its name attribute. Note that this file has no visible file name: on the system used here, name is just the integer file descriptor (3), and the value may vary from run to run.

Creating a Named Temporary File

import tempfile

file = tempfile.NamedTemporaryFile()
print(file)
print(file.name)

Output:

<tempfile._TemporaryFileWrapper object at 0x000002756CC7DC40>
C:\Users\KIIT\AppData\Local\Temp\tmpgnp482wy

Here we create a named temporary file. The only difference, which is quite evident, is that instead of TemporaryFile, we have used NamedTemporaryFile. A random file name is allotted, but unlike the previous case, it is clearly visible. We can also verify the profile/AppData/Local/Temp structure mentioned earlier for Windows. So far, we have seen how to create a temporary file and a named temporary file.
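A minimal sketch of the same call used as a context manager, which is a common way to make sure the file gets closed. It assumes the default delete=True, and the variable names are only illustrative:

```python
import os
import tempfile

# The with block closes the file on exit; with the default delete=True,
# closing also removes the file from disk.
with tempfile.NamedTemporaryFile() as tmp:
    path = tmp.name
    exists_inside = os.path.exists(path)

exists_after = os.path.exists(path)
print(exists_inside, exists_after)
```

The path is only useful while the file is still open, so anything that needs the file by name should run inside the with block.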

Creating a Temporary Directory

import tempfile
temp_dir = tempfile.TemporaryDirectory()  # named temp_dir to avoid shadowing the built-in dir()
print(temp_dir)

Above, we have created a temporary directory. A directory is a file system structure that holds files and other directories. There is just a minute change in syntax compared to creating a temporary file: instead of TemporaryFile, we have used TemporaryDirectory.
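As a hedged sketch, a temporary directory can also be used as a context manager, in which case the directory and its contents are removed when the block ends. The file name report.txt is a hypothetical example:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Inside the block, tmp_dir is the directory path as a string
    report_path = os.path.join(tmp_dir, "report.txt")  # hypothetical file name
    with open(report_path, "w") as report:
        report.write("scratch data")
    created = os.path.isfile(report_path)

# Leaving the with block removes the directory and everything in it
removed = not os.path.exists(tmp_dir)
print(created, removed)
```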

Reading and Writing to a Temporary File

import tempfile 
  
file = tempfile.TemporaryFile() 
file.write(b'WELCOME TO PYTHON PPOOL') 
file.seek(0) 
print(file.read()) 
  
file.close()

Output:

b'WELCOME TO PYTHON PPOOL'

Above, we can see how to read from and write to a temporary file. We first create a temporary file, then use the write function to write data to it. You must be wondering what ‘b’ is doing there. Temporary files are opened in binary mode ('w+b') by default, so write() expects bytes, and the ‘b’ prefix makes the literal a bytes object rather than a string. Next, the seek function is called to move the file pointer back to the start of the file. Then we use the read function to read the content of the temporary file.
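If working with plain strings is more convenient, the file can be opened in text mode instead. This is a small sketch that passes mode and encoding explicitly, with the text content chosen only for illustration:

```python
import tempfile

# mode="w+" opens the file for reading and writing text instead of bytes
with tempfile.TemporaryFile(mode="w+", encoding="utf-8") as tmp:
    tmp.write("WELCOME TO PYTHON POOL")
    tmp.seek(0)  # rewind before reading back
    content = tmp.read()

print(content)
```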

Alternative to Python tempfile()

TemporaryFile() is great, but in this section, we will look at one of the alternatives to it. mkstemp() is a lower-level function from the same tempfile module that creates a temporary file in the most secure manner possible: the file is readable and writable only by the user who created it. Further, this file doesn’t get deleted automatically when closed, so the caller is responsible for removing it.

import tempfile 
   
sec_file = tempfile.mkstemp() 
print(sec_file)

Output:

(3, '/tmp/tmp87gc2pz0')

Here we can see how to create a temporary file using mkstemp(). The call looks much the same as TemporaryFile(), but instead of a file object, it returns a tuple holding an OS-level file descriptor (3) and the absolute path of the file.
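Since mkstemp() leaves cleanup to the caller, a typical pattern looks roughly like the sketch below. The suffix and the text written are illustrative assumptions:

```python
import os
import tempfile

# mkstemp returns an open file descriptor and the file's path
fd, path = tempfile.mkstemp(suffix=".txt")
try:
    # Wrap the descriptor in a file object; closing it closes the descriptor
    with os.fdopen(fd, "w") as tmp:
        tmp.write("secure scratch data")

    # Unlike TemporaryFile, the file is still on disk after closing
    still_there = os.path.exists(path)
    with open(path) as tmp:
        content = tmp.read()
finally:
    # Nothing deletes the file for us, so remove it explicitly
    os.remove(path)

gone = not os.path.exists(path)
print(still_there, gone)
```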

General FAQs regarding Python tempfile

1. How to find the path of python tempfile()?

Ans. To get the path of a python tempfile(), you need to create a named tempfile, which has already been discussed. In that case, you get the exact path of the tempfile, as in our case, we get "C:\Users\KIIT\AppData\Local\Temp\tmpgnp482wy".

2. How to perform cleanup for python tempfile()?

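Ans. Python deletes files created with TemporaryFile() and NamedTemporaryFile() automatically once they are closed. Files created with mkstemp() are not deleted automatically and must be removed by the caller.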

3. How to create secure python tempfile()?

Ans. To create a secure temporary file, we can use the mkstemp() function, as discussed above. A file created this way can only be read and written by the user who created it, so anyone else needs your permission to access it.

4. What is the name for python tempfile() ?

Ans. If you create a temporary file, then it has no name, as discussed above. Whereas when you create a Named tempfile, a random name is allocated to it, visible in its path.
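For the questions above about paths and names, a short sketch: gettempdir() reports the default temp location on the current system, and the name attribute of a named temporary file holds its full path. The variable names are illustrative:

```python
import os
import tempfile

# Default directory used for temporary files on this system
temp_root = tempfile.gettempdir()

with tempfile.NamedTemporaryFile() as tmp:
    # tmp.name is the full path; its last component is the random name
    full_path = tmp.name
    random_name = os.path.basename(full_path)

print(temp_root)
print(full_path)
```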

Conclusion

In this article, we covered the Python tempfile module. We looked at creating a temporary file and a temporary directory, how to read and write in a temp file, and an alternative called mkstemp(). We can conclude that this module helps us create temporary files and directories in Python.

I hope this article was able to clear all doubts. But in case you have any unsolved queries feel free to write them below in the comment section. Done reading this, why not read about the argpartition function next.

The post Unboxing the Python Tempfile Module appeared first on Python Pool.

January 15, 2021 03:42 PM UTC


Lucas Cimon

Adding content to existing PDFs with fpdf2

fpdf2, the library I mentioned in my previous post, cannot parse existing PDF files.

However, other Python libraries can be combined with fpdf2 in order to add new content to existing PDF files.

This page provides several examples of doing so using pdfrw, a great zero-dependency pure Python library dedicated …


January 15, 2021 01:11 PM UTC


Real Python

The Real Python Podcast – Episode #43: Deep Reinforcement Learning in a Notebook With Jupylet + Gaming and Synthesis

What is it like to design a Python library for three different audiences? This week on the show, we have Nir Aides, creator of Jupylet. His new library is designed for deep reinforcement learning researchers, musicians interested in live music coding, and kids interested in learning to program. Everything is designed to run inside of a Jupyter notebook.



January 15, 2021 12:00 PM UTC


Zato Blog

New REST programming examples

As we are preparing to release Zato 3.2 soon, all the programming examples are being rewritten to showcase what the platform is capable of. That includes REST examples too and this article presents a few samples taken from the documentation.

For a fuller discussion and more examples - check the documentation.

Calling REST APIs

# -*- coding: utf-8 -*-

# Zato
from zato.server.service import Service

class SetBillingInfo(Service):
    """ Updates billing information for customer.
    """
    def handle(self):

        # Python dict representing the payload we want to send across
        payload = {'billing':'395.7', 'currency':'EUR'}

        # Python dict with all the query parameters, including path and query string
        params = {'cust_id':'39175', 'phone_no':'271637517', 'priority':'normal'}

        # Headers the endpoint expects
        headers = {'X-App-Name': 'Zato', 'X-Environment':'Production'}

        # Obtains a connection object
        conn = self.out.rest['Billing'].conn

        # Invoke the resource providing all the information on input
        response = conn.post(self.cid, payload, params, headers=headers)

        # The response is auto-deserialised for us to a Python dict
        json_dict = response.data

        # Assign the returned dict to our response - Zato will serialise it to JSON
        # and our caller will get a JSON message from us.
        self.response.payload = json_dict

Accepting REST calls

# -*- coding: utf-8 -*-

# Zato
from zato.server.service import Service

class LogInputData(Service):
    """ Logs input data.
    """
    def handle(self):

        # Read input received
        user_id = self.request.payload['user_id']
        user_name = self.request.payload['user_name']

        # Store input in logs
        self.logger.info('uid:%s; username:%s', user_id, user_name)

Reacting to REST verbs

# -*- coding: utf-8 -*-

# Zato
from zato.server.service import Service

class MultiVerb(Service):
    """ Reacts to individual REST verbs.
    """
    def handle_GET(self):
        self.logger.info('I was invoked via GET')

    def handle_POST(self):
        self.logger.info('I was invoked via POST')

Request and response objects

Request object:

# -*- coding: utf-8 -*-

# Zato
from zato.server.service import Service

class RequestObject(Service):

    def handle(self):

        # Here is all input data parsed to a Python object
        self.request.payload

        # Here is input data before parsing, as a string
        self.request.raw_request

        # Correlation ID - a unique ID assigned to this request
        self.request.cid

        # A dictionary of GET parameters
        self.request.http.GET

        # A dictionary of POST parameters
        self.request.http.POST

        # REST method we are invoked with, e.g. GET, POST, PATCH etc.
        self.request.http.method

        # URL path the service was invoked through
        self.request.http.path

        # Query string and path parameters
        self.request.http.params

        # This is a method, not an attribute,
        # it will return form data in case we were invoked with one on input.
        form_data = self.request.http.get_form_data()

        # Username used to invoke the service, if any
        self.channel.security.username

        # A convenience method returning security-related details
        # pertaining to this request.
        sec_info = self.channel.security.to_dict()

Response object:

# -*- coding: utf-8 -*-

# Zato
from zato.server.service import Service

class ResponseObject(Service):

    def handle(self):

        # Returning responses as a dict will make Zato serialise it to JSON
        self.response.payload = {'user_id': '123', 'user_name': 'my.user'}

        # String data can always be returned too,
        # e.g. because you already have data serialised to JSON or to another data format
        self.response.payload = '{"my":"response"}'

        # Sets HTTP status code
        self.response.status_code = 200

        # Sets HTTP Content-Encoding header
        self.response.content_encoding = 'gzip'

        # Sets HTTP Content-Type - note that Zato itself
        # sets it for JSON, you do not need to do it.
        self.response.content_type = 'text/xml; charset=UTF-8'

        # A dictionary of arbitrary HTTP headers to return
        self.response.headers = {
            'Strict-Transport-Security': 'max-age=16070400',
            'X-Powered-By': 'My-API-Server',
            'X-My-Header': 'My-Value',
        }

Next steps

This article is just a quick preview and if you are interested in building scalable and reusable API systems, you can start now by visiting the Zato main page, familiarising yourself with the extensive documentation or by going straight to the first part of the tutorial.

Be sure to visit our Twitter, GitHub and Gitter communities too!

January 15, 2021 10:46 AM UTC


Janusworx

Consolidating Websites

Update: 15/01/2021.
It’s done.
I moved everything and it all seems to be working.
If something is broken, let me know, on [the fediverse], or mailing me at jason at this domain.
To my tech posse, no there is no forwarding of old links happening. The site is too small and I have no time. Nobody is going to miss this.
And to that one little friend, who noticed that the site was dead and actually cried, I love you.

Read more… (1 min remaining to read)

January 15, 2021 03:30 AM UTC

January 14, 2021


Codementor

How I learned Django

About me I am Elvan Celik, who has a bachelor's degree in Computer science and a master's degree in Informational Technologies. I am an expert at Python programming language and I have learned...

January 14, 2021 04:39 PM UTC


Stack Abuse

Introduction to Data Visualization in Python with Pandas

Introduction

People can rarely look at raw data and immediately deduce a data-oriented observation like:

People in stores tend to buy diapers and beer in conjunction!

Or even if you as a data scientist can indeed sight read raw data, your investor or boss most likely can't.

In order for us to properly analyze our data, we need to represent it in a tangible, comprehensible way, which is exactly why we use data visualization!

The pandas library offers a large array of tools that will help you accomplish this. In this article, we'll go step by step and cover everything you'll need to get started with pandas visualization tools, including bar charts, histograms, area plots, density plots, scatter matrices, and bootstrap plots.

Importing Data

First, we'll need a small dataset to work with and test things out.

I'll use an Indian food dataset since frankly, Indian food is delicious. You can download it for free from Kaggle.com. To import it, we'll use the read_csv() method which returns a DataFrame. Here's a small code snippet, which prints out the first five and the last five entries in our dataset. Let's give it a try:

import pandas as pd
menu = pd.read_csv('indian_food.csv')
print(menu)

Running this code will output:

               name            state      region ...  course
0        Balu shahi      West Bengal        East ... dessert
1            Boondi        Rajasthan        West ... dessert
2    Gajar ka halwa           Punjab       North ... dessert
3            Ghevar        Rajasthan        West ... dessert
4       Gulab jamun      West Bengal        East ... dessert
..              ...              ...         ... ...     ...
250       Til Pitha            Assam  North East ... dessert
251         Bebinca              Goa        West ... dessert
252          Shufta  Jammu & Kashmir       North ... dessert
253       Mawa Bati   Madhya Pradesh     Central ... dessert
254          Pinaca              Goa        West ... dessert

If you want to load data from another file format, pandas offers similar read methods like read_json(). The view is slightly truncated due to the long-form of the ingredients variable.

To extract only a few selected columns, we can subset the dataset via square brackets and list the column names that we'd like to focus on:

import pandas as pd

menu = pd.read_csv('indian_food.csv')
recipes = menu[['name', 'ingredients']]
print(recipes)

This yields:

               name                                        ingredients
0        Balu shahi                    Maida flour, yogurt, oil, sugar
1            Boondi                            Gram flour, ghee, sugar
2    Gajar ka halwa       Carrots, milk, sugar, ghee, cashews, raisins
3            Ghevar  Flour, ghee, kewra, milk, clarified butter, su...
4       Gulab jamun  Milk powder, plain flour, baking powder, ghee,...
..              ...                                                ...
250       Til Pitha            Glutinous rice, black sesame seeds, gur
251         Bebinca  Coconut milk, egg yolks, clarified butter, all...
252          Shufta  Cottage cheese, dry dates, dried rose petals, ...
253       Mawa Bati  Milk powder, dry fruits, arrowroot powder, all...
254          Pinaca  Brown rice, fennel seeds, grated coconut, blac...

[255 rows x 2 columns]
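To try column selection without downloading the dataset, here is a small sketch that reads a hypothetical three-row stand-in for indian_food.csv from an in-memory string. The prep and cook times in it are made-up values for illustration:

```python
import io
import pandas as pd

# Hypothetical three-row stand-in for indian_food.csv (times are made up)
csv_text = """name,ingredients,prep_time,cook_time
Balu shahi,"Maida flour, yogurt, oil, sugar",45,25
Boondi,"Gram flour, ghee, sugar",80,30
Gajar ka halwa,"Carrots, milk, sugar, ghee",15,60
"""

menu = pd.read_csv(io.StringIO(csv_text))

# A list of column names inside the brackets returns a new DataFrame
recipes = menu[['name', 'ingredients']]
print(recipes.shape)  # (3, 2)
```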

Plotting Bar Charts with Pandas

The classic bar chart is easy to read and a good place to start - let's visualize how long it takes to cook each dish.

Pandas relies on the Matplotlib engine to display generated plots. So we'll have to import Matplotlib's PyPlot module to call plt.show() after the plots are generated.

First, let's import our data. There are a lot of dishes in our data set - 255 to be exact. This won't really fit into a single figure while staying readable.

We'll use the head() method to extract the first 10 dishes, and extract the variables relevant to our plot. Namely, we'll want to extract the name and cook_time for each dish into a new DataFrame called name_and_time, and truncate that to the first 10 dishes:

import pandas as pd
import matplotlib.pyplot as plt

menu = pd.read_csv('indian_food.csv')

name_and_time = menu[['name','cook_time']].head(10)

Now we'll use the bar() method to plot our data:

DataFrame.plot.bar(x=None, y=None, **kwargs)

Many additional parameters can be passed to further customize the plot, such as rot for label rotation, legend to add a legend, style, etc...

Many of these arguments have default values, and most optional features are turned off by default. For bar plots, the X-axis labels are rotated by 90 degrees unless the rot argument says otherwise. Let's change that to 30 while constructing the plot:

name_and_time.plot.bar(x='name',y='cook_time', rot=30)

And finally, we'll call the show() function from the PyPlot module to display our graph:

plt.show()

This will output our desired bar chart:
[Figure: bar chart pandas]

Plotting Multiple Columns on Bar Plot's X-Axis in Pandas

Oftentimes, we might want to compare two variables in a Bar Plot, such as the cook_time and prep_time. These are both variables corresponding to each dish and are directly comparable.

Let's change the name_and_time DataFrame to also include prep_time:

name_and_time = menu[['name','prep_time','cook_time']].head(10)
name_and_time.plot.bar(x='name', rot=30)

Pandas automatically assumed that the two numerical values alongside name are tied to it, so it's enough to just define the X-axis. When dealing with other DataFrames, this might not be the case.

If you need to explicitly define which other variables should be plotted, you can simply pass in a list:

name_and_time.plot.bar(x='name', y=['prep_time','cook_time'], rot=30)

Running either of these two snippets will yield:
[Figure: multiple bar plots pandas]

That's interesting. It seems that the food that's faster to cook takes more prep time and vice versa. However, this comes from a fairly limited subset of data, and this assumption might be wrong for other subsets.

Plotting Stacked Bar Graphs with Pandas

Let's see which dish takes the longest time to make overall. Since we want to factor in both the prep time and cook time, we'll stack them on top of each other.

To do that, we'll set the stacked parameter to True:

name_and_time.plot.bar(x='name', stacked=True)

[Figure: stacked bar graphs with pandas]

Now, we can easily see which dishes take the longest to prepare, factoring in both the prep time and cooking time.
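The height of each stacked bar is simply the sum of the two columns, so the same answer can be read off numerically. A minimal sketch with hypothetical dishes and times, not taken from the real dataset:

```python
import pandas as pd

# Hypothetical values, not taken from the real dataset
name_and_time = pd.DataFrame({
    'name': ['Dish A', 'Dish B', 'Dish C'],
    'prep_time': [10, 45, 15],
    'cook_time': [50, 25, 30],
})

# Total time per dish is what the height of each stacked bar represents
name_and_time['total_time'] = name_and_time['prep_time'] + name_and_time['cook_time']

# Sort by total time and take the first row to find the longest dish
longest = name_and_time.sort_values('total_time', ascending=False).iloc[0]
print(longest['name'], longest['total_time'])  # Dish B 70
```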

Customizing Bar Plots in Pandas

If we want to make the plots look a bit nicer, we can pass some additional arguments to the bar() method, such as color, title, grid, figsize, and legend.

If we want a horizontal bar chart, we can use the barh() method which takes the same arguments.

For example, let's plot a horizontal orange and green Bar Plot, with the title "Dishes", with a grid, of size 5 by 6 inches, and a legend:

import pandas as pd
import matplotlib.pyplot as plt

menu = pd.read_csv('indian_food.csv')
name_and_time = menu[['name','cook_time','prep_time']].head()

name_and_time.plot.barh(x='name',color =['orange','green'], title = "Dishes", grid = True, figsize=(5,6), legend = True)
plt.show()

[Figure: customizing bar plots in pandas]

Plotting Histograms with Pandas

Histograms are useful for showing data distribution. Looking at one recipe, we have no idea if the cooking time is close to the mean cooking time, or if it takes a really long amount of time. Means can help us with this, to a degree, but can be misleading or prone to huge error bars.

To get an idea of the distribution, which gives us a lot of information on the cooking time, we'll want to plot a histogram.

With Pandas, we can call the hist() method on a DataFrame to generate its histogram:

DataFrame.hist(column=None, by=None, grid=True, xlabelsize=None, xrot=None, ylabelsize=None, yrot=None, ax=None, sharex=False, sharey=False, figsize=None, layout=None, bins=10, backend=None, legend=False, **kwargs)

The bins parameter indicates the number of bins to be used.

A big part of working with any dataset is data cleaning and preprocessing. In our case, some foods don't have proper cook and prep times listed (and have a -1 value listed instead).

Let's filter them out of our menu, before visualizing the histogram. This is the most basic type of data pre-processing. In some cases, you might want to change data types (currency formatted strings into floats, for example) or even construct new data points based on some other variable.
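As a minimal sketch of this kind of filtering, here is a boolean mask applied to a tiny hypothetical frame in which -1 marks a missing time:

```python
import pandas as pd

# Hypothetical rows; -1 marks a missing time, as in the dataset described above
menu = pd.DataFrame({
    'name': ['Dish A', 'Dish B', 'Dish C', 'Dish D'],
    'prep_time': [10, -1, 15, 20],
    'cook_time': [30, 25, -1, 40],
})

# Boolean mask: keep rows where both columns hold real values
valid = menu[(menu['cook_time'] != -1) & (menu['prep_time'] != -1)]

print(len(menu) - len(valid))  # 2 rows dropped
print(valid['name'].tolist())  # ['Dish A', 'Dish D']
```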

Let's filter out invalid values and plot a histogram with 50 bins on the X-axis:

import pandas as pd
import matplotlib.pyplot as plt

menu = pd.read_csv('indian_food.csv')
menu = menu[menu.cook_time != -1] # Filtering
cook_time = menu['cook_time']

cook_time.plot.hist(bins = 50)

plt.legend()
plt.show()

This results in:
[Figure: histogram plot pandas]

On the Y-axis, we can see the frequency of the dishes, while on the X-axis, we can see how long they take to cook.

The higher the bar is, the higher the frequency. According to this histogram, most dishes take between 0 and 80 minutes to cook. Most of them fall into the single very tall bar, though we can't really make out exactly which cook time it corresponds to because our ticks are sparse (one every 100 minutes).
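When the ticks are too sparse to read values off the plot, the bin counts can also be computed directly. The sketch below uses NumPy's histogram function on hypothetical cook times; the values and the choice of 4 bins are assumptions made for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical cook times in minutes
cook_time = pd.Series([10, 20, 20, 25, 30, 30, 30, 45, 60, 120])

# np.histogram returns the count per bin and the bin edges
counts, edges = np.histogram(cook_time, bins=4)

# Index of the fullest bin, and the range of values it covers
fullest = counts.argmax()
print(counts.tolist())  # [7, 2, 0, 1]
print(edges[fullest], edges[fullest + 1])  # 10.0 37.5
```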

For now, let's try changing the number of bins to see how that affects our histogram. After that, we can change the frequency of the ticks.

Emphasizing Data with Bin Sizes

Let's try plotting this histogram with 10 bins instead:

import pandas as pd
import matplotlib.pyplot as plt

menu = pd.read_csv('indian_food.csv')
menu = menu[menu.cook_time != -1] # Filtering
cook_time = menu['cook_time']

cook_time.plot.hist(bins = 10)

plt.legend()
plt.show()

[Figure: changing bins in histogram pandas]

Now, we've got 10 bins in the entire X-axis. Note that only 3 bins have some data frequency while the rest is empty.

Now, let's perhaps increase the number of bins:

import pandas as pd
import matplotlib.pyplot as plt

menu = pd.read_csv('indian_food.csv')
menu = menu[menu.cook_time != -1] # Filtering
cook_time = menu['cook_time']

cook_time.plot.hist(bins = 100)

plt.legend()
plt.show()

[Figure: histogram bins pandas]

Now, the bins are awkwardly placed far apart, and we've again lost some information due to this. You'll always want to experiment with the bin sizes and adjust until the data you want to explore is shown nicely.

With the default settings (the number of bins defaults to 10), the bins would have been too coarse to show much detail in this case.

Change Tick Frequency for Pandas Histogram

Since we're using Matplotlib as the engine to show these plots, we can also use any Matplotlib customization techniques.

Since our X-axis ticks are a bit infrequent, we'll make an array of integers, in 20-step increments, between 0 and cook_time.max(), which returns the largest value in the column.

Also, since we'll have a lot of ticks in our plot, we'll rotate them by 45-degrees to make sure they fit well:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# Clean data and extract what we're looking for
menu = pd.read_csv('indian_food.csv')
menu = menu[menu.cook_time != -1] # Filtering
cook_time = menu['cook_time']

# Construct histogram plot with 50 bins
cook_time.plot.hist(bins=50)

# Modify X-Axis ticks
plt.xticks(np.arange(0, cook_time.max(), 20))
plt.xticks(rotation = 45) 

plt.legend()
plt.show()

This results in:

[Figure: change tick frequency pandas]

Plotting Multiple Histograms

Now let's add the prep time into the mix. To add this histogram, we'll plot it as a separate histogram setting both at 60% opacity.

They will share both the Y-axis and the X-axis, so they'll overlap. Without setting them to be a bit transparent, we might not see the histogram under the second one we plot:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# Filtering and cleaning
menu = pd.read_csv('indian_food.csv')
menu = menu[(menu.cook_time!=-1) & (menu.prep_time!=-1)] 

# Extracting relevant data
cook_time = menu['cook_time']
prep_time = menu['prep_time']

# Alpha indicates the opacity from 0..1
prep_time.plot.hist(alpha = 0.6 , bins = 50) 
cook_time.plot.hist(alpha = 0.6, bins = 50)

plt.xticks(np.arange(0, cook_time.max(), 20))
plt.xticks(rotation = 45) 
plt.legend()
plt.show()

This results in:
[Figure: plotting multiple histograms pandas]

We can conclude that most dishes can be made in under an hour, or in about an hour. However, there are a few that take a couple of days to prepare, with 10 hour prep times and long cook times.

Customizing Histogram Plots

To customize histograms, we can use the same keyword arguments which we used with the bar plot.

For example, let's make a green and red histogram, with a title, a grid, and a legend, at a size of 7x7 inches:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

menu = pd.read_csv('indian_food.csv')
menu = menu[(menu.cook_time!=-1) & (menu.prep_time!=-1)] # Filtering

cook_time = menu['cook_time']
prep_time = menu['prep_time']

prep_time.plot.hist(alpha = 0.6 , color = 'green', title = 'Cooking time', grid = True, bins = 50)
cook_time.plot.hist(alpha = 0.6, color = 'red', figsize = (7,7), grid = True, bins = 50)

plt.xticks(np.arange(0, cook_time.max(), 20))
plt.xticks(rotation = 45) 

plt.legend()
plt.show()

And here's our Christmas-colored histogram:

[Figure: customizing histogram pandas]

Plotting Area Plots with Pandas

Area Plots are handy when looking at the correlation of two parameters. For example, from the histogram plots, it would be valid to lean towards the idea that food that takes longer to prep, takes less time to cook.

To test this, we'll plot this relationship using the area() function:

DataFrame.plot.area(x=None, y=None, **kwargs)

Let's use the mean of cook times, grouped by prep times to simplify this graph:

time = menu.groupby('prep_time').mean(numeric_only=True)  # numeric_only skips text columns such as name

This results in a new DataFrame:

prep_time
5           20.937500
10          40.918367
12          40.000000
15          36.909091
20          36.500000
...
495         40.000000
500        120.000000
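To make the grouping step concrete, here is a small sketch on hypothetical values chosen so that the means are easy to check by hand. It selects the cook_time column before averaging, which is one way to keep text columns out of the calculation:

```python
import pandas as pd

# Hypothetical values chosen so the means are easy to check by hand
menu = pd.DataFrame({
    'prep_time': [5, 5, 10, 10],
    'cook_time': [20, 30, 40, 50],
})

# One mean cook time per distinct prep time
mean_cook = menu.groupby('prep_time')['cook_time'].mean()

for prep, cook in mean_cook.items():
    print(prep, cook)
```

Dishes with a prep time of 5 average (20 + 30) / 2 = 25 minutes of cooking, and those with a prep time of 10 average 45.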

Now, we'll plot an area-plot with the resulting time DataFrame:

import pandas as pd
import matplotlib.pyplot as plt

menu = pd.read_csv('indian_food.csv')
menu = menu[(menu.cook_time!=-1) & (menu.prep_time!=-1)]

# Simplifying the graph
time = menu.groupby('prep_time').mean(numeric_only=True)  # numeric_only skips text columns such as name
time.plot.area()

plt.legend()
plt.show()

[Figure: area plot pandas]

Here, our original notion of the correlation between prep time and cook time has been shattered. Even though other graph types led us toward a different conclusion, this plot suggests a rough positive correlation: with higher prep times, we also tend to have higher cook times. That is the opposite of what we hypothesized.

This is a great reason not to stick only to one graph-type, but rather, explore your dataset with multiple approaches.
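One more approach is a numeric check alongside the plots. The sketch below computes the Pearson correlation between two columns with the Series.corr() method; the prep and cook times are hypothetical, so the result says nothing about the real dataset:

```python
import pandas as pd

# Hypothetical prep and cook times in minutes
menu = pd.DataFrame({
    'prep_time': [5, 10, 15, 20, 30],
    'cook_time': [20, 40, 35, 60, 55],
})

# Pearson correlation: close to +1 when both rise together,
# close to -1 when one rises as the other falls
r = menu['prep_time'].corr(menu['cook_time'])
print(round(r, 2))
```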

Plotting Stacked Area Plots

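Area Plots have a very similar set of keyword arguments as bar plots and histograms. One notable addition is the stacked parameter, which controls whether the areas are stacked on top of each other; area plots are stacked by default.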

Let's plot out the cooking and prep times so that they are stacked, pink and purple, with a grid, 8x9 inches in size, with a legend:

import pandas as pd
import matplotlib.pyplot as plt

menu = pd.read_csv('indian_food.csv')
menu = menu[(menu.cook_time!=-1) & (menu.prep_time!=-1)]

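menu[['cook_time', 'prep_time']].plot.area(stacked=True, color=['pink', 'purple'], grid=True, figsize=(8, 9), legend=True)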

plt.legend()
plt.show()

[Figure: stacked area plot pandas]

Plotting Pie Charts with Pandas

Pie charts are useful when we have a small number of categorical values which we need to compare. They are very clear and to the point, but be careful: the readability of pie charts goes way down with even a slight increase in the number of categorical values.

To plot pie charts, we'll use the pie() function which has the following syntax:

DataFrame.plot.pie(**kwargs)

Plotting out the flavor profiles:

import pandas as pd
import matplotlib.pyplot as plt

menu = pd.read_csv('indian_food.csv')

flavors = menu[menu.flavor_profile != '-1']
flavors['flavor_profile'].value_counts().plot.pie()

plt.legend()
plt.show()

This results in:

[Figure: pie chart pandas]

By far, most dishes are spicy and sweet.
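The slice sizes of a pie chart are just relative frequencies, which value_counts() can return directly when normalize=True is passed. A minimal sketch on hypothetical flavor labels:

```python
import pandas as pd

# Hypothetical flavor labels; '-1' marks a missing value
flavors = pd.Series([
    'spicy', 'sweet', 'spicy', 'bitter', '-1',
    'sweet', 'spicy', 'sweet', 'spicy',
])

# Drop the missing marker, then compute each category's share of the total
shares = flavors[flavors != '-1'].value_counts(normalize=True)
print(shares)
```

With 4 spicy, 3 sweet, and 1 bitter entry among the 8 valid labels, the shares come out to 0.5, 0.375, and 0.125.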

Customizing Pie Charts

To make our pie chart more appealing, we can tweak it with the same keyword arguments we used in all the previous chart types, with some novelties such as shadow and startangle.

To show how this works, let's plot the states from which the dishes originate. We'll use head() to take only the first 10, so as not to have too many slices.

Let's make the pie pink, with the title "States", give it a shadow, and make it start at an angle of 15 degrees:

import pandas as pd
import matplotlib.pyplot as plt

menu = pd.read_csv('indian_food.csv')
states = (menu[menu.state != '-1'])['state'].value_counts().head(10)

# Colors to cycle through
colors = ['lightpink','pink','fuchsia','mistyrose','hotpink','deeppink','magenta']

states.plot.pie(colors = colors, shadow = True, startangle = 15, title = "States")

plt.show()

[Figure: customize pie chart pandas]

Plotting Density Plots with Pandas

If you have any experience with statistics, you've probably seen a Density Plot. Density Plots are a visual representation of probability density across a range of values.

A Histogram is a Density Plot, which bins together data points into categories. The second most popular density plot is the KDE (Kernel Density Estimation) plot - in simple terms, it's like a very smooth histogram with an infinite number of bins.

To plot one, we'll use the kde() function:

DataFrame.plot.kde(bw_method=None, ind=None, **kwargs)

For example, we'll plot the cooking time:

import pandas as pd
import matplotlib.pyplot as plt
import scipy

menu = pd.read_csv('indian_food.csv')

time = (menu[menu.cook_time != -1])['cook_time']
# Plot the KDE of the cook times themselves, not of their value counts
time.plot.kde()
plt.show()

This distribution looks like this:
[Figure: density plots with pandas]

In the Histogram section, we've struggled to capture all the relevant information and data using bins, because every time we generalize and bin data together - we lose some accuracy.

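With KDE plots, we avoid choosing bins altogether, since the estimate behaves, effectively, like a histogram with an infinite number of bins. The curve is still a smoothed estimate, though, and its shape depends on the bandwidth (the bw_method parameter).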

Plotting a Scatter Matrix (Pair Plot) in Pandas

A slightly more complex way to interpret data is using Scatter Matrices, which take into account the relationship between every pair of parameters. If you've worked with other libraries, this type of plot might be familiar to you as a pair plot.

To plot Scatter Matrix, we'll need to import the scatter_matrix() function from the pandas.plotting module.

The syntax for the scatter_matrix() function is:

pandas.plotting.scatter_matrix(frame, alpha=0.5, figsize=None, ax=None, grid=False, diagonal='hist', marker='.', density_kwds=None, hist_kwds=None, range_padding=0.05, **kwargs)

Since we're plotting pair-wise relationships for multiple variables on a grid, the cells along the diagonal would be redundant, since each would compare a variable with itself. To avoid this dead space, the diagonal cells are replaced with a univariate distribution plot for that variable.

The diagonal parameter can be either 'kde' or 'hist' for either Kernel Density Estimation or Histogram plots.

Let's make a Scatter Matrix plot:

import pandas as pd 
import matplotlib.pyplot as plt
import scipy
from pandas.plotting import scatter_matrix

menu = pd.read_csv('indian_food.csv')

scatter_matrix(menu,diagonal='kde')

plt.show()

The plot should look like this:
[Figure: scatter matrix pandas]

Plotting a Bootstrap Plot in Pandas

Pandas also offers a Bootstrap Plot for your plotting needs. A Bootstrap Plot estimates the uncertainty of a statistic through random resampling: it repeatedly draws random subsets of the data with a given size, computes the mean, median, and mid-range of each subset, and then plots the distribution of those statistics.

Using it is as simple as importing the bootstrap_plot() function from the pandas.plotting module. The bootstrap_plot() syntax is:

pandas.plotting.bootstrap_plot(series, fig=None, size=50, samples=500, **kwds)

And finally, let's plot a Bootstrap Plot:

import pandas as pd
import matplotlib.pyplot as plt
import scipy
from pandas.plotting import bootstrap_plot

menu = pd.read_csv('indian_food.csv')

bootstrap_plot(menu['cook_time'])
plt.show()

The bootstrap plot will look something like this:
bootstrap plots in pandas
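To make the resampling idea more concrete, here is a rough sketch in plain Python of what such a plot summarizes. It is not pandas' actual implementation: the function name, the made-up cook times and the choice to sample without replacement are all assumptions for illustration. The size and samples parameters mirror the ones in the bootstrap_plot() signature.

```python
import random
import statistics


def bootstrap_statistics(values, size=50, samples=500, seed=None):
    """Collect mean, median and midrange over repeated random subsamples."""
    rng = random.Random(seed)
    means, medians, midranges = [], [], []
    for _ in range(samples):
        # Draw one random subsample of the requested size
        subset = rng.sample(values, size)
        means.append(statistics.mean(subset))
        medians.append(statistics.median(subset))
        midranges.append((min(subset) + max(subset)) / 2)
    return means, medians, midranges


# Hypothetical cook times in minutes, standing in for menu['cook_time']
data_rng = random.Random(1)
cook_times = [data_rng.randint(5, 120) for _ in range(200)]

means, medians, midranges = bootstrap_statistics(cook_times, seed=42)
print(len(means), min(means), max(means))
```

Plotting a histogram of each returned list would give roughly the kind of picture the Bootstrap Plot shows: how much each statistic varies from one subsample to the next.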

Conclusion

In this guide, we've gone over an introduction to Data Visualization in Python with Pandas. We've covered basic plots like Pie Charts and Bar Plots, and progressed to Density Plots such as Histograms and KDE Plots.

Finally, we've covered Scatter Matrices and Bootstrap Plots.

If you're interested in Data Visualization and don't know where to start, make sure to check out our book on Data Visualization in Python.

Data Visualization in Python, a book for beginner to intermediate Python developers, will guide you through simple data manipulation with Pandas, cover core plotting libraries like Matplotlib and Seaborn, and show you how to take advantage of declarative and experimental libraries like Altair.


January 14, 2021 01:30 PM UTC


Ben Cook

Dropping columns and rows in Pandas

df.drop() The easiest way to drop rows and columns from a Pandas DataFrame is with the .drop() method, which accepts one or more labels passed in as index=<rows to drop> and/or columns=<cols to drop>: import pandas as pd df = pd.read_csv("https://jbencook.com/data/dummy-sales.csv").head() df # Expected result # date region revenue #...
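A minimal sketch of the index= and columns= arguments described above. It uses a small in-memory DataFrame rather than the CSV from the post; the column names follow the excerpt, but the values are invented for illustration.

```python
import pandas as pd

# Hypothetical data; only the column names come from the excerpt
df = pd.DataFrame({
    "date": ["2021-01-01", "2021-01-02", "2021-01-03", "2021-01-04"],
    "region": ["East", "West", "North", "South"],
    "revenue": [100, 250, 175, 300],
})

# Drop the rows labeled 0 and 2 and the "region" column in one call
trimmed = df.drop(index=[0, 2], columns=["region"])
print(trimmed)
```

Note that .drop() returns a new DataFrame by default, so df itself is left unchanged here.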

January 14, 2021 08:00 AM UTC


Red Hat Developers

Knowledge meets machine learning for smarter decisions, Part 1

Drools is a popular open source project known for its powerful rules engine. Few users realize that it can also be a gateway to the amazing possibilities of artificial intelligence. This two-part article introduces you to using Red Hat Decision Manager and its Drools-based rules engine to combine machine learning predictions with deterministic reasoning. In Part 1, we’ll prepare our machine learning logic. In Part 2, you’ll learn how to use the machine learning model from a knowledge service.

Note: Examples in this article are based on Red Hat Decision Manager, but all of the technologies used are open source.

Machine learning meets knowledge engineering

Few Red Hat Decision Manager users know about its roots in artificial intelligence (AI), specifically the AI branch of knowledge engineering (also known as knowledge representation and reasoning). This branch aims to solve the problem of how to organize human knowledge so that a computer can process it. Knowledge engineering uses business rules, that is, a set of knowledge metaphors that subject matter experts can easily understand and use.

The Decision Model and Notation (DMN) standard recently released a new model and notation for subject matter experts. After years of using different methodologies and tools, we finally have a common language for sharing knowledge representation. A hidden treasure of the DMN is that it makes dealing with machine learning algorithms easier. The connecting link is another well-known standard in data science: The Predictive Model Markup Language, or PMML.

Using these tools to connect knowledge engineering and machine learning empowers both domains, so that the whole is greater than the sum of its parts. It opens up a wide range of use cases where combining deterministic knowledge and data science predictions leads to smarter decisions.

A use case for cooperation

The idea of algorithms that can learn from large sets of data and understand patterns that we humans cannot see is fascinating. However, overconfidence in machine learning technology leads us to underestimate the value of human knowledge.

Let’s take an example from our daily experience: We are all used to algorithms that use our internet browsing history to show us ads for products we’ve already purchased. This happens because it’s quite difficult to train a machine learning algorithm to exclude ads for previously purchased products.

What is a difficult problem for machine learning is very easy for knowledge engineering to solve. On the flip side, encoding all possible relationships between searched words and suggested products is extremely tedious. In this realm, machine learning complements knowledge engineering.

Artificial intelligence has many branches—machine learning, knowledge engineering, search optimization, natural language processing, and more. Why not use more than one technique to achieve more intelligent behavior?

Artificial intelligence, machine learning, and data science

Artificial intelligence, machine learning, and data science are often used interchangeably. Actually, they are different but overlapping domains. As I already noted, artificial intelligence has a broader scope than machine learning. Machine learning is just one facet of artificial intelligence. Similarly, some argue that data science is a facet of artificial intelligence. Others say the opposite, that data science includes AI.

In the field, data scientists and AI experts offer different kinds of expertise with some overlap. Data science uses many machine learning algorithms, but not all of them. The Venn diagram in Figure 1 shows the spaces where artificial intelligence, machine learning, and data science overlap.

Artificial intelligence and data science overlap. Machine learning is a subset of artificial intelligence that overlaps with data science.

Figure 1: The overlaps between artificial intelligence, machine learning, and data science.

Note: See Data Science vs. Machine Learning and Artificial Intelligence for more about each of these technology domains and the spaces where they meet.

Craft your own machine learning model

Data scientists are in charge of defining machine learning models after careful preparation. This section will look at some of the techniques data scientists use to select and tune a machine learning algorithm. The goal is to understand the workflow and learn how to craft a model that can cope with prediction problems.

Note: To learn more about data science methods and processes, see Wikipedia’s Cross-industry standard process for data mining (CRISP-DM) page.

Prepare and train a machine learning algorithm

The first step for preparing and training a machine learning algorithm is to collect, analyze, and clean the data that we will use. Data preparation is an important phase that significantly impacts the quality of the final outcome. Data scientists use mathematics and statistics for this phase.

For simplicity, let’s say we have a reliable data set based on a manager’s historical decisions in an order-fulfillment process. The manager receives the following information: Product type (examples are phone, printer, and so on), price, urgency, and category. There are two categories: Basic, for when the product is required employee equipment, and optional, for when the product is not necessary for the role.

The two decision outcomes are approved or denied. Automating this decision will free the manager from a repetitive task and speed up the overall order-fulfillment process.

As a first attempt, we could take the data as-is to train the model. Instead, let’s introduce a bit of contextual knowledge. In our fictitious organization, the purchasing department has a price-reference table where target prices are defined for all product types. We can use this information to improve the quality of the data. Instead of training our algorithm to focus on the product type, we’ll train it to consider the target price. This way, we won’t need to re-train the model when the reference price list changes.

Choosing a machine learning algorithm

We now have a typical classification problem: Given the incoming data, the algorithm must find a class for those data. In other words, it has to label each data item approved or denied. Because we have the manager’s collected responses, we can use a supervised learning method. We only need to choose the correct algorithm. The major machine learning algorithms are:

Note: For more about each of these algorithms, see 9 Key Machine Learning Algorithms Explained in Plain English.

Except for linear regression, we could apply any of these algorithms to our classification problem. For this use case, we will use a Logistic Regression model. Fortunately, we don’t need to understand the algorithm’s implementation details. We can rely on existing tools for implementation.

Python and scikit-learn

We will use Python and the scikit-learn library to train our Logistic Regression model. We choose Python because it is concise and easy to understand and learn. It is also the de facto standard for data scientists. Many libraries expressly designed for data science are written in Python.

The example project

Before we go further, download the project source code here. Open the python folder to find the machine training code (ml-training.py) and the CSV file we’ll use to train the algorithm.

Even without experience with Python and machine learning, the code is easy to understand and adapt. The program’s logical steps are:

  1. Initialize the algorithm to train.
  2. Read the available data from a CSV file.
  3. Randomly split the training and test data sets (40% is used for testing).
  4. Train the model.
  5. Test the model against the testing data set.
  6. Print the test results.
  7. Save the trained model in PMML.
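The steps above can be sketched roughly as follows. This is not the project’s ml-training.py: the data is synthetic instead of being read from a CSV file, the two features are hypothetical, and step 7 (saving to PMML) is left out because it needs an additional converter library.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Step 2 stand-in: hypothetical data instead of the project's CSV file
rng = np.random.default_rng(seed=0)
X = rng.normal(size=(1000, 2))
y = np.where(X[:, 0] + 0.5 * X[:, 1] > 0, "true", "false")

# 1. Initialize the algorithm to train.
model = LogisticRegression()

# 3. Randomly split the data, holding out 40% for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=0
)

# 4. Train the model.
model.fit(X_train, y_train)

# 5. Test the model against the testing data set.
predicted = model.predict(X_test)
tn, fp, fn, tp = confusion_matrix(
    y_test, predicted, labels=["false", "true"]
).ravel()

# 6. Print the test results.
accuracy = (tp + tn) / len(y_test)
print(f"Correct: {tp + tn}")
print(f"Incorrect: {fp + fn}")
print(f"Accuracy: {accuracy:.2%}")
```

Swapping LogisticRegression() for another scikit-learn classifier on the line in step 1 is enough to benchmark a different algorithm with the rest of the code unchanged.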

A nice feature of the scikit-learn library is that its machine learning algorithms expose nearly all the same APIs. You can switch between the available algorithms by changing one line of code. This means you can easily benchmark different algorithms for accuracy and decide which one best fits your use case. This type of benchmarking is common because it’s often hard to know in advance which algorithm will perform better for a use case.

Run the program

If you run the Python program, you should see results similar to the following, but not exactly the same. The training and test data are randomly selected, so the results will differ each time. The point is to verify that the algorithm works consistently across multiple executions.

Results for model LogisticRegression

Correct: 1522

Incorrect: 78

Accuracy: 95.12%

True Positive Rate: 93.35%

True Negative Rate: 97.10%

The results are quite accurate, at 95%. More importantly, the True Negative Rate (measuring specificity) is very high, at 97.1%. In general, there is a tradeoff between the True Negative Rate and True Positive Rate, which measures sensitivity. Intuitively, you can liken the prediction sensitivity to a car alarm: If we increase an alarm’s sensitivity, it is more likely to go off by mistake and increase the number of false positives. The increase in false positives lowers specificity.
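To make the three figures concrete, the sketch below computes accuracy, True Positive Rate, and True Negative Rate from a handful of made-up decisions. The function name and the sample labels are assumptions for illustration; they are not taken from the example project.

```python
def rates(actual, predicted, positive="approved"):
    """Return accuracy, sensitivity (TPR) and specificity (TNR)."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "true_positive_rate": tp / (tp + fn),  # sensitivity
        "true_negative_rate": tn / (tn + fp),  # specificity
    }


# Hypothetical manager decisions and model predictions
actual = ["approved", "approved", "approved", "denied",
          "denied", "denied", "denied", "denied"]
predicted = ["approved", "approved", "denied", "denied",
             "denied", "denied", "denied", "approved"]

result = rates(actual, predicted)
print(result)
```

In this toy data there is one wrongly approved order (a false positive) and one wrongly denied order (a false negative), which is why both rates fall below 100%.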

Tune the algorithm

In this particular use case, of approving or rejecting a product order, we would rather reject an order when in doubt. Falling back to manual approval is better than having too many false positives, which would lead to wrongly approved orders. To improve our results, we can adjust the logistic regression to reduce the prediction sensitivity.

Predictive models like this one are also known as classification algorithms because they place each input in a specific class. In our case, we have two classes: “true” (approve the order) and “false” (deny it).

To reduce the likelihood of a false positive, we can tune the “true” class weight (note that 1 is the default):

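from sklearn.linear_model import LogisticRegression

# Down-weight the "true" (approved) class to reduce false positives
model = LogisticRegression(class_weight={
    "true": 0.6,
    "false": 1,
})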
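As a rough illustration of the effect, the sketch below trains two models on the same synthetic, noisy data and counts how many items each one labels “true”. The data and variable names are assumptions for illustration only; the weights are the ones shown above.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical noisy data so that some items sit close to the boundary
rng = np.random.default_rng(seed=1)
X = rng.normal(size=(2000, 2))
noise = rng.normal(scale=0.75, size=2000)
y = np.where(X[:, 0] + X[:, 1] + noise > 0, "true", "false")

# Default weights (1 for both classes)
default_model = LogisticRegression().fit(X, y)

# Lower weight for the "true" class, as in the article
cautious_model = LogisticRegression(
    class_weight={"true": 0.6, "false": 1}
).fit(X, y)

default_approvals = int((default_model.predict(X) == "true").sum())
cautious_approvals = int((cautious_model.predict(X) == "true").sum())
print(default_approvals, cautious_approvals)
```

The down-weighted model should label fewer borderline items as “true”, trading some sensitivity for fewer wrongly approved orders.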

Store the model in a PMML file

Python is handy for analysis, but we might prefer another language or product for running a machine learning model in production. Reasons include better performance and integration with the enterprise ecosystem.

What we need is a way to exchange machine learning model definitions between different software. The PMML format is commonly used for this purpose. The DMN specification includes a direct reference to a PMML model, which makes this option straightforward.

You should make a couple of changes to the PMML file before importing it to the DMN editor. First, you might need to change the Python PMML version tag to 4.3, which is the version supported by Decision Manager 7.7 (the current version as of this writing):

<PMML version="4.3" xmlns="http://www.dmg.org/PMML-4_3" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

Next, you want to be able to easily identify the predictive model from the DMN modeler. Use the modelName attribute to name your model:

<RegressionModel modelName="approvalRegression" functionName="classification" normalizationMethod="logit">

The diagram in Figure 2 shows where we are currently with this project.

The scikit-learn library requires a training set and an algorithm configuration; the outcome is the PMML model.

Figure 2: A usage block diagram for scikit-learn.

Conclusion

So far, you’ve seen how to create a machine learning model and store it in a PMML file. In the second half of this article, you will learn more about using PMML to store and transfer machine learning models. You’ll also discover how to consume a predictive model from a deterministic decision using DMN. Finally, we’ll review the advantages of creating more cooperation between the deterministic world and the predictive one.


The post Knowledge meets machine learning for smarter decisions, Part 1 appeared first on Red Hat Developer.

January 14, 2021 02:20 AM UTC


Matt Layman

Squashing Bugs - Building SaaS #87

In this episode, I fixed some critical issues that my customer discovered. My customer is putting the app through its real paces for a school year and since this is the first run, there were bound to be some bugs. We began with an explanation of the issues that my customer encountered. The problems related to scheduling. First, the daily page skipped a task and showed the task that was meant for two days in the future.

January 14, 2021 12:00 AM UTC

January 13, 2021


Anarcat

New phone: Pixel 4a

I'm sorry to announce that I gave up on the Fairphone series and switched to a Google Phone (Pixel 4a) running CalyxOS.

Problems in fairy land

My fairphone2, even if it is less than two years old, is having major problems:

Some of those problems are known: the Fairphone 2 is old now. It was probably old even when I got it. But I can't help but feel a little sad to let it go: the entire point of that device was to make it easy to fix. But alas, because it's sold only in Europe, local stores don't carry replacement parts. To be fair, Fairphone did offer to fix the device, but with a two-week turnaround, I had to get another phone anyways.

I did actually try to buy a fairphone3, from Clove. But they did some crazy validation routine. By email, they asked me to provide a photo copy of a driver's license and the credit card, arguing they need to do this to combat fraud. I found that totally unacceptable and asked them to cancel my order. And because I'm not sure the FP3 will fix the coverage issues, I decided to just give up on Fairphone until they officially ship to the Americas.

Do no evil, do not pass go, do not collect 200$

So I got a Google phone, specifically a Pixel 4a. It's a nice device, all small and shiny, but it's "plasticky" - I would have preferred metal, but it seems you need to pay much, much more to get that (in the Pixel 5).

In any case, it's certainly a better form factor than the Fairphone 2: even though the screen is bigger, the device itself is actually smaller and thinner, which feels great. The OLED screen is beautiful, awesome contrast and everything, and preliminary tests show that the camera is much better than the one on the Fairphone 2. (To be fair, again, that is another thing the FP3 improved significantly. And that is with the stock Camera app from CalyxOS/AOSP, so not as good as the Google Camera app, which does AI stuff.)

CalyxOS: success

The Pixel 4a is not supported by LineageOS: it seems every time I pick a device in that list, I manage to miss the right device by one (I bought a Samsung S9 before, which is also unsupported, even though the S8 is). But thankfully, it is supported by CalyxOS.

That install was a breeze: I was hesitant to play again with installing a custom Android firmware on a phone after fighting with this quite a bit in the past (e.g. htc-one-s, lg-g3-d852). But it turns out their install instructions, mostly using an AOSP alliance device-flasher, work absolutely great. It assumes you know about the commandline, and it does basically require you to curl | sudo (because you need to download their binary and run it as root), but it Just. Works. It reminded me of how great it was to get the Fairphone with TWRP preinstalled...

Oh, and kudos to the people in #calyxos on Freenode: awesome tech support, super nice folks. An amazing improvement over the ambiance in #lineageos! :)

Migrating data

Unfortunately, migrating the data was the usual pain in the back. This should improve the next time I do this: CalyxOS ships with seedvault, a secure backup system for Android 10 (or 9?) and later which backs up everything (including settings!) with encryption. Apparently it works great, and CalyxOS is also working on a migration system to switch phones.

But, obviously, I couldn't use that on the Fairphone 2 running Android 7... So I had to, again, improvise. The first step was to install Syncthing, to have an easy way to copy data around. That's easily done through F-Droid, already bundled with CalyxOS (including the privileged extension!). Pair the devices and boom, a magic portal to copy stuff over.

The other early step I took was to copy apps over using the F-Droid "find nearby" functionality. It's a bit quirky, but really helps in copying a bunch of APKs over.

Then I set up a temporary keepassxc password vault on the Syncthing share so that I could easily copy-paste passwords into apps. I used to do this in a text file in Syncthing, but copy-pasting in the text file is much harder than in KeePassDX. (I just picked one, maybe KeePassDroid is better? I don't know.) Do keep a copy of the URL of the service to reduce typing as well.

Then the following apps required special tweaks:

I tried to sync contacts with DAVx5 but that didn't work so well: the account was set up correctly, but contacts didn't show up. There's probably just this one thing I need to do to fix this, but since I don't really need sync'd contacts, it was easier to export a VCF file to Syncthing and import again.

Known problems

One problem with CalyxOS I found is that the fragile little microg tweaks didn't seem to work well enough for Signal. That was unexpected so they encouraged me to file that as a bug.

The other "issue" is that the bootloader is locked, which makes it impossible to have "root" on the device. That's rather unfortunate: I often need root to debug things on Android. In particular, it made it difficult to restore data from OSMand (see below). But I guess that most things just work out of the box now, so I don't really need it and appreciate the extra security. Locking the bootloader means full cryptographic verification of the phone, so that's a good feature to have!

OSMand still doesn't have a good import/export story. I ended up sharing the Android/data/net.osmand.plus/files directory and importing waypoints, favorites and tracks by hand. Even though maps are actually in there, it's not possible for Syncthing to write directly to the same directory on the new phone, "thanks" to the new permission system in Android which forbids this kind of inter-app messing around.

Tracks are particularly a problem: my older OSMand setup had all those folders neatly sorting those tracks by month. This makes it really annoying to track every file manually and copy it over. I have mostly given up on that for now, unfortunately. And I'll still need to reconfigure profiles and maps and everything by hand. Sigh. I guess that's a good clearinghouse for my old tracks I never use...

Update: turns out setting storage to "shared" fixed the issue, see comments below!

Conclusion

Overall, CalyxOS seems like a good Android firmware. The install is smooth and the resulting install seems solid. The above problems are mostly annoyances and I'm very happy with the experience so far, although I've only been using it for a few hours so this is very preliminary.

January 13, 2021 08:50 PM UTC


Python Pool

Sep in Python | Examples, and Explanation

Hello coders!! In this article, we will cover sep in Python. At times we want to print multiple values in a particular format in a Python program. The sep argument of print() comes into play in such scenarios. Without wasting any time, let’s dive straight into the topic.

The sep parameter in Python:

sep is a parameter of Python’s print() function that controls how the printed values are separated in the output. Its default value is a single space. The separator is inserted between each pair of values being printed. Let us see some examples to make the concept clear.

Syntax:

print(argument1, argument2, ..., sep = value)

Example 1: Python sep = ''

print("Python", "Pool", sep = '')

Output:

PythonPool

As we can see, when the value of sep is an empty string, there is no gap between the two values.

Example 2: Python sep = ‘\n’

color=['red','blue','orange','pink']
print(*color, sep = ", ")  
print()
print(*color, sep = "\n") 

Output:

red, blue, orange, pink

red
blue
orange
pink

In this example, when we use the sep value ', ' the list’s values are printed in a comma-separated fashion. When the value of sep is '\n', i.e., a newline, each value of the list is printed on its own line.

Example 3: Joining a list with a separator in Python

colors=['red','blue','orange','pink']
s="_".join(colors)
print(s)

Output:

red_blue_orange_pink

In this particular example, we first declared a list of colors containing four values: red, blue, orange, and pink. Note that this example does not use the sep parameter of print(); instead, we call the string method join() on the separator '_'. When we join the list using that separator, the output shows the list’s values with the separator between them.

Example 4: Parsing a string in python with sep

txt = "Python, a programming language, is easy to understand"
print(txt.split(", "))

Output:

['Python', 'a programming language', 'is easy to understand']

As we can see here, the separator passed to split() is a comma followed by a space. As a result, the string is split wherever that sequence occurs in the sentence.
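Examples 3 and 4 are two halves of the same idea: join() builds a string from a list using a separator, and split() recovers the list from the string. A short sketch, reusing the colors list from Example 3:

```python
colors = ["red", "blue", "orange", "pink"]

# Build one string with ", " between the items
joined = ", ".join(colors)
print(joined)

# Split on the same separator to get the original list back
parts = joined.split(", ")
print(parts)
print(parts == colors)
```

The round trip only works cleanly when the separator does not also appear inside one of the items.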

Difference between sep and end:

end: printed once, after all the values in the given print statement have been printed.

Example:
st1 = 'python'
st2 = 'pool'
print(st1, st2, end='%')

Output:
python pool%

sep: separates the printed values by inserting the given value between them.

Example:
st1 = 'python'
st2 = 'pool'
print(st1, st2, sep='%')

Output:
python%pool
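The two parameters can also be combined in a single call. The sketch below writes into an in-memory buffer (an assumption made here so the exact characters can be inspected) instead of the screen:

```python
import io

buffer = io.StringIO()

# sep goes between the values, end is written once after the last value
print("python", "pool", sep="-", end="!\n", file=buffer)

# With sep='' and end='' nothing is added at all, not even a newline
print("a", "b", "c", sep="", end="", file=buffer)

output = buffer.getvalue()
print(repr(output))
```

Leaving out the file argument sends the same text to the screen as usual.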

Conclusion:

With this, we come to the end of this article. The concept of sep for formatting print output is simple, and it is widely used in everyday code.

However, if you have any doubts or questions, do let me know in the comment section below. I will try to help you as soon as possible.

Happy Pythoning!

The post Sep in Python | Examples, and Explanation appeared first on Python Pool.

January 13, 2021 03:07 PM UTC

What is cv2 imshow()? Explained with examples

Hello geeks and welcome! In this article, we will cover cv2 imshow(). Along with that, for an overall better understanding, we will also look at its syntax and parameters. Then we will apply the theory through a couple of examples. cv2 is the Python module of OpenCV, a cross-platform library for computer vision. We will look at how it is used later in this article. But first, let us try to get an overview of the function through its definition.

The function cv2 imshow() is used to display an image in a window. The window automatically adjusts to the size of the image. In this section, we will look at the syntax associated with this function.

SYNTAX

cv2.imshow(window_name, image)

This is the general syntax for our function. In the next section we will look at the various parameters associated with it.

PARAMETER

1. window_name:

This parameter represents the name of the window in which the image needs to be displayed.

2. image:

This parameter represents the image that we want to be displayed.

HOW TO DOWNLOAD CV2 ON YOUR MACHINE?

So far, we have covered the syntax, parameters, and basic definition of cv2 imshow(). But to actually start working with it, we first need to have cv2 installed on our machine. I assume you all have the latest Python version installed. Open the command prompt or terminal, depending on your operating system, and type “python”; it will show you the Python version you are using. Next, run the command “pip install opencv-python” from the command prompt or terminal (not from inside the Python interpreter) to download the library. Along with that, it will also download the NumPy library. Once done with that, you can check the version and whether the installation was successful, as shown below.

cv2 imshow()

EXAMPLES

1.Basic example for cv2.imshow()

import cv2

img = cv2.imread('10.jpeg',0)
print(img)

Output:

[[ 52  51  38 ...  33  86  34]
 [ 58  61  49 ...  67  36  19]
 [ 42  53  48 ...  20  17  45]
 ...
 [ 37  35  33 ... 127 120 113]
 [ 41  39  37 ... 118 117 114]
 [ 42  41  41 ... 113 112 111]]

To run the above example I used the PyCharm IDE; you can use the same if you wish. If you run the same code on your system as-is, it won’t work, because the image is stored locally on my machine; point imread() at an image of your own instead. Coming to the example, we first imported the cv2 library and then read the image. The print statement prints a matrix instead of our image. This happens because cv2 imread() reads the image into an array of pixel values, and printing that array shows the numbers rather than the picture.

2. Printing image with help of cv2 imshow()

import cv2

img = cv2.imread('10.jpeg',1)
cv2.imshow("sample",img)
cv2.waitKey(5000)

Output:

cv2 imshow

In this example, we are finally able to display our image. We followed the same steps as in the 1st example, but cv2 imshow() makes all the difference. We have also used cv2 waitKey(), which stops the window from disappearing immediately; here it waits up to 5000 milliseconds for a key press. One more thing to note is that in imread() I have specified “1”, which loads a color image. If you use “0” instead, you get a grayscale image.

Try it out yourself and tell me how things went. It is a really fun thing to do.

Basic FAQs regarding cv2 imshow()

Q. What is the size of cv2 imshow?

Ans. cv2 imshow has no specific size. It adapts to the size of the image that we want to display.

Q. Is it possible to use cv2 imshow without waitKey?

Ans. Yes, it is possible to call imshow() without waitKey(). But in this case, the image will pop up and vanish almost instantly (or the window may never be drawn at all), and you will not be able to notice anything. That’s why it is advised to use waitKey() when dealing with imshow().

Q. cv2 imshow not working?

Ans. One of the most common reasons for cv2 imshow not working is a missing waitKey() call. Your program may be correct, but since nothing appears on the screen, it looks as if the code is at fault. That’s why it is advised to use waitKey().

You might be interested in reading >> How to Display Images Using Matplotlib Imshow Function

CONCLUSION

In this article, we covered the cv2 imshow(). Besides that, we have also looked at its syntax and arguments. For better understanding, we looked at a couple of examples. We varied the syntax and looked at the output for each case. In the end, we can conclude that it helps us in displaying an image in the window.

I hope this article was able to clear all doubts. But in case you have any unsolved queries feel free to write them below in the comment section. Done reading this, why not read about the repeat function next.

The post What is cv2 imshow()? Explained with examples appeared first on Python Pool.

January 13, 2021 03:07 PM UTC

Matplotlib pcolormesh in Python with Examples

Hello coders!! In this article, we will be learning about Matplotlib pcolormesh in Python. Matplotlib is a plotting library for Python that builds on NumPy arrays. Pyplot is a module in Matplotlib that provides a state-based interface with MATLAB-like features. Let us discuss the topic in detail.

Syntax:

matplotlib.pyplot.pcolormesh(*args, alpha=None, norm=None, cmap=None, vmin=None, vmax=None, shading=None, antialiased=False, data=None, **kwargs)

Call Signature:

pcolormesh([X, Y,] C, **kwargs)

Parameters:

Return value of matplotlib pcolormesh:

mesh : matplotlib.collections.QuadMesh

Example of Matplotlib Pcolormesh in Python:

import matplotlib.pyplot as plt 
import numpy as np 
from matplotlib.colors import LogNorm 
Z = np.random.rand(5, 5) 
plt.pcolormesh(Z) 

plt.title("Matplotlib pcolormesh") 
plt.show()

Output:

[Figure: Matplotlib pcolormesh example output]

Explanation:

In this code, we first imported the pyplot module of the matplotlib library to use its MATLAB-like plotting framework. Next, we imported the NumPy module for array functions. Lastly, we imported LogNorm from matplotlib.colors, which provides logarithmic colormap normalization, although this example does not actually use it.

We used the random.rand() function to create an array of random values with the given shape (5, 5). We then used the pcolormesh() method to create a pseudocolor plot with a non-regular rectangular grid.

Lastly, we added the title and displayed the graph.

Matplotlib.axes.Axes.pcolormesh() in Python:

Most of the figure elements are contained in the axes class, like:

and many more. It sets the coordinate system.

The Axes.pcolormesh() function in the matplotlib axes module is used to create a pseudocolor plot with a non-regular rectangular grid.

import matplotlib.pyplot as plt 
import numpy as np 
from matplotlib.colors import LogNorm 
Z = np.random.rand(5, 5) 
fig, ax = plt.subplots() 
ax.pcolormesh(Z) 
ax.set_title('Matplotlib Axes Pcolormesh') 
plt.show() 

Output:

[Figure: Axes.pcolormesh() output]

Matplotlib pcolormesh grid and shading:

Let Z have shape (M,N)

The grid X and Y can then have shape either (M+1, N+1) or (M, N), depending on the shading keyword.

1) Flat Shading With Matplotlib Pcolormesh:

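import matplotlib.pyplot as plt
import numpy as np

rows = 5
cols = 5
Z = np.arange(rows * cols).reshape(rows, cols)
# For flat shading, x and y hold the cell corners: one more than Z per dimension
x = np.arange(cols + 1)
y = np.arange(rows + 1)

fig, ax = plt.subplots()
ax.pcolormesh(x, y, Z, shading='flat', vmin=Z.min(), vmax=Z.max())
ax.set_title('Flat Shading')

plt.show()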

Output:

[Figure: flat shading output]

Explanation:

Here, we have set the shading to flat, which makes the fewest assumptions. The grid is one larger than Z in each dimension, i.e., it has shape (M+1, N+1). In this case, the values of x and y give the corners of the quadrilaterals, and each quadrilateral is colored with the corresponding value of Z. Here, Z has shape (5, 5), so the edges are specified with x and y arrays of length 6 each.

2) Gouraud Shading With Matplotlib Pcolormesh:

import matplotlib.pyplot as plt
import numpy as np

cols = 5
rows = 5
fig, ax = plt.subplots(constrained_layout=True)
Z = np.arange(rows * cols).reshape(rows, cols)
# For Gouraud shading, x and y hold the grid points and match Z in size
x = np.arange(cols)
y = np.arange(rows)
ax.pcolormesh(x, y, Z, shading='gouraud', vmin=Z.min(), vmax=Z.max())
ax.set_title('Gouraud Shading')

plt.show()

Output:

[Figure: Gouraud shading output]

Explanation:

In Gouraud shading, the colors are linearly interpolated between the quadrilateral grid points. Also, the dimensions of X and Y must match those of Z.
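The two shading conventions can be compared side by side. The sketch below is only an illustration: the variable names are made up, a non-square Z is used so the row and column counts are easy to tell apart, and the Agg backend is selected so it runs without a display (drop that line and call plt.show() to see the plots).

```python
import matplotlib
matplotlib.use("Agg")  # off-screen rendering; an assumption so this runs headless

import matplotlib.pyplot as plt
import numpy as np

rows, cols = 3, 4
Z = np.arange(rows * cols).reshape(rows, cols)

fig, (ax_flat, ax_gouraud) = plt.subplots(1, 2)

# Flat shading: x and y are cell corners, one more than Z in each dimension
x_edges = np.arange(cols + 1)
y_edges = np.arange(rows + 1)
flat = ax_flat.pcolormesh(x_edges, y_edges, Z, shading="flat")
ax_flat.set_title("flat")

# Gouraud shading: x and y are grid points, the same size as Z
x_points = np.arange(cols)
y_points = np.arange(rows)
gouraud = ax_gouraud.pcolormesh(x_points, y_points, Z, shading="gouraud")
ax_gouraud.set_title("gouraud")

print(len(x_edges), len(y_edges), Z.shape)
```

Both calls return a QuadMesh, as listed under the return value earlier in the article.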


Conclusion: Matplotlib pcolormesh

With this, we come to the end of the article. We learned about pcolormesh in Matplotlib and also saw several examples of it. The shading can also be changed as required.

However, if you have any doubts or questions, do let me know in the comment section below. I will try to help you as soon as possible.

Happy Pythoning!

The post Matplotlib pcolormesh in Python with Examples appeared first on Python Pool.

January 13, 2021 03:06 PM UTC


Mike Driscoll

Getting GPS EXIF Data with Python

Did you know that you can get EXIF data from JPG image files using the Python programming language? You can use Pillow, the Python Imaging Library’s friendly fork to do so. You can read an article about that on this website if you want to.

Here is some example code for getting regular EXIF data from a JPG file:

# exif_getter.py

from PIL import Image
from PIL.ExifTags import TAGS


def get_exif(image_file_path):
    exif_table = {}
    image = Image.open(image_file_path)
    info = image.getexif()
    for tag, value in info.items():
        decoded = TAGS.get(tag, tag)
        exif_table[decoded] = value
    return exif_table


if __name__ == "__main__":
    exif = get_exif("bridge.JPG")
    print(exif)

This code was run using the following image:

Mile long bridge

In this article, you will focus on how to extract GPS tags from an image. These are special EXIF tags that are only present if the camera that took the photo had its location information turned on. You can also add GPS tags on your computer after the fact.

For example, I added GPS tags to this photo of Jester Park, which is in Granger, IA:

To get access to those tags, you’ll need to take the earlier code example and do some minor adjustments:

# gps_exif_getter.py

from PIL import Image
from PIL.ExifTags import TAGS, GPSTAGS


def get_exif(image_file_path):
    exif_table = {}
    image = Image.open(image_file_path)
    info = image.getexif()
    for tag, value in info.items():
        decoded = TAGS.get(tag, tag)
        exif_table[decoded] = value

    gps_info = {}
    # Images without GPS data have no 'GPSInfo' entry, so fall back to an empty dict
    for key, value in exif_table.get('GPSInfo', {}).items():
        decode = GPSTAGS.get(key, key)
        gps_info[decode] = value

    return gps_info


if __name__ == "__main__":
    exif = get_exif("jester.jpg")
    print(exif)

To get access to the GPS tags, you need to import GPSTAGS from PIL.ExifTags. Then after parsing the regular tags from the file, you add a second loop to look for the “GPSInfo” tag. If that’s present, then you have GPS tags that you can extract.

When you run this code, you should see the following output:

{'GPSLatitudeRef': 'N',
 'GPSLatitude': (41.0, 47.0, 2.17),
 'GPSLongitudeRef': 'W',
 'GPSLongitude': (93.0, 46.0, 42.09)}

You can take this information and use it to load a Google map with Python or work with one of the popular GIS-related Python libraries.
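Most mapping tools expect decimal degrees rather than the degrees, minutes, seconds tuples shown above. Here is a minimal sketch of that conversion. The dms_to_decimal helper is a made-up name and is not part of Pillow. The dictionary is the output printed earlier.

```python
def dms_to_decimal(dms, ref):
    """Convert (degrees, minutes, seconds) plus a hemisphere letter to decimal degrees.

    Hypothetical helper for illustration; it is not part of Pillow.
    """
    degrees, minutes, seconds = (float(part) for part in dms)
    decimal = degrees + minutes / 60 + seconds / 3600
    # By convention, South and West coordinates are negative.
    return -decimal if ref in ("S", "W") else decimal


gps_info = {
    "GPSLatitudeRef": "N",
    "GPSLatitude": (41.0, 47.0, 2.17),
    "GPSLongitudeRef": "W",
    "GPSLongitude": (93.0, 46.0, 42.09),
}

latitude = dms_to_decimal(gps_info["GPSLatitude"], gps_info["GPSLatitudeRef"])
longitude = dms_to_decimal(gps_info["GPSLongitude"], gps_info["GPSLongitudeRef"])
print(round(latitude, 6), round(longitude, 6))
```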


The post Getting GPS EXIF Data with Python appeared first on Mouse Vs Python.

January 13, 2021 02:42 PM UTC


Real Python

Sentiment Analysis: First Steps With Python's NLTK Library

Once you understand the basics of Python, familiarizing yourself with its most popular packages will not only boost your mastery over the language but also rapidly increase your versatility. In this tutorial, you’ll learn the amazing capabilities of the Natural Language Toolkit (NLTK) for processing and analyzing text, from basic functions to sentiment analysis powered by machine learning!

Sentiment analysis can help you determine the ratio of positive to negative engagements about a specific topic. You can analyze bodies of text, such as comments, tweets, and product reviews, to obtain insights from your audience. In this tutorial, you’ll learn the important features of NLTK for processing text data and the different approaches you can use to perform sentiment analysis on your data.

By the end of this tutorial, you’ll be ready to:

  • Split and filter text data in preparation for analysis
  • Analyze word frequency
  • Find concordance and collocations using different methods
  • Perform quick sentiment analysis with NLTK’s built-in classifier
  • Define features for custom classification
  • Use and compare classifiers for sentiment analysis with NLTK


Getting Started With NLTK

The NLTK library contains various utilities that allow you to effectively manipulate and analyze linguistic data. Among its advanced features are text classifiers that you can use for many kinds of classification, including sentiment analysis.

Sentiment analysis is the practice of using algorithms to classify various samples of related text into overall positive and negative categories. With NLTK, you can employ these algorithms through powerful built-in machine learning operations to obtain insights from linguistic data.
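To make the idea concrete before bringing in NLTK, here is a toy sketch in plain Python. It is not how NLTK classifies text. The word scores in TOY_LEXICON and the "neutral" fallback are invented for illustration.

```python
# Hypothetical, hand-picked word scores; real lexicons are far larger.
TOY_LEXICON = {"great": 2, "good": 1, "bad": -1, "awful": -2}


def classify(text):
    """Label text by summing the scores of the words it contains."""
    words = text.lower().split()
    # Strip trailing punctuation so "bad," still matches "bad".
    score = sum(TOY_LEXICON.get(word.strip(".,!?"), 0) for word in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(classify("What a great movie!"))
print(classify("Awful plot, bad acting."))
```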

Installing and Importing

You’ll begin by installing some prerequisites, including NLTK itself as well as specific resources you’ll need throughout this tutorial.

First, use pip to install NLTK:

$ python3 -m pip install nltk

While this will install the NLTK module, you’ll still need to obtain a few additional resources. Some of them are text samples, and others are data models that certain NLTK functions require.

To get the resources you’ll need, use nltk.download():

import nltk

nltk.download()

NLTK will display a download manager showing all available and installed resources. Here are the ones you’ll need to download for this tutorial:

  • names: A list of common English names compiled by Mark Kantrowitz
  • stopwords: A list of really common words, like articles, pronouns, prepositions, and conjunctions
  • state_union: A sample of transcribed State of the Union addresses by different US presidents, compiled by Kathleen Ahrens
  • twitter_samples: A list of social media phrases posted to Twitter
  • movie_reviews: Two thousand movie reviews categorized by Bo Pang and Lillian Lee
  • averaged_perceptron_tagger: A data model that NLTK uses to categorize words into their part of speech
  • vader_lexicon: A scored list of words and jargon that NLTK references when performing sentiment analysis, created by C.J. Hutto and Eric Gilbert
  • punkt: A data model created by Jan Strunk that NLTK uses to split full texts into word lists

Note: Throughout this tutorial, you’ll find many references to the word corpus and its plural form, corpora. A corpus is a large collection of related text samples. In the context of NLTK, corpora are compiled with features for natural language processing (NLP), such as categories and numerical scores for particular features.

A quick way to download specific resources directly from the console is to pass a list to nltk.download():

>>> import nltk

>>> nltk.download([
...     "names",
...     "stopwords",
...     "state_union",
...     "twitter_samples",
...     "movie_reviews",
...     "averaged_perceptron_tagger",
...     "vader_lexicon",
...     "punkt",
... ])
[nltk_data] Downloading package names to /home/user/nltk_data...
[nltk_data]   Unzipping corpora/names.zip.
[nltk_data] Downloading package stopwords to /home/user/nltk_data...
[nltk_data]   Unzipping corpora/stopwords.zip.
[nltk_data] Downloading package state_union to
[nltk_data]     /home/user/nltk_data...
[nltk_data]   Unzipping corpora/state_union.zip.
[nltk_data] Downloading package twitter_samples to
[nltk_data]     /home/user/nltk_data...
[nltk_data]   Unzipping corpora/twitter_samples.zip.
[nltk_data] Downloading package movie_reviews to
[nltk_data]     /home/user/nltk_data...
[nltk_data]   Unzipping corpora/movie_reviews.zip.
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /home/user/nltk_data...
[nltk_data]   Unzipping taggers/averaged_perceptron_tagger.zip.
[nltk_data] Downloading package vader_lexicon to
[nltk_data]     /home/user/nltk_data...
[nltk_data] Downloading package punkt to /home/user/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
True

This will tell NLTK to find and download each resource based on its identifier.

Should NLTK require additional resources that you haven’t installed, you’ll see a helpful LookupError with details and instructions to download the resource:

>>> import nltk

>>> w = nltk.corpus.shakespeare.words()
...
LookupError:
**********************************************************************
  Resource shakespeare not found.
  Please use the NLTK Downloader to obtain the resource:

  >>> import nltk
  >>> nltk.download('shakespeare')
...

The LookupError specifies which resource is necessary for the requested operation along with instructions to download it using its identifier.

Compiling Data

Read the full article at https://realpython.com/python-nltk-sentiment-analysis/ »



January 13, 2021 02:00 PM UTC


Python Pool

Exciting FizzBuzz Challenge in Python With Solution

There are thousands of Python learning platforms where you can practice your Python coding skills. These platforms contain some of the best problems you can imagine. The problems are separated into several categories depending on their topic and difficulty level. These platforms definitely help you learn new things and improve your coding practices. In this post, we’ll go through the solutions to FizzBuzz in Python.

FizzBuzz is a popular Python question on the HackerRank and HackerEarth learning platforms. Both platforms have the same problem statement, which is well suited to new programmers. The program asks you to print “Fizz” for multiples of 3, “Buzz” for multiples of 5, and “FizzBuzz” for multiples of both. On both platforms, the expected solution is the optimal one, which takes the least time to execute.

In this post, we’ll go through solutions in several languages, including Python 2 and Python 3.

What exactly is the FizzBuzz Python Problem Statement?

The exact wording of the problem is –

Print every number from 1 to 100 (both included) on a new line. For numbers that are multiples of 3, print “Fizz” instead of the number. For numbers that are multiples of 5, print “Buzz” instead of the number. For numbers that are multiples of both 3 and 5, print “FizzBuzz” instead of the number.

The problem statement seems very easy for an everyday programmer. But from a newbie’s perspective, this program tests skills with loops and conditionals. Let’s have a look at the constraints an answer must meet to be accepted.

Constraints for the FizzBuzz Problem

Constraints are the limits your code must comply with. They are meant to identify better code with minimal time complexity and better memory management. Following are the constraints for the FizzBuzz Python problem –

  1. Time Limit: 5 seconds
  2. Memory Limit: 256 MB
  3. Source Limit: 1024KB
  4. Scoring System: (200 – number of characters in source code)/100 [Only for python solutions]
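The scoring rule in item 4 can be sketched as a one-line function. The name fizzbuzz_score is made up, and counting every character, including spaces and newlines, is an assumption, since the problem statement does not say how characters are counted.

```python
def fizzbuzz_score(source):
    """Score a solution as (200 - number of characters in the source) / 100.

    Assumption: every character counts, including whitespace and newlines.
    """
    return (200 - len(source)) / 100


print(fizzbuzz_score("x" * 100))  # 1.0
print(fizzbuzz_score("x" * 80))   # 1.2
```

Under this rule, shorter source code always scores higher.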

Hints For FizzBuzz Python Problem

There are multiple ways to solve the FizzBuzz Python problem. If you want hints, here they are –

Hint 1: Create a “for” loop with the range() function to go through all numbers from 1 to 100. Before implementing FizzBuzz, create this simple loop to understand the looping.

Hint 2: To check whether a number is a multiple of another number, check the remainder when dividing by it. If the remainder is 0, the number is a multiple of the divisor. For example, 15 leaves remainder 0 when divided by 5. This confirms that 15 is a multiple of 5. Use the same logic to create the conditionals.

Hint 3: In the conditional statements, put the multiple-of-15 case above the cases for 5 and 3, because a multiple of 15 is always a multiple of both 3 and 5. Doing this checks for the FizzBuzz case first.

FizzBuzz Python 3 Solution

Solution for FizzBuzz problem in Python 3 –

for num in range(1, 101):
    if num % 15 == 0:
        print("FizzBuzz")
    elif num % 3 == 0:
        print("Fizz")
    elif num % 5 == 0:
        print("Buzz")
    else:
        print(num)

Output –

FizzBuzz Python output

Explanation –

First, we declare a loop that ranges from 1 to 100. As the range() function excludes its stop value, we’ve used 101. Inside the loop, we’ve used if statements to check the divisibility of every number. If it is divisible by 15, print “FizzBuzz”; if it’s divisible by 3, print “Fizz”; if it’s divisible by 5, print “Buzz”; otherwise, print the number itself. All these conditionals are combined using if, elif, and else blocks. This looping goes on until it reaches 100.
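If you want to check the logic without reading a hundred printed lines, one option is to collect the results in a list instead of printing them. This is only a sketch of the same if/elif chain. The fizzbuzz function and its limit parameter are made-up names and not part of the original problem.

```python
def fizzbuzz(limit=100):
    """Return the FizzBuzz lines for 1..limit as a list of strings."""
    lines = []
    for num in range(1, limit + 1):  # limit + 1 because range() excludes its stop value
        if num % 15 == 0:
            lines.append("FizzBuzz")
        elif num % 3 == 0:
            lines.append("Fizz")
        elif num % 5 == 0:
            lines.append("Buzz")
        else:
            lines.append(str(num))
    return lines


result = fizzbuzz()
print(result[:5])   # ['1', '2', 'Fizz', '4', 'Buzz']
print(result[14])   # FizzBuzz
```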

FizzBuzz Python 2 Solution

Solution for FizzBuzz problem in Python 2 –

for num in range(1, 101):
    if num % 15 == 0:
        print "FizzBuzz"
    elif num % 3 == 0:
        print "Fizz"
    elif num % 5 == 0:
        print "Buzz"
    else:
        print num

Explanation –

The explanation is the same for Python 2. The only difference is that print is a statement in Python 2, so it works without parentheses.

Fizzbuzz Python One Liner Solution

Code:

for i in range(1, 101): print("Fizz"*(i%3==0)+"Buzz"*(i%5==0) or str(i))

Explanation:

Python lets you write a for loop and its body on a single line. FizzBuzz is a perfect problem where you can code the entire solution in one line. Since the score depends on the number of characters in the source code, a one-line solution can score more points.
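The one-liner relies on two Python behaviors: multiplying a string by a boolean, and the or operator falling back when its left side is empty. Here is the same expression unpacked into a function. The name fizzbuzz_line is made up for illustration.

```python
def fizzbuzz_line(i):
    """Return the FizzBuzz text for a single number, using the one-liner's trick."""
    # True and False behave as 1 and 0, so "Fizz" * True is "Fizz" and "Fizz" * False is "".
    text = "Fizz" * (i % 3 == 0) + "Buzz" * (i % 5 == 0)
    # An empty string is falsy, so `or` falls back to the number itself.
    return text or str(i)


for i in (3, 5, 15, 7):
    print(i, fizzbuzz_line(i))
```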

Solutions for FizzBuzz in Other Languages

Solving FizzBuzz Problem In C++

#include <iostream>
using namespace std;
int main()
{
    for(int i=1;i<=100;i++){
        if((i%3 == 0) && (i%5==0))
            cout<<"FizzBuzz\n";
        else if(i%3 == 0)
            cout<<"Fizz\n";
        else if(i%5 == 0)
            cout<<"Buzz\n";
        else
            cout<<i<<"\n";
     }
    return 0;
}

Solving FizzBuzz Problem in Java 8

import java.io.*;
import java.util.*;
 
public class Solution {
    public static void main(String[] args) {
        int x = 100; 
        for(int i = 1; i <= x; i++){
            if(i % 3 == 0 && i % 5 ==0){
                System.out.println("FizzBuzz");     
            }
            else if(i % 5 == 0){
                System.out.println("Buzz");
            }
            else if(i % 3 ==0){
                System.out.println("Fizz");
            }
            else{
                System.out.println(i);
            }
        }
    }
}

FizzBuzz Problem In Go

package main
 
import "fmt"
 
func main() {
        
    for i := 1; i <= 100; i++ {
		if i%15==0 {
			fmt.Printf("FizzBuzz\n")
		} else if i%3 == 0 {
			fmt.Printf("Fizz\n")
		} else if i%5 == 0 {
			fmt.Printf("Buzz\n")
		} else {
			fmt.Printf("%d\n", i)
		}
    }	
}

Solving FizzBuzz Problem In Javascript (NodeJS v10)

process.stdin.resume();
process.stdin.setEncoding("utf-8");
var stdin_input = "";
process.stdin.on("data", function (input) {
    stdin_input += input; // Reading input from STDIN
});
 
process.stdin.on("end", function () {
   main(stdin_input);
});
function main(input) {
	var str;
	var i=1;
	while(i<=input){
		str='';
		if(i%3===0){
			str+='Fizz';
		}
		if(i%5===0){
			str+='Buzz';
		}
		str!=='' ? process.stdout.write(str+"\n") : process.stdout.write(i+"\n");
		i++;
	}
}

Solving FizzBuzz Problem In PHP

<?php
for ($i = 1; $i <= 100; $i++) {
	if (($i % 3) == 0)
		echo "Fizz";
	if (($i % 5) == 0)
		echo "Buzz";
	if (($i % 3) != 0 && ($i % 5) != 0)
		echo $i;
	echo "\n";
}
?>

R (RScript 3.4.0)

for (i in 1:100){
  if (i%%3 == 0)
    if (i%%5 == 0)
      cat("FizzBuzz\n")
    else
      cat("Fizz\n")
  else
    if (i%%5 == 0)
      cat("Buzz\n")
    else
      cat(paste(i,"\n"))
}


Conclusion

There are thousands of awesome problems that test your basic knowledge in the world of coding. These problems not only help you learn to code but also improve your logical thinking. Hence, you should keep practicing coding problems even if you already have a job. There is no harm in learning more. To summarize, the FizzBuzz problem tests your basic coding knowledge.

Enjoy Learning and Enjoy Coding!

The post Exciting FizzBuzz Challenge in Python With Solution appeared first on Python Pool.

January 13, 2021 12:59 PM UTC


Python Bytes

#216 Container: Sort thyself!

<p>Sponsored by Datadog: <a href="http://pythonbytes.fm/datadog"><strong>pythonbytes.fm/datadog</strong></a></p> <p>Special guest: <a href="https://twitter.com/Jousefm2">Jousef Murad</a>, Engineered Mind podcast (<a href="https://overcast.fm/itunes1510183304/engineered-mind-podcast-engineering-ai-neuroscience">audio</a>, <a href="https://www.youtube.com/watch?v=l_h-wkpNoW4">video</a>)</p> <a href='https://www.youtube.com/watch?v=Jc_VSHpBM7Y' style='font-weight: bold;'>Watch on YouTube</a><br> <br> <p><strong>Brian #1:</strong> <strong>pip search. Just don’t.</strong></p> <ul> <li><code>pip search [query]</code> is supposed to “Search for PyPI packages whose name or summary contains [query]”</li> <li>The search feature looks like it’s going to be removed and the PyPI api for it removed.</li> <li><strong>Alternative, and better approach, just manually look at pypi.org and search for stuff.</strong> </li> <li>Right now it does this:</li> </ul> <pre><code> $ pip search pytest ERROR: Exception: Traceback (most recent call last): ... [longish traceback ommited] --- xmlrpc.client.Fault: [Fault -32500: "RuntimeError: PyPI's XMLRPC API has been temporarily disabled due to unmanageable load and will be deprecated in the near future. See https://status.python.org/ for more information."] </code></pre> <ul> <li>The <a href="https://status.python.org/">Python Infrastructure status page</a> says, as of Jan 12: “<strong>Update</strong> - The XMLRPC Search endpoint remains disabled due to ongoing request volume. As of this update, there has been no reduction in inbound traffic to the endpoint from abusive IPs and we are unable to re-enable the endpoint, as it would immediately cause PyPI service to degrade again.”</li> <li>This started becoming a problem in mid December.</li> <li>The endpoint was just never architected to handle the scale it’s getting now. 
</li> <li>There’s a current issue <a href="https://github.com/pypa/pip/issues/5216">“Remove the pip search command”</a>, open on pip. <ul> <li>The comment thread is locked now, but you can read some of the history.</li> </ul></li> <li>I personally don’t understand the need to hammer search with a CI system or other. <ul> <li>Probably should be using a local cache or local pypi mirror for an active/aggressive CI system.</li> </ul></li> <li>If you have scripts or jobs that run <code>pip search</code>, it ain’t gonna work, so probably best to remove that.</li> </ul> <p><strong>Michael #2:</strong> <a href="http://qpython.com/"><strong>QPython - Scripting for Android with Python</strong></a></p> <ul> <li>Python REPL on Android - interesting</li> <li>Scripting Android tasks with Python - more interesting</li> <li>Free, open source app that is ad supported.</li> <li>Some people have commented that their phone is their only “computer”</li> <li>With SL4A features, you can use Python programming to control Android work: <ul> <li>Android Apps API, such as: Application, Activity, Intent &amp; startActivity, SendBroadcast, PackageVersion, System, Toast, Notify, Settings, Preferences, GUI</li> <li>Android Resources Manager, such as: Contact, Location, Phone, Sms, ToneGenerator, WakeLock, WifiLock, Clipboard, NetworkStatus, MediaPlayer</li> <li>Third App Integrations, such as: Barcode, Browser, SpeechRecognition, SendEmail, TextToSpeech</li> <li>Hardware Manager: Camera, Sensor, Ringer &amp; Media Volume, Screen Brightness, Battery, Bluetooth, SignalStrength, WebCam, Vibrate, NFC, USB</li> </ul></li> </ul> <p><strong>Jousef #3:</strong> <strong>Thesis: Deep Learning assistant for designers/engineers</strong> </p> <ul> <li><a href="https://pytorch3d.org/">PyTorch (3D)</a> / TensorFlow</li> <li>The thesis: what is it actually about &amp; goal of the thesis</li> <li>Libraries mainly used: numpy, pandas</li> <li>(Reinforcement Learning &amp; GANs)</li> </ul> <p><strong>Brian 
#4:</strong> <a href="http://www.grantjenks.com/docs/sortedcontainers/index.html"><strong>sortedcontainers</strong></a></p> <ul> <li>Thanks to Fanchen Bao for the topic suggestion.</li> <li>Pure-Python, as fast as C-extensions, sorted collections library.</li> </ul> <pre><code> &gt;&gt;&gt; from sortedcontainers import SortedList &gt;&gt;&gt; sl = SortedList(['e', 'a', 'c', 'd', 'b']) &gt;&gt;&gt; sl SortedList(['a', 'b', 'c', 'd', 'e']) &gt;&gt;&gt; sl *= 10_000_000 &gt;&gt;&gt; sl.count('c') 10000000 &gt;&gt;&gt; sl[-3:] ['e', 'e', 'e'] &gt;&gt;&gt; from sortedcontainers import SortedDict &gt;&gt;&gt; sd = SortedDict({'c': 3, 'a': 1, 'b': 2}) &gt;&gt;&gt; sd SortedDict({'a': 1, 'b': 2, 'c': 3}) &gt;&gt;&gt; sd.popitem(index=-1) ('c', 3) &gt;&gt;&gt; from sortedcontainers import SortedSet &gt;&gt;&gt; ss = SortedSet('abracadabra') &gt;&gt;&gt; ss SortedSet(['a', 'b', 'c', 'd', 'r']) &gt;&gt;&gt; ss.bisect_left('c') 2 </code></pre> <ul> <li>“All of the operations shown above run in faster than linear time.”</li> <li>Types: <ul> <li>SortedList</li> <li>SortedKeyList (like SortedList, but you pass in a key function, similar to key in Pythons <code>sorted</code> function.)</li> <li>SortedDict</li> <li>SortedSet</li> </ul></li> <li><a href="http://www.grantjenks.com/docs/sortedcontainers/index.html">Great documentation</a> and tons of performance metrics in the docs.</li> </ul> <p><strong>Michael #5:</strong> <a href="https://twitter.com/brianokken/status/1345438719721918464?cn=ZmxleGlibGVfcmVjcw%3D%3D&amp;refsrc=email"><strong>Łukasz Langa Typed Twitter Thread</strong></a></p> <ul> <li>Let’s riff on typing for a bit. </li> <li>Here is my philosophy: If I have to type more than three characters to complete a symbol in my editor, something is wrong. </li> <li>e.g. to go from <code>email_service.</code> → <code>email_service.send_account_email()</code> I should only need to type <code>.sae</code> then tab/enter. 
These types of things are vastly better because of type hints.</li> <li>Python type hints are more malleable than even TypeScript.</li> <li>Lukasz is addressing this comment: <em>Controversial take: Types in a Python code-base are a net negative</em>.</li> <li>Points <ul> <li>put enough annotations and tooling connects the dots, making plenty of errors evident.</li> <li>The most common to me at least is when a None creeps in. </li> <li>The second bug often caught by type checkers is on the "return" boundary: one of your code paths forgets a return.</li> <li>squiggly lines in your editor</li> <li>Microsoft is now developing powerful type checking and code completion for Python in VSCode. This effort employs a member of the Python Steering Council, and possibly also the creator of Python himself soon. You think they would settle for "illusion of productivity"?</li> </ul></li> </ul> <p><strong>Jousef #6:</strong> </p> <ul> <li>Point Cloud operations → <a href="http://www.open3d.org/">open3d</a></li> </ul> <p>Extras:</p> <p>Michael:</p> <ul> <li>via Francisco Giordano Silva: On Brian's ref to using numpy all for array element-wise comparison, also please check out <code>numpy.allclose</code> method. Allows you to compare two arrays based on a given tolerance.</li> </ul> <p>Brian: </p> <ul> <li>Just this: 2021 is exhausting so far.</li> <li><a href="https://testandcode.com/">Test &amp; Code</a> has shifted to every other week to allow time for other projects I’m working on. <ul> <li>This is probably a short term change. But I don’t know for how long. It’s definitely not going away though. Just slowing down a bit.</li> </ul></li> </ul> <p>Jousef: <a href="https://overcast.fm/itunes1510183304/engineered-mind-podcast-engineering-ai-neuroscience">Engineered Mind podcast</a></p>

January 13, 2021 08:00 AM UTC


Brett Cannon

Unravelling assertions

In this post, as part of my series on Python's syntactic sugar, I'm going to cover assert statements. Now, the actual unravelling of the syntax for assert a, b is already given to us by the language reference:

if __debug__:
    if not a:
        raise AssertionError(b)
Implementation of assert a, b

Since there isn't much to it, I'm going to spend this post mostly explaining what's going on with this unravelled code since there's a couple of details that you might not know.

To begin, __debug__ represents whether CPython was run with -O or -OO. These flags control the optimization level of CPython (hence the use of the letter "O" for them; saying it twice basically represents "more optimizations"). Without either flag specified, __debug__ == True. But if you use either -O or -OO then __debug__ == False (-OO also strips out docstrings to save a tiny bit of memory).

The next thing to be aware of is that the error message argument is only executed if the assertion fails. This is why you can't just define an assert_() function for the unravelling; like with and and or, you have to make sure not to execute b unless not a is true (and since raise is a statement, we can't use our conditional operator trick like we did with and and or).
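Here is a small sketch of that laziness. The names calls and expensive_message are made up for illustration, and the example assumes Python is running without -O so that assert statements are active.

```python
calls = []


def expensive_message():
    """Record that the message was built, then return it."""
    calls.append("built")
    return "a was falsy"


a = True
assert a, expensive_message()  # assertion passes, so the message is never built
calls_after_pass = list(calls)

a = False
try:
    assert a, expensive_message()  # assertion fails, so the message is built now
except AssertionError as exc:
    print(exc)

print(calls_after_pass, calls)
```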

The unravelling is actually a bit simplistic because technically the lookup of __debug__ and AssertionError is done directly from the builtins module, not from the scope of the statement. So if you want to be really accurate in the unravelling, it's:

import builtins

if builtins.__debug__:
    if not a:
        raise builtins.AssertionError(b)
A more accurate implementation of assert a, b

Finally, you might be wondering why I didn't write this as a single if statement: if __debug__ and not a: raise AssertionError(b). Semantically they are equivalent, but in the bytecode that Python produces they are vastly different. Look at the bytecode for the unravelling proposed at the start:

4           0 LOAD_CONST               0 (None)
            2 RETURN_VALUE

Compare that to the bytecode for if not a and __debug__: raise AssertionError(b) (and I chose that order for the and to prove a point which you will see in a moment):

  2           0 LOAD_GLOBAL              0 (a)
              2 POP_JUMP_IF_TRUE        16
              4 LOAD_CONST               1 (False)
              6 POP_JUMP_IF_FALSE       16

  3           8 LOAD_GLOBAL              1 (AssertionError)
             10 LOAD_GLOBAL              2 (b)
             12 CALL_FUNCTION            1
             14 RAISE_VARARGS            1
        >>   16 LOAD_CONST               0 (None)
             18 RETURN_VALUE

One is a bit shorter than the other. 😉 Python's peephole optimizer notices if __debug__: ... and will completely drop the statement if you run Python with -O or higher (and just note that assert statements are dropped by the peephole optimizer as well). Otherwise the peephole optimizer will replace __debug__ with the appropriate literal. That means there's a bit of wasted effort with if not a and __debug__: ... since not a will be evaluated before __debug__. And even with the reverse order to and you still have to deal with the test and jump for the if conditional guard.

And that's it! And I would like to leave you with a piece of advice when it comes to assert statements: never put actual logic that you couldn't stand to have not run in an assert! Some of you might be 🙄, but I know of a multi-billion dollar company with a lot of Python which couldn't use -O because it broke their code due to the removal of assert statements (I don't know if they ever fixed that issue).
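To see that failure mode without restarting Python under -O, one option is the optimize argument of the built-in compile(), which applies the same optimization level to a single piece of source. This is only a sketch. The SOURCE snippet and the run helper are made up for illustration.

```python
SOURCE = """
processed = []
assert processed.append("order-1") is None  # real work hidden inside an assert
"""


def run(optimize):
    """Execute SOURCE at the given optimization level and return `processed`."""
    namespace = {}
    exec(compile(SOURCE, "<demo>", "exec", optimize=optimize), namespace)
    return namespace["processed"]


print(run(0))  # assert statements kept, so the append happens
print(run(1))  # same level as -O: the assert, and the append with it, is gone
```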

January 13, 2021 04:26 AM UTC

January 12, 2021


Zato Blog

How to integrate API systems in Python

With the imminent release of Zato 3.2, we are happy to announce today the availability of a new API integrations tutorial. Let's quickly check what it offers.

The tutorial is completely new and by following it, you will learn all of:

In short, after you complete it, you will acquire the core of what is needed to use Python to integrate complex API and backend systems.

Zato tutorial

# -*- coding: utf-8 -*-
# zato: ide-deploy=True

# Zato
from zato.server.service import Service

class GetUserDetails(Service):
    """ Returns details of a user by the person's ID.
    """
    name = 'api.user.get-details'

    def handle(self):

        # For later use
        user_name = self.request.payload['user_name']

        # Get data from CRM ..
        crm_data = self.invoke_crm(user_name)

        # .. extract the CRM information we are interested in ..
        user_type = crm_data['UserType']
        account_no = crm_data['AccountNumber']

        # .. get data from Payments ..
        payments_data = self.invoke_payments(user_name, account_no)

        # .. extract the Payments data we are interested in ..
        account_balance = payments_data['ACC_BALANCE']

        # .. optionally, notify the fraud detection system ..
        if self.should_notify_fraud_detection(user_type):
            self.notify_fraud_detection(user_name, account_no)

        # .. now, produce the response for our caller.
        self.response.payload = {
            'user_name': user_name,
            'user_type': user_type,
            'account_no': account_no,
            'account_balance': account_balance,
        }

Zato tutorial


Click here to get started - and remember to visit our Twitter, GitHub and Gitter communities as well.

January 12, 2021 11:00 PM UTC


PyCoder’s Weekly

Issue #455 (Jan. 12, 2021)

#455 – JANUARY 12, 2021


Advent of Code 2020 “Pytudes”

Google researcher Peter Norvig goes through a suite of short Python programs and exercises for perfecting particular programming skills.
PETER NORVIG

NumPy Tutorial: Your First Steps Into Data Science in Python

Learn everything you need to know to get up and running with NumPy, Python’s de facto standard for multidimensional data arrays. NumPy is the foundation for most data science in Python, so if you’re interested in that field, then this is a great place to start.
REAL PYTHON

Rapidly Troubleshoot Your Python Application in Real Time With Datadog APM


Datadog’s Continuous Profiler analyzes code performance enabling you to identify the most resource-consuming parts in your application code in order to improve MTTR and enhance user experience. See which services or calls are contributing to overall latency and optimize your app performance →
DATADOG sponsor

Implementing FastAPI Services: Abstraction and Separation of Concerns

This article introduces an approach to structure FastAPI applications with multiple services in mind. The proposed structure decomposes the individual services into packages and modules, following principles of abstraction and separation of concerns.
CAMILLO VISINI • Shared by Camillo Visini

Visual Intro to NumPy and Data Representation

“In this post, we’ll look at some of the main ways to use NumPy and how it can represent different types of data (tables, images, text…etc) before we can serve them to machine learning models.”
JAY ALAMMAR • Shared by Python Bytes FM

Develop Data Visualization Interfaces in Python With Dash

Learn how to build a dashboard using Python and Dash. Dash is a framework for building data visualization interfaces. It helps data scientists build fully interactive web applications quickly.
REAL PYTHON

Reinventing the Python Logo: Interview With a UI Designer

UI designer Jessica Williamson redesigns the Python logo as a hobby project and receives 7000 upvotes on Reddit. Here’s an interview with her.
CARLO OCCHIENA

PyCon US 2021: Call for Proposals Is Open

PYCON.BLOGSPOT.COM

Discussions

Anaconda Is Not Free for Commercial Use (Anymore)?

Anaconda’s CEO responds on the thread: “At this time, there is no prohibition on using Anaconda Individual Edition in a small-scale commercial setting like yours.” Related discussion on Twitter.
REDDIT

Best IDE for Python

“Right now I am using the standard Python IDLE. I am new to Python and want something better. I have seen a few but they all look too complicated.”
REDDIT

Python Jobs

How Strong Is Your Resume?

sponsor

Get a free, confidential review from a resume expert →

Entry-Level Engineering Programme (London, UK)

Tessian

Senior Backend Engineer (London, UK or Remote)

Tessian

Backend Engineer (London, UK or Remote)

Tessian

Backend Engineer (Berlin, Germany)

Feather

More Python Jobs >>>

Articles & Tutorials

What Is Data Engineering and Researching 10 Million Jupyter Notebooks

“Are you familiar with the role data engineers play in the modern landscape of data science and Python? Data engineering is a sub-discipline that focuses on the transportation, transformation, and storage of data. This week on the show, David Amos is back, and he’s brought another batch of PyCoder’s Weekly articles and projects.”
REAL PYTHON podcast

Django Migrations Without Downtimes [2015]

“Applying migrations on a live system can bring down your web-server in counter-intuitive ways. I’ll talk about common schema change scenarios and how those can be safely carried out on a live system with a Postgres database. We’ll look at locking and timing issues, multi-phase deployments and migration system peculiarities.”
LUDWIG HÄHNE

Automating Code Performance Testing Now Possible

Performance is a feature, test it as such. Test performance in CI/CD. Validate production deploys. Blackfire offers a robust way to run test scenarios and validate code changes, automatically. Discover Blackfire Builds now. Free 15 days trial.
BLACKFIRE sponsor

Quick Way to Find and Fix Invalid Values in Numerical Data Columns

Real world data sets often include invalid data values. Investigating them can be difficult, since attempting to convert them to the correct types causes exceptions. In this article you’ll take a look at a “real world” messy data set and learn a quick trick to summarize and fix invalid data values.
DRAWINGFROMDATA.COM • Shared by Martin

Robust Web Scraping or Web API Based Data Collection

“Large scale data collection via web scraping or web APIs must run reliably over days or even weeks. This brings up problems that mainly focus on the robustness of the data collection process. I will try to tackle some of these problems in this post.”
MARKUS KONRAD • Shared by Markus Konrad

Building ML Teams and Finding ML Jobs

“Are you building or running an internal machine learning team? How about looking for a new ML position? On this episode, I talk with Chip Huyen from Snorkel AI about building ML teams, finding ML positions, and teach ML at Stanford.”
TALK PYTHON podcast

Indexing and Selecting in Pandas by Callable

“In Pandas, you can use callables where indexers are accepted. It turns out that can be handy for a pretty common use case.”
MATT WRIGHT

Hacking QR Code Design

“How to create QR codes that look like anything by inverting the QR creation process.” (Python source code included.)
MARIEN RAAT

Understand Django: Serving Static Files

Static files are critical to apps, but have little to do with Python code. See what they are and what they do.
MATT LAYMAN • Shared by Matt Layman

float vs decimal in Python

Learn the differences between floats and decimals in Python, common issues, and when to use each.
STEVEN PATE • Shared by Steven Pate

How to Make a Violin Plot in Python Using Matplotlib and Seaborn

ERIK MARSJA

The Easiest Way to Rename a Column in Pandas

Two easy recipes for renaming column(s) in a Pandas DataFrame.
BEN COOK

Switch/Case in Python

MATTEO GUADRINI

Event-Driven: Architecture Lessons Learned in Building a Poker Platform With Event Sourcing

MAX MCCREA • Shared by Max

Generate File Reports Using Python’s string.Template

FLORIAN DAHLITZ • Shared by Florian Dahlitz

Projects & Code

Learn X by Doing Y: Project-Based Learning Search Engine

AQUADZN.GITHUB.IO

fontpreview: Python Library for Font Previews

GITHUB.COM/MATTEOGUADRINI

aioauth: Asynchronous OAuth 2.0 Framework and Provider for Python 3

GITHUB.COM/ALIEV

funct: Like a Python List but Better

GITHUB.COM/LAURIAT

Thonny: Hassle-Free Python Micro-IDE

THONNY.ORG

mutmut: Mutation Testing System

GITHUB.COM/BOXED

Scipy Lecture Notes: Tutorial Material on the Scientific Python Ecosystem

SCIPY-LECTURES.ORG

fpdf2: Simple PDF Generation for Python

GITHUB.COM/PYFPDF

Events

Real Python Office Hours (Virtual)

January 20, 2021
REALPYTHON.COM

BelPy

January 30 – 31, 2021
BELPY.IN

PyCascades 2021 (Virtual)

February 19 – 21, 2021
PYCASCADES.COM

PyCon 2021 (Virtual)

May 12 – 18, 2021
PYCON.ORG


Happy Pythoning!
This was PyCoder’s Weekly Issue #455.
[ Subscribe to 🐍 PyCoder’s Weekly 💌 – Get the best Python news, articles, and tutorials delivered to your inbox once a week >> Click here to learn more ]

January 12, 2021 07:30 PM UTC


Python Software Foundation

2020 in Review

The beginning of 2020 was paving a new way for the PSF to support its community. The PSF Board of Directors strategically planned to devote funding to Python's core and to hiring staff. We expected a healthy revenue from PyCon US 2020 and had big plans for it!

As the pandemic hit and PyCon US 2020 in Pittsburgh was cancelled, we had to shift our strategic plan and so much more. The PSF quickly had to reassess its programs and plans, and rebuild.

But we've had our fill of bad news! Instead of highlighting the negative impacts of 2020 (we touch on that here), let's take a look at all the positivity we witnessed and experienced within the Python community throughout 2020.

Our staff and volunteers have their work cut out for them in 2021. In addition to hiring a Director of Resource Development, we will be hiring a Developer-in-Residence to assist core development (expect an announcement soon!), and we will be forming a Community Leadership Council. Thank you to all volunteers, donors, and sponsors for allowing us to continue our work. 

We wish everyone a very healthy and happy 2021!

January 12, 2021 04:36 PM UTC


Real Python

Managing Python Dependencies

Managing Python Dependencies is your “one-stop shop” for picking up modern Python dependency management practices and workflows with minimal time investment.

The course consists of 32 bite-sized video lessons, each focusing on a single concept. Progressing through the course, you’ll quickly build up a comprehensive knowledge of dependency management best practices in Python at your own, comfortable pace.

Along the way, you’ll see hands-on examples and step-by-step workflows that reinforce the skills you’re learning.

By the end, you’ll know how to apply Python’s recommended dependency management tools, like pip, virtualenvs, and requirements files effectively in the most common day-to-day development scenarios on Linux, macOS, and Windows.

With Managing Python Dependencies you will:

Who Is This Course For?

This course is for Python developers wanting to break through to the next phase of developing code by becoming more efficient, productive, and skilled using Python’s rich library ecosystem.

If you’ve ever caught yourself thinking “There’s got to be a Python package out there that does exactly what I want…But how do I find it?” this course will fill in the missing pieces for you.

Discover the industry best practices around choosing and managing third-party dependencies for your Python 2 or Python 3 projects on Windows, macOS, and Linux.

If you already know how to use alternative package managers like Conda you’ll discover how to use the standards-compliant tools and workflows supported by any Python distribution and used in most production application deployments.

Course Goals

By the end of the course you’ll know how to:


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

January 12, 2021 02:00 PM UTC


Stack Abuse

How to Randomly Select Elements From a List in Python

Introduction

Selecting a random element or value from a list is a common task - be it for a randomized result from a list of recommendations or just a random prompt.

In this article, we'll take a look at how to randomly select elements from a list in Python. We'll cover the retrieval of both singular random elements, as well as retrieving multiple elements - with and without repetition.

Selecting a Random Element From Python List

The most intuitive and natural approach to solve this problem is to generate a random number that acts as an index to access an element from the list.

To implement this approach, let's look at some methods to generate random numbers in Python: random.randint() and random.randrange(). We can additionally use random.choice() and supply a sequence - which results in a random element from that sequence being returned.

Using random.randint()

random.randint(a, b) returns a random integer between a and b inclusive.

We'll want a random index in the range 0 to len(list)-1, so that it always refers to a valid element of the list:

import random

letters = ['a', 'b', 'c', 'd', 'e', 'f']
random_index = random.randint(0,len(letters)-1)

print(letters[random_index])

Running this code multiple times yields us:

e
c
f
a

Using random.randrange()

random.randrange(a) is another method which returns a random number n such that 0 <= n < a:

import random

letters = ['a', 'b', 'c', 'd', 'e', 'f']
random_index = random.randrange(len(letters))

print(letters[random_index])

Running this code multiple times will produce something along the lines of:

f
d
d
e

As random.randrange(len(letters)) returns a randomly generated number in the range 0 to len(letters) - 1, we use it to access an element at random in letters, just like we did in the previous approach.

This approach is a tiny bit simpler than the last, simply because we don't specify the starting point, which defaults to 0.
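The off-by-one difference between the two calls can be checked empirically. The following is a minimal sketch, assuming a fixed seed (42, chosen arbitrarily) only so that repeated runs behave the same way:

```python
import random

letters = ['a', 'b', 'c', 'd', 'e', 'f']

# A private generator with an arbitrary, hypothetical seed for repeatability
rng = random.Random(42)

# randint() includes both endpoints, so the upper bound must be len(letters) - 1
randint_indices = {rng.randint(0, len(letters) - 1) for _ in range(1000)}

# randrange() excludes the stop value, so len(letters) can be passed directly
randrange_indices = {rng.randrange(len(letters)) for _ in range(1000)}

print(sorted(randint_indices))
print(sorted(randrange_indices))
```

Both sets only ever contain valid indices for letters, which is why either call is safe to use for indexing as long as the bounds are written this way.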

Using random.choice()

Now, an even better solution than the last would be to use random.choice(), as this is precisely the function designed to solve this problem:

import random 

letters = ['a', 'b', 'c', 'd', 'e', 'f']

print(random.choice(letters))

Running this multiple times results in:

b
e
e
f
e
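One detail worth keeping in mind is that random.choice() raises an IndexError when the sequence is empty. The sketch below wraps the call in a small helper; the name pick_or_default and its default parameter are hypothetical and only serve the illustration:

```python
import random


def pick_or_default(items, default=None):
    """Return a random element of items, or default when items is empty."""
    try:
        # random.choice() raises IndexError for an empty sequence
        return random.choice(items)
    except IndexError:
        return default


letters = ['a', 'b', 'c', 'd', 'e', 'f']

print(pick_or_default(letters))
print(pick_or_default([], default='nothing to choose from'))
```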

Selecting More Than One Random Element From Python List

Using random.sample()

The first method we can use to select more than one element at random is random.sample(). It returns a sample of the requested size, which we pass as the second argument:

import random 

letters = ['a', 'b', 'c', 'd', 'e', 'f']

print(random.sample(letters, 3))

This returns a list:

['d', 'c', 'a']

This method selects elements without replacement, i.e., it selects without duplicates and repetitions.

If we run this:

print(random.sample(letters, len(letters)))

Since it doesn't return duplicates, it'll just return our entire list in a randomized order:

['a', 'e', 'c', 'd', 'f', 'b']
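Because sampling happens without replacement, the sample can never be larger than the list itself. A short sketch of both edge cases, assuming the same letters list as above:

```python
import random

letters = ['a', 'b', 'c', 'd', 'e', 'f']

# Sampling the full length returns a new, reordered list;
# the original list is left untouched
shuffled = random.sample(letters, len(letters))
print(shuffled)
print(letters)

# Asking for more elements than the list holds raises ValueError
too_large_error = None
try:
    random.sample(letters, len(letters) + 1)
except ValueError as exc:
    too_large_error = exc

print(type(too_large_error).__name__)
```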

Using random.choices()

Similar to the previous function, random.choices() returns a list of randomly selected elements from a given sequence. However, it doesn't keep track of the selected elements, so you can get duplicate elements as well:

import random 

letters = ['a', 'b', 'c', 'd', 'e', 'f']

print(random.choices(letters, k=3))

This returns something along the lines of:

['e', 'f', 'f']

Also, if we run:

print(random.choices(letters, k = len(letters)))

It can return something like:

['d', 'e', 'b', 'd', 'd', 'd']

random.choices returns a k-sized list of elements selected at random with replacement.

This method can also be used to implement weighted random choices, which you can explore further in the official Python documentation.
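As a rough illustration of weighted selection, random.choices() accepts a weights argument with one relative weight per element. The weights and the seed below are made up for the example:

```python
import random
from collections import Counter

letters = ['a', 'b', 'c', 'd', 'e', 'f']

# Hypothetical weights: 'a' is ten times as likely as any other letter
weights = [10, 1, 1, 1, 1, 1]

# Arbitrary seed, used only so that repeated runs behave the same way
rng = random.Random(0)

draws = rng.choices(letters, weights=weights, k=1000)
counts = Counter(draws)

# The heavily weighted element should dominate the tally
print(counts.most_common(3))
```

The weights are relative, so they do not need to sum to 1 or to 100.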

Conclusion

In this article, we've explored several ways to retrieve one or multiple randomly selected elements from a List in Python.

We've accessed the list at random indices using randint() and randrange(), and also retrieved random elements using choice(), sample(), and choices().

January 12, 2021 01:30 PM UTC


Python Pool

Python dateutil Module: Explanation and Examples

Hello coders! In this article, we will look at the Python dateutil module in detail. Modules in Python are files that contain statements and definitions. They break a large program into smaller parts and are imported into other programs to make their functions available. dateutil is one such module, a third-party package used for date- and time-related work. Without wasting any time, let’s get straight into the topic.

What is the dateutil module in Python?

There is a built-in module available in Python called the datetime module. We use this module to manipulate dates and times, in both simple and complex ways. Even though this module is sufficient in many cases, the dateutil module provides powerful extensions to it.

Key features of the dateutil Module in Python:

Installation of Python dateutil Module:

Like any other module, dateutil can be installed from PyPI using pip.

pip install python-dateutil

Now that we have successfully installed the dateutil module let us explore some of its features with the help of certain examples.

Let us first import the required modules:

# Import only the names the examples below actually use
from datetime import date, datetime
from dateutil.relativedelta import relativedelta, FR

Example 1: To get the current date & time in Python

current = datetime.now()
print("Date\t   Time")
print(current)

Output & Explanation:

In this code, we used the datetime.now() method to get the current date and time. Called without arguments, it returns the local date and time as a naive datetime (one with no time zone attached), not Greenwich Mean Time.
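To make the time zone behaviour of datetime.now() concrete, here is a small sketch using only the standard library. Passing timezone.utc is one way to ask for the current time in UTC explicitly:

```python
from datetime import datetime, timezone

# Without arguments: local wall-clock time, no time zone attached (naive)
local_now = datetime.now()

# With a tzinfo argument: an aware datetime expressed in that time zone
utc_now = datetime.now(timezone.utc)

print(local_now, local_now.tzinfo)
print(utc_now, utc_now.tzinfo)
```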

Example 2: To get the date of next week or next month

month_next = date.today() + relativedelta(months=+1)
print('Date of next month:')
print(month_next)
week_next = date.today() + relativedelta(weeks=+1)
print('Date of next week:')
print(week_next)

Output & Explanation:

Here, we used relativedelta to get the dates one month and one week from today. Adding relativedelta(months=+1) moves the date forward by one calendar month, and adding relativedelta(weeks=+1) moves it forward by seven days.

Example 3: Combining the dates (next month plus one week)

combined = date.today() + relativedelta(months=+1, weeks=+1)
print('Date of next month and one week:')
print(combined)

Output & Explanation:

For this example, we wanted the date one month and one week from today. So we passed both months=+1 and weeks=+1 to a single relativedelta, which gave us that date.
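Since the examples above depend on today's date, their output changes from day to day. The sketch below uses a fixed, hypothetical start date instead, which also shows how relativedelta handles a month that is shorter than the starting day:

```python
from datetime import date
from dateutil.relativedelta import relativedelta

# Hypothetical fixed date so the result does not depend on when this runs
start = date(2021, 1, 31)

# February 2021 has no 31st, so the day is clipped to the last day of the month
next_month = start + relativedelta(months=+1)

# The month shift is applied first, then the seven days are added
combined = start + relativedelta(months=+1, weeks=+1)

print(next_month)  # 2021-02-28
print(combined)    # 2021-03-07
```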

Example 4: Going backward Using relativedelta

month_last = date.today() + relativedelta(months=-1)
print('Date of last month:')
print(month_last)
week_last = date.today() + relativedelta(weeks=-1)
print('Date of last week:')
print(week_last)

Output & Explanation:

Just as we got dates ahead of the current one, we can also go back to get the date a month or a week before today. To do that, we pass negative values to relativedelta instead of positive ones.

Example 5: Getting a particular weekday Using relativedelta() in Python

friday = date.today() + relativedelta(weekday=FR)
print(friday)

Output & Explanation:

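If someone wants the date of a particular weekday, that is also possible. All they have to do is specify the day, and the corresponding date is generated. Note that relativedelta(weekday=FR) returns the next Friday on or after the given date, so if today is already a Friday, today's date is returned.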
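The weekday constants can also be called with a number to pick the nth occurrence, with negative numbers counting backward. A brief sketch with fixed, hypothetical dates (January 12, 2021 was a Tuesday):

```python
from datetime import date
from dateutil.relativedelta import relativedelta, FR

# Hypothetical fixed dates so the results are reproducible
tuesday = date(2021, 1, 12)
a_friday = date(2021, 1, 15)

# FR on its own means the first Friday on or after the date
next_friday = tuesday + relativedelta(weekday=FR)

# FR(-1) means the first Friday on or before the date
last_friday = tuesday + relativedelta(weekday=FR(-1))

# Starting on a Friday, the same day is returned
same_day = a_friday + relativedelta(weekday=FR)

print(next_friday)  # 2021-01-15
print(last_friday)  # 2021-01-08
print(same_day)     # 2021-01-15
```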

Example 6: Get difference between two dates

dob = datetime(1999, 5, 15)
my_age = relativedelta(date.today(), dob)
print(my_age)

Output & Explanation:

In this example, we used the dateutil feature that computes the difference between two dates. We entered a particular date of birth and passed it to relativedelta together with today's date to get the current age in years, months, and days.
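The individual parts of the difference are available as attributes on the result. A minimal sketch, assuming a fixed "as of" date in place of date.today() so the numbers stay the same on every run:

```python
from datetime import date
from dateutil.relativedelta import relativedelta

dob = date(1999, 5, 15)
as_of = date(2021, 1, 12)  # hypothetical stand-in for date.today()

# relativedelta(later, earlier) breaks the gap into calendar units
age = relativedelta(as_of, dob)

print(age.years, age.months, age.days)  # 21 7 28
```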

Conclusion: dateutil module in python

These examples showcase relative delta computation in Python using the dateutil module.

The post Python dateutil Module: Explanation and Examples appeared first on Python Pool.

January 12, 2021 09:04 AM UTC