
Planet Python

Last update: May 06, 2021 04:41 PM UTC

May 06, 2021


Stack Abuse

Creating PDF Invoices in Python with pText

Introduction

The Portable Document Format (PDF) is not a WYSIWYG (What You See is What You Get) format. It was developed to be platform-agnostic, independent of the underlying operating system and rendering engines.

To achieve this, PDF was constructed to be interacted with via something more like a programming language, and relies on a series of instructions and operations to achieve a result. In fact, PDF is based on a scripting language - PostScript, which was the first device-independent Page Description Language.

In this guide, we'll be using pText - a Python library dedicated to reading, manipulating and generating PDF documents. It offers both a low-level model (allowing you access to the exact coordinates and layout if you choose to use those) and a high-level model (where you can delegate the precise calculations of margins, positions, etc to a layout manager).

We'll take a look at how to create a PDF invoice in Python using pText.

Installing pText

pText can be downloaded from source on GitHub, or installed via pip:

$ pip install ptext-joris-schellekens

Creating a PDF Invoice in Python with pText

pText has two intuitive key classes - Document and Page, which represent a document and the pages within it. Additionally, the PDF class represents an API for loading and saving the Documents we create.

Let's create a Document() and Page() as a blank canvas that we can add the invoice to:

from ptext.pdf.document import Document
from ptext.pdf.page.page import Page

# Create document
pdf = Document()

# Add page
page = Page()
pdf.append_page(page)

Since we don't want to deal with calculating coordinates - we can delegate this to a PageLayout which manages all of the content and its positions:

# New imports
from ptext.pdf.canvas.layout.page_layout import SingleColumnLayout
from ptext.io.read.types import Decimal

page_layout = SingleColumnLayout(page)
page_layout.vertical_margin = page.get_page_info().get_height() * Decimal(0.02)

Here, we're using a SingleColumnLayout since all of the content should be in a single column - we won't have a left and right side of the invoice. We're also making the vertical margin smaller here. The default value is to trim the top 10% of the page height as the margin, and we're reducing it down to 2%, since we'll want to use this space for the company logo/name.

Speaking of which, let's add the company logo to the layout:

# New import
from ptext.pdf.canvas.layout.image import Image

page_layout.add(    
        Image(        
        "https://s3.amazonaws.com/s3.stackabuse.com/media/articles/creating-an-invoice-in-python-with-ptext-1.png",        
        width=Decimal(128),        
        height=Decimal(128),    
        ))

Here, we're adding an element to the layout - an Image(). Through its constructor, we're adding a URL pointing to the image resource and setting its width and height.

Beneath the image, we'll want to add our imaginary company info (name, address, website, phone) as well as the invoice information (invoice number, date, due date). A common format for brevity (which incidentally also makes the code cleaner) is to use a table to store invoice data. Let's create a separate helper method to build the invoice information in a table, which we can then use to simply add a table to the invoice in our main method:

# New imports
from ptext.pdf.canvas.layout.table import Table
from ptext.pdf.canvas.layout.paragraph import Paragraph, Alignment
from datetime import datetime
import random

def _build_invoice_information():    
    table_001 = Table(number_of_rows=5, number_of_columns=3)
	
    table_001.add(Paragraph("[Street Address]"))    
    table_001.add(Paragraph("Date", font="Helvetica-Bold", horizontal_alignment=Alignment.RIGHT))    
    now = datetime.now()    
    table_001.add(Paragraph("%d/%d/%d" % (now.day, now.month, now.year)))
	
    table_001.add(Paragraph("[City, State, ZIP Code]"))    
    table_001.add(Paragraph("Invoice #", font="Helvetica-Bold", horizontal_alignment=Alignment.RIGHT))
    table_001.add(Paragraph("%d" % random.randint(1000, 10000)))   
	
    table_001.add(Paragraph("[Phone]"))    
    table_001.add(Paragraph("Due Date", font="Helvetica-Bold", horizontal_alignment=Alignment.RIGHT))
    table_001.add(Paragraph("%d/%d/%d" % (now.day, now.month, now.year))) 
	
    table_001.add(Paragraph("[Email Address]"))    
    table_001.add(Paragraph(" "))
    table_001.add(Paragraph(" "))

    table_001.add(Paragraph("[Company Website]"))
    table_001.add(Paragraph(" "))
    table_001.add(Paragraph(" "))

    table_001.set_padding_on_all_cells(Decimal(2), Decimal(2), Decimal(2), Decimal(2))    		
    table_001.no_borders()
    return table_001

Here, we're making a simple Table with 5 rows and 3 columns. The rows correspond to the street address, city/state, phone, email address and company website. Each row holds up to 3 values (columns). Each text element is added as a Paragraph, which accepts styling arguments such as font and horizontal_alignment - we've right-aligned the label cells via Alignment.RIGHT.

Finally, we've added padding to all the cells to make sure we don't place the text awkwardly near the confines of the cells.

Now, back in our main method, we can call _build_invoice_information() to populate a table and add it to our layout:

page_layout = SingleColumnLayout(page)
page_layout.vertical_margin = page.get_page_info().get_height() * Decimal(0.02)
page_layout.add(    
    Image(        
        "https://s3.amazonaws.com/s3.stackabuse.com/media/articles/creating-an-invoice-in-python-with-ptext-1.png",        
        width=Decimal(128),        
        height=Decimal(128),    
        ))

# Invoice information table  
page_layout.add(_build_invoice_information())  
  
# Empty paragraph for spacing  
page_layout.add(Paragraph(" "))

Now, let's build this PDF document real quick to see what it looks like. For this, we'll use the PDF module:

# New import
from ptext.pdf.pdf import PDF

with open("output.pdf", "wb") as pdf_file_handle:
    PDF.dumps(pdf_file_handle, pdf)

[Image: ptext invoice 1]

Great! Now we'll want to add the billing and shipping information as well. It'll conveniently be placed in a table, just like the company information. For brevity's sake, we'll also opt to make a separate helper function to build this info, and then we can simply add it in our main method:

# New imports
from ptext.pdf.canvas.color.color import HexColor, X11Color

def _build_billing_and_shipping_information():  
    table_001 = Table(number_of_rows=6, number_of_columns=2)  
    table_001.add(  
        Paragraph(  
            "BILL TO",  
            background_color=HexColor("263238"),  
            font_color=X11Color("White"),  
        )  
    )  
    table_001.add(  
        Paragraph(  
            "SHIP TO",  
            background_color=HexColor("263238"),  
            font_color=X11Color("White"),  
        )  
    )  
    table_001.add(Paragraph("[Recipient Name]"))        # BILLING  
    table_001.add(Paragraph("[Recipient Name]"))        # SHIPPING  
    table_001.add(Paragraph("[Company Name]"))          # BILLING  
    table_001.add(Paragraph("[Company Name]"))          # SHIPPING  
    table_001.add(Paragraph("[Street Address]"))        # BILLING  
    table_001.add(Paragraph("[Street Address]"))        # SHIPPING  
    table_001.add(Paragraph("[City, State, ZIP Code]")) # BILLING  
    table_001.add(Paragraph("[City, State, ZIP Code]")) # SHIPPING  
    table_001.add(Paragraph("[Phone]"))                 # BILLING  
    table_001.add(Paragraph("[Phone]"))                 # SHIPPING  
    table_001.set_padding_on_all_cells(Decimal(2), Decimal(2), Decimal(2), Decimal(2))  
    table_001.no_borders()  
    return table_001

We've set the background_color of the initial paragraphs to #263238 (grey-blue) to match the color of the logo, and the font_color to White.

Let's call this in the main method as well:

# Invoice information table
page_layout.add(_build_invoice_information())

# Empty paragraph for spacing
page_layout.add(Paragraph(" "))

# Billing and shipping information table
page_layout.add(_build_billing_and_shipping_information())

Once we run the script again, this results in a new PDF file that contains more information:

[Image: ptext invoice 2]

With our basic information sorted out (company info and billing/shipping info) - we'll want to add an itemized description. These will be the goods/services that our supposed company offered to someone and are also typically done in a table-like fashion beneath the information we've already added.

Again, let's create a helper function that generates a table and populates it with data, which we can simply add to our layout later on:

# New import
from ptext.pdf.canvas.layout.table import Table, TableCell

def _build_itemized_description_table():
    table_001 = Table(number_of_rows=15, number_of_columns=4)  
    for h in ["DESCRIPTION", "QTY", "UNIT PRICE", "AMOUNT"]:  
        table_001.add(  
            TableCell(  
                Paragraph(h, font_color=X11Color("White")),  
                background_color=HexColor("016934"),  
            )  
        )  
  
    odd_color = HexColor("BBBBBB")  
    even_color = HexColor("FFFFFF")  
    for row_number, item in enumerate([("Product 1", 2, 50), ("Product 2", 4, 60), ("Labor", 14, 60)]):  
        c = even_color if row_number % 2 == 0 else odd_color  
        table_001.add(TableCell(Paragraph(item[0]), background_color=c))  
        table_001.add(TableCell(Paragraph(str(item[1])), background_color=c))  
        table_001.add(TableCell(Paragraph("$ " + str(item[2])), background_color=c))  
        table_001.add(TableCell(Paragraph("$ " + str(item[1] * item[2])), background_color=c))  
	  
    # Optionally add some empty rows to have a fixed number of rows for styling purposes
    for row_number in range(3, 10):  
        c = even_color if row_number % 2 == 0 else odd_color  
        for _ in range(0, 4):  
            table_001.add(TableCell(Paragraph(" "), background_color=c))  
  
    table_001.add(TableCell(Paragraph("Subtotal", font="Helvetica-Bold", horizontal_alignment=Alignment.RIGHT,), col_span=3,))  
    table_001.add(TableCell(Paragraph("$ 1,180.00", horizontal_alignment=Alignment.RIGHT)))  
    table_001.add(TableCell(Paragraph("Discounts", font="Helvetica-Bold", horizontal_alignment=Alignment.RIGHT,),col_span=3,))  
    table_001.add(TableCell(Paragraph("$ 177.00", horizontal_alignment=Alignment.RIGHT)))  
    table_001.add(TableCell(Paragraph("Taxes", font="Helvetica-Bold", horizontal_alignment=Alignment.RIGHT), col_span=3,))  
    table_001.add(TableCell(Paragraph("$ 100.30", horizontal_alignment=Alignment.RIGHT)))  
    table_001.add(TableCell(Paragraph("Total", font="Helvetica-Bold", horizontal_alignment=Alignment.RIGHT  ), col_span=3,))  
    table_001.add(TableCell(Paragraph("$ 1163.30", horizontal_alignment=Alignment.RIGHT)))  
    table_001.set_padding_on_all_cells(Decimal(2), Decimal(2), Decimal(2), Decimal(2))  
    table_001.no_borders()  
    return table_001

In practice, you'd substitute the hard-coded strings related to the subtotal, taxes and total prices with calculations of the actual prices - though, this heavily depends on the underlying implementation of your Product models, so we've added a stand-in for abstraction. Once we add this table to the document as well - we can rebuild it and take a look.
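As a rough illustration (not from the original article), here's one way you might derive those totals from the same (description, quantity, unit price) tuples used above. The 15% discount and 10% tax rates are assumptions purely for the example:

# Hypothetical helper: compute the totals instead of hard-coding them.
# The discount and tax rates below are assumptions for illustration only.
def _compute_totals(items, discount_rate=0.15, tax_rate=0.10):
    subtotal = sum(quantity * unit_price for _, quantity, unit_price in items)
    discounts = round(subtotal * discount_rate, 2)
    taxes = round((subtotal - discounts) * tax_rate, 2)
    total = subtotal - discounts + taxes
    return subtotal, discounts, taxes, total

items = [("Product 1", 2, 50), ("Product 2", 4, 60), ("Labor", 14, 60)]
print(_compute_totals(items))  # (1180, 177.0, 100.3, 1103.3)

You could then format these values into the strings used in the last four rows of the table instead of typing them by hand.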

The entire main method should now look something along the lines of:

# Create document
pdf = Document()

# Add page
page = Page()
pdf.append_page(page)

page_layout = SingleColumnLayout(page)
page_layout.vertical_margin = page.get_page_info().get_height() * Decimal(0.02)

page_layout.add(
        Image(
        "https://s3.amazonaws.com/s3.stackabuse.com/media/articles/creating-an-invoice-in-python-with-ptext-1.png",
        width=Decimal(128),
        height=Decimal(128),
        ))


# Invoice information table
page_layout.add(_build_invoice_information())

# Empty paragraph for spacing
page_layout.add(Paragraph(" "))

# Billing and shipping information table
page_layout.add(_build_billing_and_shipping_information())

# Itemized description
page_layout.add(_build_itemized_description_table())

with open("output2.pdf", "wb") as pdf_file_handle:
    PDF.dumps(pdf_file_handle, pdf)

Running this piece of code results in:

[Image: ptext invoice 3]

Creating an Outline

Our PDF is done and ready to be served - though, we can take it up a notch with two little additions. First, we can add an Outline, which helps readers like Adobe navigate and generate a menu for your PDFs:

# New import
from ptext.pdf.page.page import DestinationType

# Outline  
pdf.add_outline("Your Invoice", 0, DestinationType.FIT, page_nr=0)

The add_outline() function accepts a few arguments - here, the title of the outline entry ("Your Invoice"), its level in the outline tree (0), the destination type, and the page number the entry should link to.

Destinations can be thought of as targets for hyperlinks. You can link to an entire page (which is what we are doing in this example), but you can also link to specific parts of a page (for instance - exactly at y-coordinate 350).

Furthermore, you need to specify how the reader should present that page - for instance, do you want to simply scroll to that page and not zoom? Do you want to display only a target area, with the reader completely zoomed into that particular area?

In this line of code, we are asking the reader to display page 0 (the first page) and ensure it fits the reader window (zooming in/out if needed).

Once you've added the outline, you should see it appear in the reader of your choice:

[Image: ptext invoice 4]

With multiple pages - you can create a more complex outline and link to them via add_outline() for easier navigation.
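For instance, here's a hedged sketch of what a nested outline for a multi-page document could look like, assuming the same add_outline(title, level, destination_type, page_nr) signature used above - the titles and page numbers are made up for illustration:

# Hypothetical nested outline: one parent entry (level 0) and one child entry
# (level 1) per page; assumes the same call signature as the example above.
pdf.add_outline("Invoices", 0, DestinationType.FIT, page_nr=0)
pdf.add_outline("Invoice #1001", 1, DestinationType.FIT, page_nr=0)
pdf.add_outline("Invoice #1002", 1, DestinationType.FIT, page_nr=1)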

Embedding JSON Documents in PDF Invoices

Since PDFs aren't very computer-friendly (in terms of reading and unambiguously decoding their contents), we might sometimes want to embed more computer-friendly formats as well, in case someone would like to process invoices automatically.

A Germany-originating invoice standard called ZUGFeRD (later adopted by the EU) enables us to make PDF invoices with more computer-legible file formats such as XML - which describes the invoice and is easily parsable. In addition to these, you can also embed other documents related to your invoice such as terms and agreements, a refund policy, etc.

To embed any sort of additional file in a PDF file, using pText - we can use the append_embedded_file() function.

Let's first go ahead and create a dictionary to store our invoice data in JSON, which we'll then save into an invoice_json file:

import json

# Creating a JSON file
invoice_json = {  
"items": [  
    {  
        "Description": "Product1",  
        "Quantity": 2,  
        "Unit Price": 50,  
        "Amount": 100,  
    },  
    {  
        "Description": "Product2",  
        "Quantity": 4,  
        "Unit Price": 60,  
        "Amount": 100,  
    },  
    {  
        "Description": "Labor",  
        "Quantity": 14,  
        "Unit Price": 60,  
        "Amount": 100,  
    },  
],  
"Subtotal": 1180,  
"Discounts": 177,  
"Taxes": 100.30,  
"Total": 1163.30,  
}  
invoice_json_bytes = bytes(json.dumps(invoice_json, indent=4), encoding="latin1")

Now, we can simply embed this file into our PDF invoice:

pdf.append_embedded_file("invoice.json", invoice_json_bytes, apply_compression=False)

Once we run the script again and store the document, we get:

[Image: ptext invoice 5]

Conclusion

In this guide, we've taken a look at how to create an invoice in Python using pText. We've then added an outline to the PDF file for ease of navigation and taken a look at how to add attachments/embedded files for programmatic access to the contents of the PDF.

May 06, 2021 01:37 PM UTC


Python for Beginners

Python String Methods for String Manipulation

String manipulation is an essential skill when you are analyzing text data. Python has many built-in methods for string manipulation. In this article, we will study the most frequently used Python string methods for manipulating strings.

Python String Method to Capitalize a word

To capitalize the first letter of a string in Python, we use the capitalize() method. capitalize() returns a new string in which the first letter of the string is capitalized. No change is made to the original string.

Example:

myString="python"
print("Original String:")
print(myString)
newString=myString.capitalize()
print("New modified string:")
print(newString)
print("original string after modification:")
print(myString)

Output:

Original String:
python
New modified string:
Python
original string after modification:
python

In the output, we can see that the first letter of the new string has been modified, but no changes have been made to the string on which the method was invoked.

Python String Method to capitalize first character of each word

To convert the first character of each word into a capital letter, we can use the title() method. When invoked on a string, it capitalizes the first character of each word of the input string and returns a new string with the result. It doesn't affect the original string.

Example:

myString="Python is a great language" 
newString=myString.title()
print("Original string is:")
print(myString)
print("Output is:")
print(newString)

Output:

Original string is:
Python is a great language
Output is:
Python Is A Great Language

How to convert string into lowercase in Python?

The casefold() method, when invoked on a Python string, returns a new string with every letter of the original string turned into lowercase. It doesn't change the original string. This string method can be used for preprocessing text that contains an irregular mix of capital and small letters.

Example:

myString="PytHon"
print("Original String:")
print(myString)
newString=myString.casefold()
print("New modified string:")
print(newString)
print("original string after modification:")
print(myString)

Output:

Original String:
PytHon
New modified string:
python
original string after modification:
PytHon

Another method to convert strings into lowercase is the lower() method. It also converts the letters in the text string into lowercase and returns a new string.

Example:

myString="PytHon"
print("Original String:")
print(myString)
newString=myString.lower()
print("New modified string:")
print(newString)
print("original string after modification:")
print(myString)

Output:

Original String:
PytHon
New modified string:
python
original string after modification:
PytHon

How to convert Strings into uppercase in Python?

We can use the upper() method to convert an input string into uppercase. When the upper() method is invoked on any string, it returns a new string with all the letters capitalized. It doesn't change the original string.

Example:

myString="PytHon"
print("Original String:")
print(myString)
newString=myString.upper()
print("New modified string:")
print(newString)
print("original string after modification:")
print(myString)

Output:

Original String:
PytHon
New modified string:
PYTHON
original string after modification:
PytHon

There is another method named swapcase() which swaps the case of every letter in the input string and returns a new string. It doesn’t make any change to the input string on which it is invoked.

Example:

myString="PytHon"
print("Original String:")
print(myString)
newString=myString.swapcase()
print("New modified string:")
print(newString)
print("original string after modification:")
print(myString)

Output:

Original String:
PytHon
New modified string:
pYThON
original string after modification:
PytHon

How to Split Strings in python?

To split strings in Python, we use the split() method. The split() method takes an optional separator, splits the input string at the points where the separator is present, and returns a list containing the split parts of the string.

Example:

myString="I am A Python String"
print("Original String:")
print(myString)
newList=myString.split()
print("New List:")
print(newList)
print("when 'A' is declared as separator:")
aList=myString.split("A")
print(aList)
print("original string after modification:")
print(myString)

Output:

Original String:
I am A Python String
New List:
['I', 'am', 'A', 'Python', 'String']
when 'A' is declared as separator:
['I am ', ' Python String']
original string after modification:
I am A Python String

If we want to split a string only a certain number of times, we can use the rsplit() method. rsplit() takes an extra argument named maxsplit, which is the maximum number of times the string will be split. The input string is split at up to maxsplit places, starting from the right side of the string, and rsplit() returns a list with at most maxsplit+1 fragments. If no value is passed to the maxsplit argument, rsplit() works the same as split().

Example:

myString="I am A Python String"
print("Original String:")
print(myString)
newList=myString.rsplit()
print("New List without maxsplit:")
print(newList)
print("when maxsplit is set at 2:")
aList=myString.rsplit(maxsplit=2)
print(aList)
print("original string after modification:")
print(myString)

Output:

Original String:
I am A Python String
New List without maxsplit:
['I', 'am', 'A', 'Python', 'String']
when maxsplit is set at 2:
['I am A', 'Python', 'String']
original string after modification:
I am A Python String

How to Concatenate Strings in Python?

Now that we have seen how to split strings, we may also need to perform string concatenation in Python. We can concatenate two strings using the "+" operator as well as the join() method.

While using the "+" operator, we just add the different strings together and assign the result to a new variable. We can concatenate any number of strings in a single statement by placing the "+" operator between them.

Example:

myString1="I am a "
print ("first string is:")
print(myString1)
myString2="Python String"
print("Second String is:")
print(myString2)
myString=myString1+myString2
print("Conactenated string is:")
print(myString)

Output:

myString1="I am a "
print ("first string is:")
print(myString1)
myString2="Python String"
print("Second String is:")
print(myString2)
myString=myString1+myString2
print("Conactenated string is:")
print(myString)

We can also concatenate strings using the join() method in Python. The join() method is invoked on a string which acts as the separator, and a list or any other iterable of strings is passed to it for joining. It returns a new string containing the words in the iterable, separated by the separator string.

Example:

myStringList=["I","am","a","python","string"]
print ("list of string is:")
print(myStringList)
separator=" "#space is used as separator
myString=separator.join(myStringList)
print("Concatenated string is:")
print(myString)

Output:

list of string is:
['I', 'am', 'a', 'python', 'string']
Concatenated string is:
I am a python string

How to trim Strings in Python?

There are chances that strings contain extra spaces at the start or end. We can remove those spaces using the Python string methods strip(), lstrip() and rstrip().

lstrip() method removes spaces from start of the input string and returns a new string.

rstrip() removes spaces from the end of the string and returns a new string.

strip() method removes spaces from both the start and end of the input string and returns a new string.

Example:

myString="          Python          " 
lstring=myString.lstrip()
rstring=myString.rstrip()
string =myString.strip()
print("Left Stripped string is:",end="")
print(lstring)
print("Right Stripped string is:",end="")
print(rstring)
print("Totally Stripped string is:",end="")
print(string)

Output:

Left Stripped string is:Python          
Right Stripped string is:          Python
Totally Stripped string is:Python

Python String Methods to split a string at newlines

By using splitlines() method in python, we can convert a string into a list of sentences. The function splits the input string at linebreaks or newlines and returns a new list containing all the fragments of the input string.

Example:

myString="Python is a great language.\n I love python" 
slist=myString.splitlines()
print("Original string is:")
print(myString)
print("Output is:")
print(slist)

Output:

Original string is:
Python is a great language.
I love python
Output is:
['Python is a great language.', ' I love python']

Conclusion

In this article, we have seen python string methods to manipulate the string data in python. We have seen how to split, strip, and concatenate the strings using different methods. We have also seen how to change case of letters in the strings. Stay tuned for more informative articles.

The post Python String Methods for String Manipulation appeared first on PythonForBeginners.com.

May 06, 2021 11:56 AM UTC


PyCharm

Webinar overview “Building Your First Python Slackbot”

Last week Nafiul Islam invited an expert guest, Mason Egger from DigitalOcean, to do an in-depth coding session on Slackbot creation. Mason walked Nafiul through all the small details that are sometimes not so obvious to those who are dealing with Slackbots for the first time.

Along the way, Mason covered a lot of related topics, such as how Slackbots work, creating virtual environments and why you might want to exclude them from your project, what Droplets are and what you can use instead of them, and even how Discord bot creation is different from Slackbot creation.

WATCH THE WEBINAR

Watch the webinar to enjoy the engaging discussion between Mason and Nafiul and to learn how to create a Slackbot of your own!

Written versions of material covered in this live coding session are also available:

Other materials you might be interested in:

May 06, 2021 10:32 AM UTC


Codementor

Knowledge Graph — A Powerful Data Science Technique to Mine Information from Text (with Python code)

Learn to extract valuable information from unstructured text data using Knowledge Graphs.

May 06, 2021 09:51 AM UTC


Python Pool

Simplifying If Not Logical Operator in Python

If not in Python combines the if statement with the not logical operator inside conditional statements. Operators help in performing the comparisons and actions that are needed to produce a result.

What is an If Not Python Operation?

The Python if not construct is a combination of the 'if' and 'not' keywords. The 'if' keyword lets you check whether a condition is True or False and then act accordingly, whereas the 'not' keyword negates the boolean value of the expression in front of it. Moreover, the 'if not' combination often allows you to avoid a separate 'else' branch.

In this article, we will be discussing the if not operator. It can operate on booleans, strings, lists, dictionaries, sets, and tuples. The operator negates the value it is applied to: when we use "if not" on a value that evaluates to True, the condition becomes False, and vice versa. In other words, if the value is falsy, the negated value will be True, and if the value is truthy, the negated value will be False.

This holds for the boolean type directly. For a string expression, the condition is True only when the string is empty. The same applies to lists: only if the list is empty does the if not branch run.

Some of the operators are:

How do you Write if not Condition in Python?

The following syntax refers to how the interpreter expects to parse if not condition in your program:

if<space>not<space>condition:
<indentation>statement

# Example:

if not x==5:
    print("x is not 5")

Source: Python Official Documentation

Syntax

if not value:
    statement(s)

Parameter

Not applicable.

Examples of If Not Statement in Python

1. If Not Python on Boolean

Boolean values correspond to 0 and 1: zero is treated as False and 1 as True in program code.

SYNTAX

CHECK = False

if not CHECK:
       print('false.')
 
CHECK = 5

if not CHECK==5:
       print('CHECK is not 5')
else:
       print('CHECK is 5')

OUTPUT

false
CHECK is 5

2. If Not Python on String

A string is a sequence of zero or more characters. We store strings in double-quotes.

SYNTAX

string = ""
 
if not string:
    print('String is empty.')
else:
    print(string)

OUTPUT

String is empty

3. If Not on a List

A list can store multiple items in a single variable. We place the list of elements in square brackets. The list is very versatile in Python as it can hold data of varied types.

SYNTAX

check= []
 
if not check:
    print('List is empty.')
else:
    print(check)

OUTPUT

List is empty

4. If Not Python on Dictionary

A dictionary is a collection of unordered, changeable data. The dictionary uses keys to look up the data. We can use the logical operator to check if the dictionary is empty. Dictionary entries consist of keys and values. We write them in braces.

SYNTAX

val = dict({})

if not val:
    print('Dictionary is empty.')
else:
    print(val)

OUTPUT

Dictionary is empty.

5. If Not on Set

A set is different from a dictionary. It is a collection of unordered data. Sets do not contain any index values, and we also write them in curly braces. We cannot access an element using a key or index value; the values in a set are usually used along with a loop. The set allows us to add and update values.

SYNTAX

val = set({})

if not val:
    print('Set is empty.')
else:
    print(val)

OUTPUT

Set is empty.

6. If Not Python on Tuple

Tuples are similar to lists. They are a finite sequence of values. The sequence is ordered and immutable.

SYNTAX

val = tuple()

if not val:
    print('Tuple is empty.')
else:
    print(val)

OUTPUT

Tuple is empty.

FAQs

Is != Valid in Python?

Yes. != is an expression for ‘not equals’ in layman terms. Moreover, you can use the ‘not’ keyword to keep your code clean and remove any existing ‘!=’.
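For instance, a small sketch of how both spellings express the same check:

x = 3

if not x == 5:   # negating the equality check with the not keyword
    print("x is not 5")

if x != 5:       # the same check written with the != operator
    print("x is not 5")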

Is not and != in Python same?

Technically, no. != checks whether two objects have different values, whereas the is not keyword compares the identities (memory locations) of two objects.

Is not equal to Python?

is not is a well-known keyword that is sometimes used in place of != in Python. It compares the memory locations of two objects and returns True if they are different objects. Beware: is not checks identity while != checks value. Choose wisely!
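A short sketch illustrating the difference between value equality and identity:

a = [1, 2, 3]
b = [1, 2, 3]

print(a != b)      # False - the two lists hold the same values
print(a is not b)  # True  - they are two distinct objects in memory
print(a is not a)  # False - a is the very same object as itself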

See Also

Conclusion

The if-not clearly shows the versatile nature of Python as a language. Without this logical operator, it would become difficult to check if the list, string, or tuple is empty.

The post Simplifying If Not Logical Operator in Python appeared first on Python Pool.

May 06, 2021 08:23 AM UTC

[Solved] No Module Named Tensorflow Error

Python is known for its versatile syntax and English-like keywords. With thousands of modules, you can do data visualization and data processing and even deploy machine learning models. There are many well-known machine learning libraries that help you, namely Keras, Sklearn, TensorFlow, and PyTorch. However, while using TensorFlow, you can encounter a No Module Named Tensorflow error while running your first program.

No Module Named Tensorflow Error is a known error that arises when the Python Environment is unable to fetch TensorFlow files in site-packages. There are two main reasons for this error to appear, either you have not installed the TensorFlow external module or you are working on a different python environment that doesn’t have Tensorflow. There are several easy ways to fix this error which are mentioned later in this post. Let’s understand the root cause of the error before jumping on the solution.

What is No Module Named Tensorflow Error?

[Image: No Module Named Tensorflow Error]

When a module is absent from the external site-library of the environment, the Python interpreter throws ModuleNotFoundError No Module Named Tensorflow. This error arises most of the time on low-end devices because TensorFlow requires proper setup of the c++ path and other requirements. Most of the time, this error is solved by using the pip install method. But if you have multiple Python versions installed, then you’ll definitely face this error.
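As a small illustration (a minimal sketch, not from the original post), this is roughly what the failure looks like and how you could catch it for a friendlier message:

try:
    import tensorflow as tf
except ModuleNotFoundError as err:
    # The message reads roughly like: No module named 'tensorflow'
    print(f"TensorFlow is not available in this environment: {err}")
else:
    print("TensorFlow", tf.__version__, "imported successfully")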

Why do I get No Module Named Tensorflow Error?

There is no other reason for the No Module Named Tensorflow error than missing module files. The main problem arises when you're using multiple Python versions and their virtual environments. Keep in mind that Anaconda, PyCharm, Jupyter, and Spyder have their own virtual environments, and it's tricky to install modules into those environments.

Causes of No Module Named Tensorflow Error

There are some known causes of this ModuleNotFoundError. As Python lets you handle these errors easily, you can debug them quickly. Following are the cause of the No Module Tensorflow error –

Module Not Installed

If you haven’t installed TensorFlow yet and tried importing TensorFlow in code, then it’ll throw this error. Modules are managed by ‘pip’ which is a package management system for your python. Most of the time, the users forget to tick Add Python to the PATH option while installing it. This creates problems in managing the modules.

Supporting Module Not Installed

Tensorflow has many other supporting modules like numpy, scipy, jupyter, matplotlib, pillow, and scikit-learn. If any of these modules is absent, then it’ll throw an error. Make sure that these modules exist in your library.

Moreover, there are other supporting TensorFlow modules like, tensorflow-addons, tensorflow.contrib which might be absent from your library leading to this error.

Tip: To check which libraries are installed in your environment, enter pip list on your console.

Working on Different Virtual Environment

Virtual Environment: Method by which you isolate your working Python environment from the globally installed Python. This environment has its own installation directories and doesn’t share the libraries from globally installed Python.

Many of the code editors in Windows come with their own virtual environment. Each of these environments acts independently to global python installation and is started with blank external modules. Many times, you install a module but it’s installed on global python, not the python from your virtual environment. This can lead to ModuleNotFoundError No Module named Tensorflow in the code execution.
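To see which interpreter your code is actually running under, and whether that interpreter can find TensorFlow, here is a small diagnostic sketch:

import importlib.util
import sys

print("Running under:", sys.executable)  # path of the interpreter executing this code

spec = importlib.util.find_spec("tensorflow")
print("TensorFlow found" if spec else "TensorFlow NOT found in this environment")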

Code editors namely, Anaconda, Jupyter, and Spyder have their own virtual environment. If you have a similar case, head over to the corresponding solution for each code editor.

Solutions for No Module Named Tensorflow

Following are the solutions for this error in each code editor and OS –

Windows

[Image: No Module Tensorflow Windows]

In Windows, path-related issues harass programmers all the time. As you have limited functionality in the terminal, you constantly face the ModuleNotFoundError error. To install TensorFlow on Windows, make sure you follow these steps -

  1. Uninstall existing python versions to avoid any conflicts.
  2. Go to Python.org and download the Python setup. Make sure you download the 64-bit version. Unfortunately, TensorFlow is not supported on 32-bit systems.
  3. Double click on the installer and select the “Add Python x.x to your PATH” option. This will ensure, that python executes from the path.
  4. After installing, open the command terminal or PowerShell and enter the command pip install tensorflow in it.
  5. Wait for it to finish the installation and run your python file by command python file.py

Linux

In Linux, it’s relatively easier to install TensorFlow. First of all, check if you are working on a virtual environment by a command which python. This command will return the path of python which you’re going to execute. If you are in a virtual environment, either leave the environment directory or enter the command deactivate to deactivate the virtual environment.

Follow these steps to install Tensorflow in Linux –

sudo apt-get remove python python3
sudo apt-get install python3
sudo pip3 install tensorflow

Mac

In Mac, No Module named Tensorflow is a persistent error because of environment errors. If you are working on your virtual environment, you need to deactivate and activate it again. Then use the pip list command to check if the TensorFlow module exists in your library. If not, then use the pip3 install tensorflow to install TensorFlow.

Anaconda

[Image: No Module Named Tensorflow in Anaconda]

If you’re using Anaconda and you face no module named Tensorflow error, then you probably haven’t installed TensorFlow in the conda environment. As anaconda has a different environment than your default python environment, you need to install TensorFlow in it. To do it follow these steps –

  1. Open Anaconda Prompt on your computer.
  2. Enter the command conda install tensorflow in it.
  3. Wait for the installation to complete and restart the conda shell and run your program.

Jupyter

If you’ve installed Juptyter Notebook from Anaconda, it’ll use a conda environment. By default, the libraries in this environment need to be installed via command. To do the same, open your Conda Prompt and enter the command conda install tensorflow. This will ensure that your Juptyer Notebook has TensorFlow in it.

If your Jupyter is not installed via Anaconda, then use the pip install tensorflow to install the TensorFlow module. This will resolve the error ModuleNotFoundError No module named Tensorflow instantly.

Spyder

Spyder is installed via Anaconda which operates an Anaconda environment. Simply use the command conda install tensorflow in your Anaconda Prompt to install TensorFlow.

PyCharm

PyCharm is a special application that operates in its own virtual environment. Due to the unavailability of the TensorFlow module in that environment, you can face the no module found error. Follow these steps to install TensorFlow -

  1. Press Settings and select the Project Interpreter tab under projects.
  2. Now enter Tensorflow in the box and install it.
  3. After installing the package, your error will be resolved.

Supporting ModuleNotFoundError for Tensorflow

Tensorflow has many addons which come in handy to avoid writing long code. These addons and contributions are added separately in other packages and combine with the original TensorFlow module. Following are the examples of ModuleNotFoundError –

No module named ‘tensorflow.contrib’

Unfortunately, the contrib module in TensorFlow is not included in version 2.0. If you still want to use the contrib module, you’ll have to install the previous version of TensorFlow. Follow these steps –

pip uninstall tensorflow
pip install tensorflow==1.13.2

No module named ‘tensorflow_addons’

Use pip install tensorflow-addons to install the addons for TensorFlow.

No Module Named Tensorflow Still Not Resolved?

If you’ve tried all the methods and were still not able to solve the issue then, there might be some hardware limitations. Tensorflow requires Python 3.5-3.7, 64-bit system, and pip>=19.0. If you’re unable to fulfill these hardware + software requirements, then don’t worry, we still have a solution for you!

Google released a free product named 'Colab' in 2018. Colab allows you to run and test machine learning models online. To explain more, it's a replica of the Jupyter notebook with all modules installed. To use TensorFlow in Google Colab, follow these steps -

  1. Sign in to your Google account.
  2. Open Colab in your browser.
  3. Create a new Jupyter Notebook. (This notebook will be saved in your google drive).
  4. Type your code and run it.
[Image: Using Tensorflow in Colab]

Colab has many libraries like TensorFlow, numpy, pandas, etc pre-installed in its shell. Make sure you make good use of it.

Tip: Do not use Colab to store/process peer-to-peer files. This may result in a ban!

See Also

Conclusion

TensorFlow has a flexible architecture. The easy deployment of the code makes it special in nature. However, we have to be very careful before using it. Any small syntax error can result in importing the library incorrectly.

The post [Solved] No Module Named Tensorflow Error appeared first on Python Pool.

May 06, 2021 08:15 AM UTC


Django Weblog

Django security releases issued: 3.2.2, 3.1.10, and 2.2.22

In accordance with our security release policy, the Django team is issuing Django 3.2.2, Django 3.1.10, and Django 2.2.22. These releases address the security issue with severity "moderate" detailed below. We encourage all users of Django to upgrade as soon as possible.

CVE-2021-32052: Header injection possibility since URLValidator accepted newlines in input on Python 3.9.5+

On Python 3.9.5+, URLValidator didn't prohibit newlines and tabs. If you used such values in an HTTP response, you could suffer from header injection attacks. Django itself wasn't vulnerable because HttpResponse prohibits newlines in HTTP headers.

Moreover, the URLField form field, which uses URLValidator, silently removes newlines and tabs on Python 3.9.5+, so the possibility of newlines entering your data only existed if you were using this validator outside of form fields.

This issue was introduced by the bpo-43882 fix.
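For illustration, here is a minimal, hedged sketch of how URLValidator behaves once the patch is applied (it assumes Django is installed; on unpatched versions running Python 3.9.5+ the same value would pass silently):

from django.core.exceptions import ValidationError
from django.core.validators import URLValidator

validate_url = URLValidator()

try:
    validate_url("https://example.com/\n")  # a value with a trailing newline
except ValidationError:
    print("Rejected: URLs containing newlines are not valid")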

Affected supported versions

  • Django main branch
  • Django 3.2
  • Django 3.1
  • Django 2.2

Resolution

Patches to resolve the issue have been applied to Django's main branch and to the 3.2, 3.1, and 2.2 release branches. The patches may be obtained from the following changesets:

The following releases have been issued:

  • Django 3.2.2
  • Django 3.1.10
  • Django 2.2.22

The PGP key ID used for this release is Mariusz Felisiak: 2EF56372BA48CD1B.

General notes regarding security reporting

As always, we ask that potential security issues be reported via private email to security@djangoproject.com, and not via Django's Trac instance or the django-developers list. Please see our security policies for further information.

May 06, 2021 06:17 AM UTC

May 05, 2021


PyCon

A message to the Python community from our new Visionary Sponsor: Microsoft!

Microsoft has been a Keystone PyCon sponsor for the past 3 years and is now proud to become a Visionary Sponsor for this year’s edition. They’re excited to share what’s new with Python at Microsoft, as well as their plans for the virtual PyCon experience this year. Check below to see what they have to say!

We're thrilled to be once more sponsoring PyCon, the event the Python team at Microsoft most looks forward to every year. Although we're sad we still can't see each other in person, with this year's virtual edition we can connect with folks from all around the world, and that is truly amazing.

We’ve been working on many exciting new things over the past year, building and improving tools that can support you for all needs and in all phases of your Python development. We’re listing here a few of the things we're delighted to share out with the Python community, but we hope to talk more about them with you all during PyCon.

With the goal of empowering Python development across the globe, we have CPython core developers putting a lot of effort into improving the language, and we have also been contributing to open-source packages such as PyTorch, pandas, Dask, Jupyter, nteract, scikit-learn and more. But to take that goal to the next level, Microsoft has recently become a Visionary Sponsor  of the Python Software Foundation. We believe this is a key step towards advancing the development of the language and its ecosystem, and our teams couldn’t be prouder of it.

But, we have also been working hard to improve our tools and services to help Python developers to achieve more. On the code development side of things, we made the Python editing experience in Visual Studio Code more performant and user-friendly via our new language server, Pylance.  Pylance uses our open-source static type checker, Pyright, under the hood to provide auto-completions, docstrings, type hints, code navigation, refactoring, and much more. The Python extension has also been improved with many features added such as Poetry support, Tensorboard integration, and a feature-rich data viewer that you can access when you’re debugging or working with Jupyter Notebooks. 

Speaking of notebooks, we broke out the notebook support that used to live in the Python extension into the Jupyter extension, which now offers an optimal experience for editing notebooks and comes with features such as auto-completions, syntax checking, a plot viewer, HTML or PDF export and more for the Python language! Still in the VS Code world, we have also released an enhanced developer experience that makes it easier to work directly on your Azure Machine Learning (ML) compute instances from within VS Code, through an integration in the Azure ML Studio.

On the deployment side, you can use VS Code alongside Azure and GitHub Actions for your Python applications’ CI/CD processes, allowing you to easily build, test and deploy your apps to the cloud. And with our new preview of Azure Static Web Apps, you can start using modular and extensible patterns to deploy apps in minutes, while taking advantage of the built-in scaling and cost-savings offered by serverless technologies. That means you can use Azure Functions to create serverless APIs using Flask, Django or FastAPI for the backend of your Static Web Apps.  

While we're on the topic of cost savings and scalability, improvements have also been made on the data storage and management side of things. There's a new single-node offering for Hyperscale (Citus), a built-in deployment option in our fully managed Azure Database for PostgreSQL. With this option, you can build faster, cost-effective, horizontally scalable PostgreSQL solutions, thanks to the open-source Citus 10 release.

Finally, we have also just released version 1.0 of msticpy, an open-source, extensible Python toolkit built by Microsoft's Threat Intelligence Center designed to be used in Jupyter notebooks for Cyber security investigations and threat hunting. Features include data query providers, visualization, data enrichment (such as threat Intelligence matching, IP geolocation) and data analysis for anomaly detection. 

These are a few of the things we'll be covering at PyCon this year, but what we're looking forward to the most is connecting with you all. We want to hear what you think we can do to keep supporting the Python community! If you're attending PyCon, make sure to join us at our workshop and live demos, and stop by our virtual booth on Hubilo. You can also join us on our Microsoft Discord Server to participate in our virtual labs - you may even get some (physical and virtual) swag! Lastly, battle it out with your peers and colleagues in the PyCon 2021 Cloud Skills Challenge: https://aka.ms/Pycon/CSC. If you finish all learning paths and modules during the Cloud Skills Challenge, you will be entered to win a new pair of Surface Headphones!




May 05, 2021 10:08 PM UTC


Python for Beginners

Exception Handling in Python: Writing a Robust Python Program

While writing programs in Python, there may be situations where the program enters an undesirable state, called an exception, and exits execution. This may cause loss of the work done or even a memory leak. In this article, we will see how to handle those exceptions so that the program can continue executing normally, using exception handling in Python. We will also see the different ways to implement exception handling in Python.

What are exceptions in Python?

An exception is an undesirable event or error in a program which disrupts the flow of execution of its statements and stops the execution of the program if it is not handled by the program itself. There are some predefined exceptions in Python. We can also declare user defined exceptions by creating classes which inherit the Exception class and then raising them with the raise keyword during the execution of the program.

In the following program, we create a dictionary and try to access the values using the keys in the python dictionary. Here, an error occurs in the second print statement because "c" is not present in the dictionary as a key. The program stops execution when the error is encountered.

#create a dictionary
myDict={"a":1,"b":2}
#this will print the value
print(myDict["a"])
#this will generate a KeyError exception and program will exit.
print(myDict["c"])

Output:

1
Traceback (most recent call last):
  File "<string>", line 6, in <module>
KeyError: 'c'

In the above code, we can see that after printing 1, the program exits at the second statement and notifies us that a KeyError has occurred. We can handle this error and generate custom output using Python try and except blocks.

How to handle exceptions in python?

To handle the exceptions that can be generated by the program, we use Python try, except and finally blocks in our code.

We write the code which has to be executed in the try block. In the except block, we write the code to handle the exceptions generated by the try block. In the finally block, we put those parts of the code which have to be executed at the end. Whether or not an exception is generated, the finally block will always be executed after the try and except blocks.

In the following code, we implement the program used in the previous example using try and except blocks so that the program terminates normally when the error is encountered.

try:
    #create a dictionary
    myDict={"a":1,"b":2}
    #this will print the value
    print(myDict["a"])
    #this will generate a KeyError exception and program will exit from try  block
    print(myDict["c"])
except:
    print("Error Occurred in Program. Terminating.")

Output

1
Error Occurred in Program. Terminating.

In the above example, we can see that after executing the first print statement in the try block, the program does not terminate abruptly. After encountering the error, it executes the statements in the except block and then terminates. Here we have to keep in mind that statements in the try block after the point at which the exception occurred will not get executed.

How to handle specific exceptions in python?

To handle each exception differently, we can provide arguments to the except blocks. When an exception is generated which is of the same type as the argument, the code in that specific block will be executed.

In the following code, we will handle the KeyError exception specifically, and the rest of the exceptions will be handled by a generic except block.


try:
    #create a dictionary
    myDict={"a":1,"b":2}
    #this will print the value
    print(myDict["a"])
    #this will generate a NameError exception and program will exit from try block
    print(a)
except(KeyError):
    print("Key is not present in the dictionary. proceeding ahead")
except: 
    print("Error occured. proceeding ahead")
try:
    #create a dictionary
    myDict={"a":1,"b":2}
    #this will print the value
    print(myDict["a"])
    #this will generate a KeyError exception and program will exit from try block
    print(myDict["c"])
except(KeyError):
    print("Key is not present in the dictionary. Terminating the program")
except: 
    print("Error occured. Terminating")

Output:

1
Error occured. proceeding ahead
1
Key is not present in the dictionary. Terminating the program

In the above program, we can see that KeyError has been handled specifically by passing it to an except block as a parameter, while other exceptions are handled by the generic except block.

When to use Finally block in exception handling in Python? 

The finally block is used when some statements in the program need to be executed whether or not an exception is generated. In programs where file handling is done or network connections are used, it is necessary for the program to terminate the connections or close the files before exiting. We put the finally block after the try and except blocks.

try:
    #create a dictionary
    myDict={"a":1,"b":2}
    #this will print the value
    print(myDict["a"])
    #this will generate a KeyError exception and program will exit from try block
    print(myDict["c"])
except(KeyError):
    print("Key is not present in the dictionary. proceeding ahead")
finally:
    print("This is the compulsory part kept in finally block and will always be executed.")

Output:

1
Key is not present in the dictionary. proceeding ahead
This is the compulsory part kept in finally block and will always be executed.

In the above code, we can see that the try block has raised an exception, the exception has been handled by the except block and then the finally block is executed at last.

When to use else block in exception handling in Python?

We can also use an else block with the Python try except block when we need to execute certain statements after the successful execution of the statements in the try block. The else block is written after the try and except blocks. Here, we have to keep in mind that errors/exceptions generated in the else block are not handled by the except block.


try:
    #create a dictionary
    myDict={"a":1,"b":2}
    #this will print the value
    print(myDict["a"])
except:
    print("I am in except block and will get executed when an exception occurs in try block")
else:
    print("I am in else block and will get executed every time after try block is executed successfully.")

Output:

1
I am in else block and will get executed every time after try block is executed successfully.

In the above program, we can see that the code in the else block executed because the try block executed successfully. If the try block raises an exception, only the except block will be executed and the code in the else block will be skipped.

How to generate user defined exceptions in Python?

We can also put constraints on some values by using exception handling in python. To generate a user defined exception, we use the “raise” keyword when a certain condition is met. The exception is then handled by the except block of the code.

To create a user defined exception, we create a class with the desired exception name which inherits the Exception class. After that, we can raise the exception anywhere in our code according to our needs to implement constraints.

#create an exception class
class SmallNumberException(Exception):
    pass

try:
    #create a dictionary
    myDict={"a":1,"b":2}
    #this will raise SmallNumberException
    if(myDict["a"]<10):
        raise SmallNumberException
except(SmallNumberException):
    print("The Number is smaller than 10")

Output:

The Number is smaller than 10

In the above code, we can see that a user defined exception has been created which inherits Exception class and is raised after a conditional statement to check whether the number is smaller than 10 or not. We can use user defined exceptions anywhere to add constraints on values of a variable in the program.
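For example, here is a small sketch (building on the class defined above) of raising the exception with a descriptive message and catching it as an object:

try:
    number = 3
    if number < 10:
        raise SmallNumberException("The number must be at least 10, got {}".format(number))
except SmallNumberException as error:
    print(error)  # prints: The number must be at least 10, got 3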

Conclusion

In this article, we have learned about exceptions and exception handling in Python. We also studied how to implement the try, except, finally and else blocks during exception handling. Also, we have studied how to create custom user defined errors and exceptions to implement constraints on the variables.

The post Exception Handling in Python: Writing a Robust Python Program appeared first on PythonForBeginners.com.

May 05, 2021 02:43 PM UTC


Real Python

Natural Language Processing With Python's NLTK Package

Natural language processing (NLP) is a field that focuses on making natural human language usable by computer programs. NLTK, or Natural Language Toolkit, is a Python package that you can use for NLP.

A lot of the data that you could be analyzing is unstructured data and contains human-readable text. Before you can analyze that data programmatically, you first need to preprocess it. In this tutorial, you’ll take your first look at the kinds of text preprocessing tasks you can do with NLTK so that you’ll be ready to apply them in future projects. You’ll also see how to do some basic text analysis and create visualizations.

If you’re familiar with the basics of using Python and would like to get your feet wet with some NLP, then you’ve come to the right place.

By the end of this tutorial, you’ll know how to:

  • Find text to analyze
  • Preprocess your text for analysis
  • Analyze your text
  • Create visualizations based on your analysis

Let’s get Pythoning!

Free Download: Get a sample chapter from Python Basics: A Practical Introduction to Python 3 to see how you can go from beginner to intermediate in Python with a complete curriculum, up-to-date for Python 3.8.

Getting Started With Python’s NLTK

The first thing you need to do is make sure that you have Python installed. For this tutorial, you’ll be using Python 3.9. If you don’t yet have Python installed, then check out Python 3 Installation & Setup Guide to get started.

Once you have that dealt with, your next step is to install NLTK with pip. It’s a best practice to install it in a virtual environment. To learn more about virtual environments, check out Python Virtual Environments: A Primer.

For this tutorial, you’ll be installing version 3.5:

$ python -m pip install nltk==3.5

In order to create visualizations for named entity recognition, you’ll also need to install NumPy and Matplotlib:

$ python -m pip install numpy matplotlib

If you’d like to know more about how pip works, then you can check out What Is Pip? A Guide for New Pythonistas. You can also take a look at the official page on installing NLTK data.

Tokenizing

By tokenizing, you can conveniently split up text by word or by sentence. This will allow you to work with smaller pieces of text that are still relatively coherent and meaningful even outside of the context of the rest of the text. It’s your first step in turning unstructured data into structured data, which is easier to analyze.

When you’re analyzing text, you’ll be tokenizing by word and tokenizing by sentence. Here’s what both types of tokenization bring to the table:

  • Tokenizing by word: Words are like the atoms of natural language. They’re the smallest unit of meaning that still makes sense on its own. Tokenizing your text by word allows you to identify words that come up particularly often. For example, if you were analyzing a group of job ads, then you might find that the word “Python” comes up often. That could suggest high demand for Python knowledge, but you’d need to look deeper to know more.

  • Tokenizing by sentence: When you tokenize by sentence, you can analyze how those words relate to one another and see more context. Are there a lot of negative words around the word “Python” because the hiring manager doesn’t like Python? Are there more terms from the domain of herpetology than the domain of software development, suggesting that you may be dealing with an entirely different kind of python than you were expecting?

Here’s how to import the relevant parts of NLTK so you can tokenize by word and by sentence:

>>> from nltk.tokenize import sent_tokenize, word_tokenize

Now that you’ve imported what you need, you can create a string to tokenize. Here’s a quote from Dune that you can use:

>>> example_string = """
... Muad'Dib learned rapidly because his first training was in how to learn.
... And the first lesson of all was the basic trust that he could learn.
... It's shocking to find how many people do not believe they can learn,
... and how many more believe learning to be difficult."""

You can use sent_tokenize() to split up example_string into sentences:

>>> sent_tokenize(example_string)
["Muad'Dib learned rapidly because his first training was in how to learn.",
'And the first lesson of all was the basic trust that he could learn.',
"It's shocking to find how many people do not believe they can learn, and how many more believe learning to be difficult."]

Read the full article at https://realpython.com/nltk-nlp-python/ »



May 05, 2021 02:00 PM UTC


Python Bytes

#232 PyPI in a box and a revolutionary keyboard

Watch the live stream: Watch on YouTube - https://www.youtube.com/watch?v=zYZR2WCDn6A

About the show

Sponsored by us! Support our work through:

  • Our courses at Talk Python Training
  • The pytest book
  • Patreon Supporters

Special guest: Annette Lewis

Brian #1: Sphinx Themes Gallery update

  • Curated and maintained by @pradyunsg and @shirou.
  • I actually don’t know what it looked like before, but this is great.
  • I’m working on my first real Sphinx project, so this is awesome to have.
  • Features:
    • The main image for each theme shows what the theme looks like in wide, narrow, and phone layouts.
    • Demos (click on an image): a main page, a quick start (install and configure the theme name), a link to the theme documentation, and an example of navigation.
    • Kitchen sink: paragraph-level markup, including inline, math, meta, blocks, code with sidebars, references, directives, footnotes, and more, plus an API documentation example - essential if you are using this for documenting code.
    • Lists and tables.

Michael #2: Mongita - Like SQLite but for MongoDB

  • Mongita is a lightweight embedded document database that implements a commonly-used subset of the MongoDB/PyMongo interface.
  • Instead of being a server, Mongita is a self-contained Python library.
  • Mongita can be configured to store its documents either on disk or in memory.
  • This is a great project to contribute to as a new open source person (see the contributing guide).
  • Uses:
    • Embedded database: Mongita is a good alternative to SQLite for embedded applications when a document database makes more sense than a relational one.
    • Unit testing: Mocking PyMongo/MongoDB is a pain. Worse, mocking can hide real bugs. By monkey-patching PyMongo with Mongita, unit tests can be more faithful while remaining isolated.
  • Limited dependencies: Mongita runs anywhere that Python runs. Currently the only dependencies are pymongo (for bson) and sortedcontainers (for faster indexes).

Annette #3: World Plone Day 2021 - Over 50 Videos from 16 Countries

  • World Plone Day was a 24-hour online streaming event held on April 28th, 2021.
    • Plone is an open-source content management system, written in Python and built on top of the Zope web framework.
  • The Plone community produced 56 videos totaling 22 hours of content.
  • More than 50 speakers from 16 countries, in 11 languages.
  • All available on YouTube in the World Plone Day 2021 playlist.
  • Variety of content categories: General Interest, Technical Talks, Case Studies, and Plone 6 (Plone 6 introduction, How does Plone 6 work under the hood?, Getting Started with Volto Customization).

Brian #4: The social contract of open source: view every commit as a gift

  • By Brett Cannon.
  • Interesting thoughts on what “contract” and what relationship exist between maintainer and user.
  • Great analogy of a stack of USB drives with source code on the front lawn with a “FREE” sign:
    • Come by and pick up the latest release whenever you want.
    • No guarantee at all.
    • Each new version is a gift that you can accept or not.
    • The receiver of a gift should NOT knock on the front door and yell at the developer, leave an angry letter in the mailbox, or stand in the middle of the street yelling about how much they hate the software or how much of an idiot the developer is.
  • Quote from Immanuel Kant: “Act in such a way that you treat humanity, whether in your own person or in the person of any other, never merely as a means to an end, but always at the same time as an end.”
  • Brett: “… when you treat a maintainer as a fellow human being who may be able to do you a favor of their own volition, then you end up in an appropriate relationship where you are not trying to use the maintainer for something specific.”
  • Summary: “Every commit of open source code should be viewed as an independent gift from the maintainer that they happened to leave on their front yard for others to enjoy if they so desire; treating them as a means to an end for their open source code is unethical.”

Michael #5: PyPI in a box

  • Via Jared Chung.
  • Connectivity is still a challenge in many countries, especially in Africa.
  • Vuyisile Ndlovu created PyPI in a Box. Post PyCon Africa, in the conference Slack group, attendees shared the most common problems across the continent, and the state of internet connectivity was the overwhelming response.
  • Vuyisile also references putting “StackOverflow in a box”, but the article doesn’t lay out how to do it.

Annette #6: Film simulations from scratch using Python

  • By Kevin Martin Jose.
  • Implements applying CLUTs (color lookup tables) to an image with Python.
  • Opens the image with PIL, then converts it into a NumPy array.
  • Iterates through all the pixel values and assigns each one to a LUT color cell.
  • Returns the filtered image from the array.

Extras

Michael:

  • Talked about HTMX; Akira K. pointed out Hyperscript as a companion. Careful, it’s super new.
  • Dask course is out: https://twitter.com/TalkPython/status/1389382566965178375
  • FastAPI bundle fundraiser with testdriven.io
  • Python 3.10b1 is out
  • Microsoft becomes the 3rd PSF Visionary Sponsor, joining Bloomberg and Google (via PyCoders)

Annette:

  • Python Web Conf 2022: March 21-25, 2022. The Call for Papers is now open: https://www.papercall.io/pwc-2022

Joke

A developer-focused keyboard (graphic)

May 05, 2021 08:00 AM UTC


John Ludhi/nbshare.io

Stock Sentiment Analysis Using Autoencoders

Stock Sentiment Analysis Using Autoencoders

In this notebook, we will use autoencoders to do stock sentiment analysis. An autoencoder consists of an encoder and a decoder model: the encoder compresses the data and the decoder reconstructs it. Once you have trained an autoencoder neural network, the encoder can be used as a feature extractor for a different machine learning model.

For stock sentiment analysis, we will first use the encoder for feature extraction and then use these features to train a machine learning model to classify the stock tweets. To learn more about autoencoders, check out the following link...

https://www.nbshare.io/notebook/86916405/Understanding-Autoencoders-With-Examples/

Stock Tweets Data

Let us import the necessary packages.

In [1]:
# importing necessary lib 
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
In [2]:
# reading tweets data
df=pd.read_csv('/content/stocktwits (2).csv')
In [3]:
df.head()
Out[3]:
ticker message sentiment followers created_at
0 atvi $ATVI brutal selloff here today... really dumb... Bullish 14 2020-10-02T22:19:36.000Z
1 atvi $ATVI $80 around next week! Bullish 31 2020-10-02T21:50:19.000Z
2 atvi $ATVI Jefferies says that the delay is a &quot... Bullish 83 2020-10-02T21:19:06.000Z
3 atvi $ATVI I’ve seen this twice before, and both ti... Bullish 5 2020-10-02T20:48:42.000Z
4 atvi $ATVI acting like a game has never been pushed... Bullish 1 2020-10-02T19:14:56.000Z

Let us remove the unnecessary features - ticker, followers and created_at from our dataset.

In [4]:
df=df.drop(['ticker','followers','created_at'],axis=1)
In [5]:
df.head()
Out[5]:
message sentiment
0 $ATVI brutal selloff here today... really dumb... Bullish
1 $ATVI $80 around next week! Bullish
2 $ATVI Jefferies says that the delay is a &quot... Bullish
3 $ATVI I’ve seen this twice before, and both ti... Bullish
4 $ATVI acting like a game has never been pushed... Bullish
In [6]:
# class counts
df['sentiment'].value_counts()
Out[6]:
Bullish    26485
Bearish     4887
Name: sentiment, dtype: int64

If you observe the above results, our data set is imbalanced - the number of Bullish tweets is far greater than the number of Bearish tweets. We need to balance the data.

In [7]:
# Sentiment encoding 
# Encoding Bullish with 0 and Bearish with 1 
dict={'Bullish':0,'Bearish':1}

# Mapping dictionary to Is_Response feature
df['Class']=df['sentiment'].map(dict)
df.head()
Out[7]:
message sentiment Class
0 $ATVI brutal selloff here today... really dumb... Bullish 0
1 $ATVI $80 around next week! Bullish 0
2 $ATVI Jefferies says that the delay is a &quot... Bullish 0
3 $ATVI I’ve seen this twice before, and both ti... Bullish 0
4 $ATVI acting like a game has never been pushed... Bullish 0

Let us remove the 'sentiment' feature since we have already encoded it in the 'Class' column.

In [8]:
df=df.drop(['sentiment'],axis=1)

To make our dataset balanced, in the next few lines of code I take the same number of samples from the Bullish class as we have in the Bearish class.

In [9]:
Bearish = df[df['Class']== 1]
Bullish = df[df['Class']== 0].sample(4887)
In [10]:
# combining the down-sampled Bullish records with the Bearish records
df = Bullish.append(Bearish).reset_index(drop = True)

Let us check how our dataframe looks now.

In [11]:
df.head()
Out[11]:
message Class
0 Options Live Trading with a small Ass account... 0
1 $UPS your crazy if you sold at open 0
2 If $EQIX is at $680, this stock with the bigge... 0
3 $WMT just getting hit on the no stimulus deal.... 0
4 $AMZN I&#39;m playing the catalyst stocks with... 0

Let us count both classes to make sure each class has the same number of records.

In [12]:
# balanced class 
df['Class'].value_counts()
Out[12]:
1    4887
0    4887
Name: Class, dtype: int64
In [13]:
df.message
Out[13]:
0       Options  Live Trading with a small Ass account...
1                     $UPS your crazy if you sold at open
2       If $EQIX is at $680, this stock with the bigge...
3       $WMT just getting hit on the no stimulus deal....
4       $AMZN I&#39;m playing the catalyst stocks with...
                              ...                        
9769    SmartOptions® Unusual Activity Alert\n(Delayed...
9770                                            $VNO ouch
9771                                             $VNO dog
9772    $ZION I wanted to buy into this but I had an u...
9773    $ZOM Point of Care, rapid tests from $IDXX and...
Name: message, Length: 9774, dtype: object

Stock Tweets Text to Vector Form

Now we need to convert the tweets (text) into vector form.

To convert text into vector form, we first need to clean it. Cleaning means removing special characters, lowercasing, removing numerals, stemming, etc.

For text preprocessing I am using the NLTK library.

In [14]:
import nltk
nltk.download('stopwords')
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Unzipping corpora/stopwords.zip.
Out[14]:
True
In [15]:
import re
In [16]:
# I am using porterstemmer for stemming 
from nltk.corpus import stopwords
from nltk.stem.porter import PorterStemmer
ps = PorterStemmer()
corpus = []
for i in range(0, len(df)):

  review = re.sub('[^a-zA-Z]', ' ', df['message'][i])
  review = review.lower()
  review = review.split()
  review = [ps.stem(word) for word in review if not word in stopwords.words('english')]
  review = ' '.join(review)
  corpus.append(review)

To convert the words into vectors, I am using TF-IDF.

In [18]:
from sklearn.feature_extraction.text import TfidfVectorizer
In [19]:
# I am using 1 to 3 ngram combinations
tfidf=TfidfVectorizer(max_features=10000,ngram_range=(1,3))
tfidf_word=tfidf.fit_transform(corpus).toarray()
tfidf_class=df['Class']
In [20]:
tfidf_word
Out[20]:
array([[0.        , 0.        , 0.        , ..., 0.        , 0.        ,
        0.        ],
       [0.        , 0.        , 0.        , ..., 0.        , 0.        ,
        0.        ],
       [0.        , 0.        , 0.        , ..., 0.        , 0.        ,
        0.        ],
       ...,
       [0.        , 0.        , 0.        , ..., 0.        , 0.        ,
        0.        ],
       [0.        , 0.        , 0.        , ..., 0.        , 0.        ,
        0.        ],
       [0.        , 0.        , 0.        , ..., 0.        , 0.20443663,
        0.        ]])
In [21]:
# importing necessary lib 
import pandas as pd 
import numpy as np
from sklearn.model_selection import train_test_split 
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import MinMaxScaler 
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt
import seaborn as sns
from keras.layers import Input, Dense
from keras.models import Model, Sequential
from keras import regularizers
In [22]:
tfidf_class
Out[22]:
0       0
1       0
2       0
3       0
4       0
       ..
9769    1
9770    1
9771    1
9772    1
9773    1
Name: Class, Length: 9774, dtype: int64

Scaling the data

To make the data suitable for the auto-encoder, I am using MinMaxScaler.

In [23]:
X_scaled = MinMaxScaler().fit_transform(tfidf_word)
X_bulli_scaled = X_scaled[tfidf_class == 0]
X_bearish_scaled = X_scaled[tfidf_class == 1]
In [25]:
tfidf_word.shape
Out[25]:
(9774, 10000)

Building the Autoencoder neural network

I am using a standard autoencoder network.

For the encoder and decoder layers I am using the 'tanh' activation function.

For the bottleneck and output layers I am using the 'relu' activation.

I am using an L1 regularizer in the encoder. To learn more about regularization, check here.

In [26]:
# Building the Input Layer
input_layer = Input(shape =(tfidf_word.shape[1], ))
  
# Building the Encoder network
encoded = Dense(100, activation ='tanh',
                activity_regularizer = regularizers.l1(10e-5))(input_layer)
encoded = Dense(50, activation ='tanh',
                activity_regularizer = regularizers.l1(10e-5))(encoded)
encoded = Dense(25, activation ='tanh',
                activity_regularizer = regularizers.l1(10e-5))(encoded)
encoded = Dense(12, activation ='tanh',
                activity_regularizer = regularizers.l1(10e-5))(encoded)
encoded = Dense(6, activation ='relu')(encoded)

# Building the Decoder network
decoded = Dense(12, activation ='tanh')(encoded)
decoded = Dense(25, activation ='tanh')(decoded)
decoded = Dense(50, activation ='tanh')(decoded)
decoded = Dense(100, activation ='tanh')(decoded)
  
# Building the Output Layer
output_layer = Dense(tfidf_word.shape[1], activation ='relu')(decoded)

Training Autoencoder

In [27]:
import tensorflow as tf

For training I am using 'Adam' Optimizer and 'BinaryCrossentropy' Loss.

In [ ]:
# Defining the parameters of the Auto-encoder network
autoencoder = Model(input_layer, output_layer)
autoencoder.compile(optimizer="Adam", loss=tf.keras.losses.BinaryCrossentropy())

# Training the Auto-encoder network
autoencoder.fit(X_bulli_scaled, X_bearish_scaled,
                batch_size=16, epochs=100,
                shuffle=True, validation_split=0.20)

After training the neural network, we discard the decoder since we are only interested in the encoder and bottleneck part of the network.

In the code below, we stack the first layers of the trained autoencoder (the input layer followed by the encoder layers, autoencoder.layers[0] through autoencoder.layers[4]) into a new model called hidden_representation, whose output is the compressed representation of the input.

In [29]:
hidden_representation = Sequential()
hidden_representation.add(autoencoder.layers[0])
hidden_representation.add(autoencoder.layers[1])
hidden_representation.add(autoencoder.layers[2])
hidden_representation.add(autoencoder.layers[3])
hidden_representation.add(autoencoder.layers[4])

Encoding Data

In [30]:
# Separating the points encoded by the Auto-encoder as bulli_hidden_scaled and bearish_hidden_scaled

bulli_hidden_scaled = hidden_representation.predict(X_bulli_scaled)
bearish_hidden_scaled = hidden_representation.predict(X_bearish_scaled)

Let us combine the encoded data into a single dataset.

In [31]:
encoded_X = np.append(bulli_hidden_scaled, bearish_hidden_scaled, axis = 0)
y_bulli = np.zeros(bulli_hidden_scaled.shape[0]) # class 0
y_bearish= np.ones(bearish_hidden_scaled.shape[0])# class 1
encoded_y = np.append(y_bulli, y_bearish)

Now we have the encoded data from the autoencoder. This is nothing but feature extraction from the input data using the autoencoder.

Train Machine Learning Model

We can use these extracted features to train machine learning models.

In [32]:
# splitting the encoded data into train and test 

X_train_encoded, X_test_encoded, y_train_encoded, y_test_encoded = train_test_split(encoded_X, encoded_y, test_size = 0.2)

Logistic Regression

In [33]:
lrclf = LogisticRegression()
lrclf.fit(X_train_encoded, y_train_encoded)
  
# Storing the predictions of the linear model
y_pred_lrclf = lrclf.predict(X_test_encoded)
  
# Evaluating the performance of the linear model
print('Accuracy : '+str(accuracy_score(y_test_encoded, y_pred_lrclf)))
Accuracy : 0.620460358056266

SVM

In [34]:
# Building the SVM model
svmclf = SVC()
svmclf.fit(X_train_encoded, y_train_encoded)
  
# Storing the predictions of the non-linear model
y_pred_svmclf = svmclf.predict(X_test_encoded)
  
# Evaluating the performance of the non-linear model
print('Accuracy : '+str(accuracy_score(y_test_encoded, y_pred_svmclf)))
Accuracy : 0.6649616368286445

RandomForest

In [35]:
from sklearn.ensemble import RandomForestClassifier
In [36]:
# Building the rf model
rfclf = RandomForestClassifier()
rfclf.fit(X_train_encoded, y_train_encoded)
  
# Storing the predictions of the non-linear model
y_pred_rfclf = rfclf.predict(X_test_encoded)
  
# Evaluating the performance of the non-linear model
print('Accuracy : '+str(accuracy_score(y_test_encoded, y_pred_rfclf)))
Accuracy : 0.7631713554987213

XGBoost Classifier

In [37]:
import xgboost as xgb
In [38]:
# XGBoost classifier
xgb_clf=xgb.XGBClassifier()
xgb_clf.fit(X_train_encoded, y_train_encoded)

y_pred_xgclf = xgb_clf.predict(X_test_encoded)

print('Accuracy : '+str(accuracy_score(y_test_encoded, y_pred_xgclf)))


  
Accuracy : 0.7089514066496164

If you observe the accuracies above, random forest is giving the best accuracy on the test data, so we can tune the RandomForestClassifier to improve it further.

Hyperparameter Optimization

In [39]:
from sklearn.model_selection import RandomizedSearchCV
# Number of trees in random forest
n_estimators = [int(x) for x in np.linspace(start = 200, stop = 2000, num = 10)]
# Number of features to consider at every split
max_features = ['auto', 'sqrt']
# Maximum number of levels in tree
max_depth = [int(x) for x in np.linspace(10, 110, num = 11)]
max_depth.append(None)
# Minimum number of samples required to split a node
min_samples_split = [2, 5, 10]
# Minimum number of samples required at each leaf node
min_samples_leaf = [1, 2, 4]
# Method of selecting samples for training each tree
bootstrap = [True, False]
# Create the random grid
random_grid = {'n_estimators': n_estimators,
               'max_features': max_features,
               'max_depth': max_depth,
               'min_samples_split': min_samples_split,
               'min_samples_leaf': min_samples_leaf,
               'bootstrap': bootstrap}
In [ ]:
# Use the random grid to search for best hyperparameters
# First create the base model to tune
rf = RandomForestClassifier()
# Random search of parameters, using 3 fold cross validation, 
# search across 100 different combinations, and use all available cores
rf_random = RandomizedSearchCV(estimator = rf, param_distributions = random_grid, n_iter = 25, cv = 3, verbose=2, random_state=42)
# Fit the random search model
rf_random.fit(X_train_encoded, y_train_encoded)
In [46]:
rf_random.best_params_
Out[46]:
{'bootstrap': True,
 'max_depth': 30,
 'max_features': 'sqrt',
 'min_samples_leaf': 1,
 'min_samples_split': 10,
 'n_estimators': 1000}

But these are probably not the best hyperparameters, since I used only 25 iterations. We can increase the number of iterations further to find better hyperparameters.
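As a possible next step (not part of the original notebook), the parameters found by the random search could be plugged into a fresh RandomForestClassifier and evaluated on the held-out test set - a minimal sketch, assuming the variables defined above:

# Re-train a random forest with the parameters found by the random search
best_rf = RandomForestClassifier(**rf_random.best_params_)
best_rf.fit(X_train_encoded, y_train_encoded)

y_pred_best_rf = best_rf.predict(X_test_encoded)
print('Accuracy : ' + str(accuracy_score(y_test_encoded, y_pred_best_rf)))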

May 05, 2021 01:38 AM UTC


Python⇒Speed

How to (not) use Docker to share your password with hackers

Do you use Docker images to run your software? Does running or building your image involve a password or other credential that you really (don’t) want to share with hackers?

Well, you’re in luck, because Docker makes it really easy to share your passwords, cloud credentials, and SSH private keys with the world. Whether it’s runtime secrets, build secrets, or just some random unrelated credentials you had lying around in the wrong place, Docker’s got you covered when it comes to secret leaks.

In this article we’ll cover:

Read more...

May 05, 2021 12:00 AM UTC

May 04, 2021


PyCoder’s Weekly

Issue #471 (May 4, 2021)

#471 – MAY 4, 2021
View in Browser »

The PyCoder’s Weekly Logo


The Hidden Performance Overhead of Python C Extensions

It’s no secret that Python is slower than compiled languages like C, C++, and Rust. If you need a performance boost, you can write compiled Python C extensions. But there are some hidden performance costs that you should be aware of if you decide to do this. This article explains two ways that Python C extensions can actually be slower than pure Python and discusses some solutions and workarounds for them.
ITAMAR TURNER-TRAURING

How to Use ipywidgets to Make Your Jupyter Notebook Interactive

Jupyter Notebooks are great for exploratory data analysis. They’re also a good way to share results and analysis with other people, who can alter the notebook to further explore the data themselves. But there are some limitations to notebook interactivity. That’s where ipywidgets comes in! In this tutorial you’ll learn how to create widgets like check boxes, drop-down menus, sliders, and how to handle events like button clicks.
MATT WRIGHT

Rapidly Identify Bottlenecks in Your Python Applications with Datadog APM.


Datadog’s Continuous Profiler allows you to find the most resource-consuming parts in your production code all the time, at any scale, with minimal overhead. Debug and optimize your code, enhancing application performance before your customers notice. Try Datadog APM today →
DATADOG sponsor

Build a Platform Game in Python With arcade

Building games can be a fun way to learn new Python concepts and practice techniques you’ve already learned. Plus, they make for great projects to share! This step-by-step tutorial shows you how to build a platform game using the arcade library. You’ll learn techniques for designing levels, sourcing assets, and implementing advanced features.
REAL PYTHON

Python 3.10.0b1 Is Now Available

Python 3.8.10 and 3.9.5 have also been released.
CPYTHON DEV BLOG

Microsoft Becomes the Third PSF Visionary Sponsor

MICROSOFT.COM

EuroPython 2021: Call for Sponsors

EUROPYTHON.EU

Django Security Releases Issued: 3.2.1, 3.1.9, and 2.2.21

DJANGO SOFTWARE FOUNDATION

Discussions

“WARNING: Value for scheme.data Does Not Match” When I Try to Update Pip or Install Packages

Are you seeing warnings about scheme.data, scheme.platlib, and other scheme.* items when installing with pip? While these warnings don’t affect the pip installation, they are noisy and annoying. Fortunately, the pip team has fixed them in the latest patch release, so upgrading to pip 21.1.1 or later should get rid of them for you.
STACK OVERFLOW
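For reference, the generic command to upgrade pip in place (not part of the newsletter item) is:

$ python -m pip install --upgrade pip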

The Most Copied Comment on Stack Overflow Is on How to Resize Figures in Matplotlib

In a recent blog post, Stack Overflow analyzed data they obtained during an April Fools gag about how people copy and paste code from their platform. One of the results from the data set indicates that the most copied comment comes from an answer about resizing figures in Matplotlib.
REDDIT

Python Jobs

Senior Full Stack Engineer (USA)

Yonder

Software Engineer (New York, NY, USA)

Truveris

Software Engineer (Remote)

Sagebeans Rpo

PL/SQL Developer with Python (Remote)

Dotcom Team, LLC

More Python Jobs >>>

Articles & Tutorials

Film Simulations From Scratch Using Python

In analog photography, you can achieve different “looks” for your photographs by selecting different kinds of film to shoot with. Digital camera manufacturers often include different presets to simulate different kinds of film. In this article, you’ll learn how to simulate different films on your own images using color lookup tables, or CLUTs, using NumPy and the Pillow image library.
KEVIN MARTIN JOSE

Simplify Python GUI Development With PySimpleGUI

In this step-by-step course, you’ll learn how to create a cross-platform graphical user interface (GUI) using Python and PySimpleGUI. A graphical user interface is an application that has buttons, windows, and lots of other elements that the user can use to interact with your application.
REAL PYTHON course

Are You Up For the Ubiq Ramen Challenge?


PyCoder’s listen up! We want to show you how simple and fast it is to build encryption directly into any application. We think… faster than it takes to make a bowl of ramen. Enter our Ubiq Ramen Challenge and you could win a year’s supply of Ramen and a $500 Amazon gift card →
UBIQ SECURITY sponsor

Declarative Validation

Validating user input is one of the most common programming tasks. There are a number of approaches to validation and a host of third-party Python packages available on PyPI. One of these approaches that is common in the functional programming paradigm is applicative-style validation, which the author of this article calls declarative validation. In this short-yet-informative article, you’ll learn how declarative validation works and how to cook up a small validation library.
DREW OLSON

Python News: What’s New From April 2021?

April 2021 was an eventful month in the world of Python. In this article, you’ll get up to speed on everything that happened in the past month, including new sponsorships for the PSF, changes to Python error messages, and a community-led discussion over the future of type annotations.
REAL PYTHON

Dockerizing FastAPI with Postgres, Uvicorn, and Traefik

FastAPI is quickly gaining popularity in the world of asynchronous Python web frameworks. With more and more users flocking to the framework, the demand for Dockerized development and production workflows is growing. In this step-by-step tutorial, you’ll learn how to dockerize a FastAPI application using Postgres, Uvicorn, and Traefik.
AMAL SHAJI • Shared by Amal Shaji

Podcast Rewind With Guest Highlights for 2020-2021

This week’s episode of the Real Python podcast is a bit different. Take a look back in this rewind episode featuring highlights from the many interviews over the past year or so of the show.
REAL PYTHON podcast

Projects & Code

tablib: Python Module for Tabular Datasets in XLS, CSV, JSON, YAML, &c.

GITHUB.COM/JAZZBAND

wasmer-python: WebAssembly Runtime for Python

GITHUB.COM/WASMERIO

gradio: Create UIs for Prototyping Your Machine Learning Model in 3 Minutes

GITHUB.COM/GRADIO-APP

tortoise-orm: Familiar Asyncio ORM for Python, Built With Relations in Mind

GITHUB.COM/TORTOISE

mongo-arrow: Easily Move Data Between MongoDB and Apache Arrow

GITHUB.COM/MONGODB-LABS

Events

Real Python Office Hours (Virtual)

May 5, 2021
REALPYTHON.COM

PyCon 2021 (Virtual)

May 12 – 18, 2021
PYCON.ORG

DjangoCon Europe 2021 (Virtual)

June 2 – 6, 2021
DJANGOCON.EU

EuroPython 2021 (Virtual)

July 26 – August 1, 2021
EUROPYTHON.EU

PyCon India 2021

September 17 – 20, 2021
PYCON.ORG


Happy Pythoning!
This was PyCoder’s Weekly Issue #471.
View in Browser »


[ Subscribe to 🐍 PyCoder’s Weekly 💌 – Get the best Python news, articles, and tutorials delivered to your inbox once a week >> Click here to learn more ]

May 04, 2021 07:30 PM UTC


Django Weblog

PyCharm &amp; DSF Campaign 2021 Results

The fifth annual JetBrains PyCharm promotion in April netted the Django Software Foundation $45,000 this year, a slight increase over the $40,000 raised last year.

This amount represents roughly 20% of the DSF's overall budget, which goes directly into funding the continued development and support of Django via the Django Fellowship program and Django conferences worldwide.

Django Software Foundation

The Django Software Foundation is the non-profit foundation that supports the development of the Django Web framework. It funds the Django Fellowship program, which currently supports two Fellows who triage tickets, review/merge patches from the community, and work on infrastructure. The introduction of this program starting in 2015 has gone a long way towards ensuring a consistent major release cycle and the fixing/blocking of severe bugs. DSF also funds development sprints, community events like DjangoCons, and related conferences and workshops globally.

Fundraising is still ongoing and you can donate directly at djangoproject.com/fundraising.

May 04, 2021 04:25 PM UTC


Real Python

Simplify Python GUI Development With PySimpleGUI

Creating a simple graphical user interface (GUI) that works across multiple platforms can be complicated. But it doesn’t have to be that way. You can use Python and the PySimpleGUI package to create nice-looking user interfaces that you and your users will enjoy! PySimpleGUI is a new Python GUI library that has been gaining a lot of interest recently.

In this course, you’ll learn how to:



May 04, 2021 02:00 PM UTC


PyCon

Google Cloud: Speed up Python development with Cloud Code and your IDE

Google Cloud is thrilled to sponsor PyCon 2021! This is a bittersweet PyCon as we wish we could meet all you skilled Python developers in person, but we know bringing together Python developers in this virtual environment will lend itself to creative collaboration and education!

Google Cloud provides many features to make Python development more streamlined. These capabilities allow Python developers to:

  • Find libraries optimized for Python
  • Run workloads anywhere
  • Manage JupyterLab notebooks
  • Find, diagnose and fix complex issues

Our Python team is always working on ways to ease the development journey for Python developers. Google Cloud has the tools Python developers need to build cloud-based applications. Our tools allow you to build apps quicker with SDKs and in-IDE assistance. Then, scale them as big or small as you need on Cloud Run, Google Kubernetes Engine (GKE), or Anthos.

As developers, your IDE is the most comfortable place for you to work. To simplify development, you can expand your IDE with extensions to make the process of building, deploying, scaling, and managing infrastructure and applications a breeze. Cloud Code is Google Cloud’s IDE extension that streamlines cloud-based Python development right within your IDE, helping you write, debug, and deploy your cloud-based apps faster.





Cloud Code helps developers increase their productivity by providing:

  • An easy way to speed up the local development loop
  • Explorers to navigate your Kubernetes and Cloud Run apps
  • Quick access to enable and install Cloud APIs
  • YAML authoring support

These helpful and intuitive features improve day-to-day workflows for all Python developers!

Cloud Code is available for an array of IDEs, including PyCharm and Visual Studio Code. Developing with Cloud Code is your key to maximizing developer productivity, combining all the best tools together to simplify Python development with Google Cloud.

Let’s dive deeper into how Cloud Code helps drive productivity while working in your IDE.

Get started fast

If you’re new to cloud-based development, especially with Google Cloud, it can take some time to get an app up and running. Cloud Code provides Python getting started samples for both Kubernetes and Cloud Run, our fully managed container offering.

After installing Cloud Code from the PyCharm or VS Code marketplace, you can create a new project using one of our samples. Depending on what framework you’re more comfortable with, we have samples for both Flask and Django. These give you a starting point for development and show best practices.

For a live demo on getting started with Cloud Code and PyCharm, check out the Developing Flask Apps on Google Cloud webinar we did with PyCharm! This webinar dives into a demo of how to build Flask applications for Cloud Run and covers how Cloud Code helps throughout the development journey.

To help you get started even faster, we created interactive tutorials on how to edit, debug, and deploy a new application with Cloud Code. Check out our Cloud Run and Secret Manager tutorials, both written in Python of course!

Our tutorials take an in-depth look at the features of Cloud Code, but to give you a quick look at what our IDE tool offers, continue reading below.

Give Cloud Code a try by downloading here and join us on the Google Cloud Developers Slack!



May 04, 2021 12:44 PM UTC


Stack Abuse

Python: Make Time Delay (Sleep) for Code Execution

Introduction

Code delaying (also known as sleeping) is exactly what the name implies: delaying code execution for some amount of time. The most common need for it is when we're waiting for some other process to finish, so that we can work with the result of that process. In multi-threaded systems, a thread might want to wait for another thread to finish an operation, to continue working with that result.

Another example could be lessening the strain on a server we're working with. For example, while web scraping (ethically), following the ToS of the website in question and abiding by its robots.txt file - you might very well want to delay each request so as not to overwhelm the server's resources.

Many requests fired in rapid succession can, depending on the server in question, quickly take up all of the free connections and effectively become a DoS attack. To allow for breathing space, as well as to make sure we don't negatively impact either the users of the website or the website itself - we'd limit the number of requests sent by delaying each one.

A student waiting for exam results might furiously refresh their school's website, waiting for news. Alternatively, they might write a script that checks if the website has anything new on it. In a sense, code delay can technically become code scheduling with a valid loop and termination condition - assuming that the delay mechanism in place isn't blocking.
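A minimal sketch of that idea - a polling loop that sleeps between checks until some condition is met, using the time.sleep() call covered in the next section (check_for_results() here is a made-up placeholder, not a real API):

import time
import random

def check_for_results():
    # Placeholder for a real check, e.g. fetching the page and parsing it
    return random.random() < 0.3  # pretend the results eventually appear

while not check_for_results():
    time.sleep(5)  # wait before the next poll, so we don't hammer the server

print("Results are up!")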

In this article, we'll take a look at how to delay code execution in Python - also known as sleeping. This can be done in a few ways:

Delaying Code with time.sleep()

One of the most common solutions to the problem is the sleep() function of the built-in time module. It accepts the number of seconds you'd like the process to sleep for - unlike many other languages that are based in milliseconds:

import datetime
import time

print(datetime.datetime.now().time())
time.sleep(5)
print(datetime.datetime.now().time())

This results in:

14:33:55.282626
14:34:00.287661

Quite clearly, we can see a 5s delay between the two print() statements, with a fairly high precision - down to the second decimal place. If you'd like to sleep for less than 1 second, you can easily pass non-whole numbers as well:

print(datetime.datetime.now().time())
time.sleep(0.25)
print(datetime.datetime.now().time())
14:46:16.198404
14:46:16.448840
print(datetime.datetime.now().time())
time.sleep(1.28)
print(datetime.datetime.now().time())
14:46:16.448911
14:46:17.730291

Though, keep in mind that with 2 decimal places, the sleep duration might not be exactly on spot, especially since it's hard to test, given the fact that the print() statements take some (variable) time to execute as well.

However, there's one major downside to the time.sleep() function, very noticeable in multi-threaded environments.

time.sleep() is blocking.

It seizes up the thread it's on and blocks it for the duration of the sleep. This makes it unfit for longer waiting times, as it clogs up a thread of the processor during that time period. Additionally, it makes it a poor fit for asynchronous and reactive applications, which oftentimes require real-time data and feedback.

Another thing to note about time.sleep() is that you can't stop it. Once it starts, you can't externally cancel it without terminating the entire program, unless something causes the sleeping code itself to raise an exception, which would halt it.

Asynchronous and Reactive Programming

Asynchronous Programming revolves around parallel execution - where a task can be executed and finish independent of the main flow.

In Synchronous Programming - if a Function A calls Function B, it stops execution until Function B finishes execution, after which Function A can resume.

In Asynchronous Programming - if a Function A calls Function B, regardless of its dependence of the result from Function B, both can execute at the same time, and if need be, wait for the other one to finish to utilize each other results.

Reactive Programming is a subset of Asynchronous Programming, which triggers code execution reactively, when data is presented, regardless of whether the function supposed to process it is already busy. Reactive Programming relies heavily on Message-Driven Architectures (where a message is typically an event or a command).

Both Asynchronous and Reactive applications are the ones that suffer greatly from blocking code - so using something like time.sleep() isn't a good fit for them. Let's take a look at some non-blocking code delay options.

Delaying Code with asyncio.sleep()

Asyncio is a Python module dedicated to writing concurrent code, and uses the async/await syntax, which might be familiar to developers who have used it in other languages.

It's part of the standard library (since Python 3.4), so there's nothing to install - we can import it into our script and rewrite our function:

import asyncio
import datetime

async def main():
    print(datetime.datetime.now().time())
    await asyncio.sleep(5)
    print(datetime.datetime.now().time())

asyncio.run(main())

When working with asyncio, we mark functions that run asynchronously as async, and await the results of operations such as asyncio.sleep() that will be finished at some point in the future.

Similar to the previous example, this will print two times, 5 seconds apart:

17:23:33.708372
17:23:38.716501

Though, this doesn't really illustrate the advantage of using asyncio.sleep(). Let's rewrite the example to run a few tasks in parallel, where this distinction is a lot more clear:

import asyncio
import datetime

async def intense_task(id):
    await asyncio.sleep(5)
    print(id, 'Running some labor-intensive task at ', datetime.datetime.now().time())

async def main():
    await asyncio.gather(
        asyncio.create_task(intense_task(1)),
        asyncio.create_task(intense_task(2)),
        asyncio.create_task(intense_task(3))
    )

asyncio.run(main())

Here, we've got an async function, which simulates a labor-intensive task that takes 5 seconds to finish. Then, using asyncio, we create multiple tasks. Each task can run asynchronously, though, only if we call them asynchronously. If we were to run them sequentially, they'd also execute sequentially.

To call them in parallel, we use the gather() function, which, well, gathers the tasks and executes them:

1 Running some labor-intensive task at  17:35:21.068469
2 Running some labor-intensive task at  17:35:21.068469
3 Running some labor-intensive task at  17:35:21.068469

These are all executed at the same time, and the waiting time for the three of them isn't 15 seconds - it's 5.

On the other hand, if we were to tweak this code to use time.sleep() instead:

import asyncio
import datetime
import time

async def intense_task(id):
    time.sleep(5)
    print(id, 'Running some labor-intensive task at ', datetime.datetime.now().time())

async def main():
    await asyncio.gather(
        asyncio.create_task(intense_task(1)),
        asyncio.create_task(intense_task(2)),
        asyncio.create_task(intense_task(3))
    )

asyncio.run(main())

We'd be waiting for 5 seconds between each print() statement:

1 Running some labor-intensive task at  17:39:00.766275
2 Running some labor-intensive task at  17:39:05.773471
3 Running some labor-intensive task at  17:39:10.784743

Delaying Code with Timer

The Timer class is a Thread that runs a given function only after a certain time period has passed. This behavior is exactly what we're looking for, though it's a bit of an overkill to use threads to delay code if you're not already working with a multi-threaded system.

The Timer needs to be start()-ed, and can be halted via cancel(). Its constructor accepts the number of seconds to wait before executing the second parameter - a function.

Let's make a function and execute it via a Timer:

from threading import Timer
import datetime

def f():
    print("Code to be executed after a delay at:", datetime.datetime.now().time())

print("Code to be executed immediately at:", datetime.datetime.now().time())
timer = Timer(3, f)
timer.start()

This results in:

Code to be executed immediately at: 19:47:20.032525
Code to be executed after a delay at: 19:47:23.036206

The cancel() method comes in really handy if we have multiple functions running, and we'd like to cancel the execution of a function, based on the results of another, or on another condition.

Let's write a function f(), which calls on both f2() and f3(). f2() is called as-is - and returns a random integer between 1 and 10, simulating the time it took to run that function.

f3() is called through a Timer and if the result of f2() is greater than 5, f3() is cancelled, whereas if f2() runs in the "expected" time of less than 5 - f3() runs after the timer ends:

from threading import Timer
import datetime
import random

def f():
    print("Executing f1 at", datetime.datetime.now().time())
    result = f2()
    timer = Timer(5, f3)
    timer.start()
    if(result > 5):
        print("Cancelling f3 since f2 resulted in", result)
        timer.cancel()

def f2():
    print("Executing f2 at", datetime.datetime.now().time())
    return random.randint(1, 10)

def f3():
    print("Executing f3 at", datetime.datetime.now().time())

f()

Running this code multiple times would look something along the lines of:

Executing f1 at 20:29:10.709578
Executing f2 at 20:29:10.709578
Cancelling f3 since f2 resulted in 9

Executing f1 at 20:29:14.178362
Executing f2 at 20:29:14.178362
Executing f3 at 20:29:19.182505

Delaying Code with Event

The Event class can be used to generate events. A single event can be "listened to" by multiple threads. The Event.wait() function blocks the thread it's on unless the event is already set. Once you set() an Event, all the threads that were waiting are awoken, and Event.wait() becomes non-blocking.

This can be used to synchronize threads - all of them pile up and wait() until a certain Event is set, after which, they can dictate their flow.

Let's create a waiter method and run it multiple times on different threads. Each waiter starts working at a certain time and checks if they're still on the hour every second, right before they take an order, which takes a second to fulfill. They'll be working until the Event is set - or rather, their working time is up.

Each waiter will have their own thread, while management resides in the main thread, and call when everyone can call home. Since they're feeling extra generous today, they'll cut the working time and let the waiters go home after 4 seconds of work:

import threading
import time
import datetime

def waiter(event, id):
    print(id, "Waiter started working at", datetime.datetime.now().time())
    event_flag = event.wait(1)
    while not event.isSet():
        print(id, "Waiter is taking order at", datetime.datetime.now().time())
        event.wait(1)
    if event_flag:
        print(id, "Waiter is going home at",  datetime.datetime.now().time())

end_of_work = threading.Event()

for id in range(1, 3):
    thread = threading.Thread(target=waiter, args=(end_of_work, id))
    thread.start()

end_of_work.wait(4)
end_of_work.set()
print("Some time passes, management was nice and cut the working hours short. It is now", datetime.datetime.now().time())

Running this code results in:

1 Waiter started working at 23:20:34.294844
2 Waiter started working at 23:20:34.295844
1 Waiter is taking order at 23:20:35.307072
2 Waiter is taking order at 23:20:35.307072
1 Waiter is taking order at 23:20:36.320314
2 Waiter is taking order at 23:20:36.320314
1 Waiter is taking order at 23:20:37.327528
2 Waiter is taking order at 23:20:37.327528
Some time passes, management was nice and cut the working hours short. It is now 23:20:38.310763

The end_of_work event was used here to sync up the two threads and control when they work and when not to, delaying the code execution by a set time between the checks.

Conclusion

In this guide, we've taken a look at several ways to delay code execution in Python - each applicable to a different context and requirement.

The regular time.sleep() method is pretty useful for most applications, though, it's not really optimal for long waiting times, isn't commonly used for simple scheduling and is blocking.

Using asyncio, we've got an asynchronous version of time.sleep() that we can await.

The Timer class delays code execution and can be cancelled if need be.

The Event class generates events that multiple threads can listen to and respond accordingly, delaying code execution until a certain event is set.

May 04, 2021 12:30 PM UTC


EuroPython

EuroPython 2021: Call for Sponsors

We're happy to announce our call for sponsors. Reach out to enthusiastic Python developers, users and professionals worldwide by presenting your company at this year’s EuroPython 2021 Online conference, from July 26 - Aug 1, 2021!

EuroPython 2021 Sponsor Packages

Sponsoring EuroPython guarantees you highly targeted visibility and the opportunity to present yourself and your company in a professional and innovative environment.

We have adjusted our conference sponsor packages for the online format and lowered their prices, giving you an excellent opportunity to reach out to attendees from all around the world. You can run your own virtual rooms and text channels, hold competitions and tutorials, or give product presentations throughout the conference days.

Want to know more?

We have just published our sponsorship brochure for EuroPython 2021, with full details and demographics:

EuroPython 2021 Sponsor Brochure

For a quick overview, you can also head over to our sponsor packages page. Feel free to contact us with any questions at sponsoring@europython.eu.

EuroPython 2021 Sponsor Packages (PDF)

Special offer for early bird sponsors

Sponsors who sign up before or on May 7, will receive a special 10% discount on the sponsor package price.

Become a sponsor and support EuroPython 2021 today!

You can sign up on the sponsor packages page.

Enjoy,
EuroPython 2021 Team
EuroPython Society
EuroPython 2021 Website

May 04, 2021 09:11 AM UTC


Django Weblog

Django security releases issued: 3.2.1, 3.1.9, and 2.2.21

In accordance with our security release policy, the Django team is issuing Django 3.2.1, Django 3.1.9, and Django 2.2.21. These releases address the security issue detailed below. We encourage all users of Django to upgrade as soon as possible.

CVE-2021-31542: Potential directory-traversal via uploaded files

MultiPartParser, UploadedFile, and FieldFile allowed directory-traversal via uploaded files with suitably crafted file names.

In order to mitigate this risk, stricter basename and path sanitation is now applied. Specifically, empty file names and paths with dot segments will be rejected.
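As a generic illustration of the underlying idea (a hand-written sketch, not Django's actual patch), dot segments in an attacker-supplied file name can escape the intended upload directory unless the name is reduced to its base name and empty or dot-only names are rejected:

import os

UPLOAD_DIR = "/var/www/uploads"

def safe_upload_path(filename):
    # Keep only the base name, discarding any directory components
    name = os.path.basename(filename)
    # Reject empty names and bare dot segments outright
    if name in ("", ".", ".."):
        raise ValueError("Suspicious file name: %r" % filename)
    return os.path.join(UPLOAD_DIR, name)

print(safe_upload_path("report.pdf"))        # /var/www/uploads/report.pdf
print(safe_upload_path("../../etc/passwd"))  # /var/www/uploads/passwd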

This issue has low severity, according to the Django security policy.

Thank you to Jasu Viding for the report.

Affected supported versions

  • Django main branch
  • Django 3.2
  • Django 3.1
  • Django 2.2

Resolution

Patches to resolve the issue have been applied to Django's main branch and to the 3.2, 3.1, and 2.2 release branches. The patches may be obtained from the following changesets:

The following releases have been issued:

The PGP key ID used for these releases is Carlton Gibson: E17DF5C82B4F9D00.

General notes regarding security reporting

As always, we ask that potential security issues be reported via private email to security@djangoproject.com, and not via Django's Trac instance or the django-developers list. Please see our security policies for further information.

May 04, 2021 08:51 AM UTC


Python Pool

[Solved] ValueError: Setting an Array Element With A Sequence Easily

Introduction

In Python, when we pass a value that has the right type but an invalid or incompatible value, we encounter a ValueError. In this tutorial, we will be discussing the concept of ValueError: setting an array element with a sequence in Python.

What is Value Error?

A ValueError occurs when a built-in operation or function receives an argument with the right type but an invalid value. A value is a piece of information that is stored within a certain object.
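As a quick, generic illustration (not specific to NumPy), passing a string that doesn't represent a number to int() raises a ValueError - the argument's type (str) is acceptable, but its value is not:

print(int("hello"))

Output:

ValueError: invalid literal for int() with base 10: 'hello'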

What Is Valueerror: Setting An Array Element With A Sequence?

In Python, we often encounter ValueError: setting an array element with a sequence when working with the NumPy library. It usually occurs when the values we pass cannot be arranged into a well-formed NumPy array.

What Causes Valueerror: Setting An Array Element With A Sequence?

NumPy throws this error when you try to create an array from a nested list that is not properly shaped - that is, its sub-lists have different lengths. The second cause is the type of content in the array: for example, declaring an array with a specific data type and inserting a value that cannot be converted to that type.

Examples Causing Valueerror: Setting An Array Element With A Sequence

Here, we will be discussing the different types of causes through which this type of error gets generated:

1. Array Of A Different Dimension

Let us take an example in which we create an array from a list whose sub-lists have different lengths. In the code, you can see that the two sub-lists have different numbers of elements, which will throw the error ValueError: setting an array element with a sequence.

import numpy as np
print(np.array([[1, 2,], [3, 4, 5]],dtype = int))

Output:

ValueError: setting an array element with a sequence.


Solution Of An Array Of A Different Dimension

If we make both sub-lists the same length, we will not encounter any error and the code will work fine.

import numpy as np
print(np.array([[1, 2, 5], [3, 4, 5]],dtype = int))

Output:

[[1 2 5]
 [3 4 5]]

Explanation:

Both rows now have three elements, so NumPy builds a proper 2x3 integer array and no error is raised.

2. Different Type Of Elements In An Array

Let us take an example in which we create an array whose elements do not match the declared data type. In the code below, the array is declared with dtype=float but one of the elements is a string, so NumPy raises a ValueError because the string cannot be converted to a float.

import numpy as np

# dtype=float cannot hold the string "Ironman"
print(np.array([2.1, 2.2, "Ironman"], dtype=float))

Output:

ValueError: could not convert string to float: 'Ironman'

Explanation:

The array was declared with dtype=float, but the string "Ironman" cannot be converted to a float, so NumPy raises a ValueError.

Solution Of Different Type Of Elements In An Array

If we want the data type to be unrestricted, we can use dtype=object, which stores each element as a generic Python object and removes the error.

import numpy as np

# dtype=object stores each element as a Python object, so mixed types are allowed
print(np.array([2.1, 2.2, "Ironman"], dtype=object))

Output:

[2.1 2.2 'Ironman']

Explanation:

With dtype=object every element is stored as a generic Python object, so mixed types are allowed and no conversion error occurs.

3. ValueError Setting An Array Element With A Sequence in Pandas

In this example, we import the pandas library and build a DataFrame with a single float cell, then print the initial value. When we then try to assign a list to that cell and print it again, we get the error.

import pandas as pd

# A DataFrame with a single float cell
output = pd.DataFrame(data=[[800.0]], columns=['Sold Count'], index=['Project1'])
print(output.loc['Project1', 'Sold Count'])

# Assigning a list to a float64 cell raises the ValueError
output.loc['Project1', 'Sold Count'] = [400.0]
print(output.loc['Project1', 'Sold Count'])

Output:

800.0
ValueError: setting an array element with a sequence

Solution Of Value Error From Pandas

If we don't want any error in the following code, we need to cast the column to the object data type first, so that a single cell can hold a Python list.

import pandas as pd

output = pd.DataFrame(data=[[800.0]], columns=['Sold Count'], index=['Project1'])
print(output.loc['Project1', 'Sold Count'])

# Cast the column to object dtype so a cell can hold a Python list
output['Sold Count'] = output['Sold Count'].astype(object)
output.loc['Project1', 'Sold Count'] = [1000.0, 800.0]
print(output)

Output:

800.0 is printed first, followed by the DataFrame whose 'Sold Count' cell now holds the list [1000.0, 800.0].

4. ValueError Setting An Array Element With A Sequence in Sklearn

Sklearn (scikit-learn) is a popular Python library used to run machine learning methods on a dataset. From regression to clustering, it has all the methods that are needed.

Fitting these machine learning models on a 2D array can raise the same ValueError. If your 2D array is not uniform, i.e., if the sub-arrays do not all have the same number of elements, it'll throw an error.

Example Code –

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X = np.array([[-1, 1], [2, -1], [1, -1], [2]])
y = np.array([1, 2, 2, 1])

clf = make_pipeline(StandardScaler(), SVC(gamma='auto'))
clf.fit(X, y)

Here, the last element in the X array has length 1, whereas all other elements have length 2. This causes SVC() to throw the error "ValueError: setting an array element with a sequence".

Solution –

The solution to this ValueError in Sklearn is to make all rows the same length. In the following code, every row has two elements.

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X = np.array([[-1, 1], [2, -1], [1, -1], [2, 1]])
y = np.array([1, 2, 2, 1])

clf = make_pipeline(StandardScaler(), SVC(gamma='auto'))
clf.fit(X, y)
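
If you are not sure whether your data is rectangular, a quick check (an added suggestion, not from the original post) is to compare row lengths, or to build the array with dtype=object and inspect the result — ragged input can only be stored as generic objects:

import numpy as np

X = [[-1, 1], [2, -1], [1, -1], [2]]

# Ragged input ends up as a 1-D array of Python objects, not a 2-D numeric array
arr = np.array(X, dtype=object)
print(arr.dtype)                                  # object -> rows have unequal lengths
print(all(len(row) == len(X[0]) for row in X))    # False  -> the data is ragged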

5. ValueError Setting An Array Element With A Sequence in Tensorflow

In TensorFlow, the input shapes have to be consistent to process the data. If the elements of your input do not all have the same length, you'll get a ValueError.

Example Code –

import tensorflow as tf
import numpy as np

# Initialize two arrays
x1 = tf.constant([1,2,3,[4,1]])
x2 = tf.constant([5,6,7,8])

# Multiply
result = tf.multiply(x1, x2)
tf.print(result)

Here the last element of the x1 array is itself a list of length 2, so the input is ragged and a ValueError is raised before tf.multiply() ever runs.

Solution –

The fix is to ensure that all of your array elements have the same shape. The following example shows the corrected code –

import tensorflow as tf
import numpy as np

# Initialize two arrays
x1 = tf.constant([1,2,3,1])
x2 = tf.constant([5,6,7,8])

# Multiply
result = tf.multiply(x1, x2)
tf.print(result)
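
As a side note (an addition of mine, not part of the original post): if the ragged structure is intentional, TensorFlow 2.x provides tf.ragged.constant, which can hold rows of different lengths without raising an error — a minimal sketch:

import tensorflow as tf

# A RaggedTensor keeps rows of unequal length instead of raising a ValueError
ragged = tf.ragged.constant([[1, 2, 3], [4, 1]])
print(ragged)        # <tf.RaggedTensor [[1, 2, 3], [4, 1]]>
print(ragged * 2)    # element-wise ops apply row by row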

6. ValueError Setting An Array Element With A Sequence in Keras

A similar error can be observed in Keras when an array whose elements have different lengths is passed to a model — for example, when the input is a mixture of ints and lists.

Example Code –

from tensorflow.keras.models import Sequential   # imports added; assuming the tf.keras API
from tensorflow.keras.layers import Dense

# X and y are assumed to be defined earlier; here X mixes ints and lists
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# Compile the model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Fit the model
model.fit(X, y, epochs=150, batch_size=10)

>>> ValueError: setting an array element with a sequence.

Here the array X contains a mixture of integers and lists, and many of its elements are not fully filled, so Keras cannot convert it into a rectangular numeric array.

Solution –

The solution to this error is to flatten your input into the desired shape before fitting; keras.layers.Flatten and pd.Series.tolist() help to achieve this, as the following code shows.

from tensorflow.keras.models import Sequential   # imports added; assuming the tf.keras API
from tensorflow.keras.layers import Flatten, Dense

model = Sequential()

# Flatten the (2, 2) input into a single vector before the Dense layers
model.add(Flatten(input_shape=(2, 2)))

model.add(Dense(12, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# Compile the model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Fit the model

# Convert the pandas data to plain Python lists so Keras receives a uniform array
X = X.tolist()

model.fit(X, y, epochs=150, batch_size=10)

Conclusion

In this tutorial, we have learned about the error "ValueError: setting an array element with a sequence" in Python. We covered what a ValueError is, what this particular message means, and the situations that cause it, each explained in detail with an example and its solution. You can apply whichever fix matches your program.

However, if you have any doubts or questions, do let me know in the comment section below. I will try to help you as soon as possible.

FAQs

1. How Does ValueError Save Us From Incorrect Data Processing?

We will understand this with the help of a small code snippet:

while True:
    try:
        n = input("Please enter an integer: ")
        n = int(n)
        break
    except ValueError:
        print("No valid integer! Please try again ...")
print("Great, you successfully entered an integer!")

Input:

First, we will enter 10.0 and then 10 as the input. Let us see what the output is.

Output:

Please enter an integer: 10.0
No valid integer! Please try again ...
Please enter an integer: 10
Great, you successfully entered an integer!

As you can see, when we enter a float value in place of an integer, int() raises a ValueError, so only a valid integer is accepted as input. In this way ValueError saves us from incorrect data processing, because invalid input never makes it past the conversion.

2. We don’t declare a data type in Python, so why does this error arise from initializing an incorrect data type?

In Python, we don’t have to declare a data type. But when a ValueError arises, it means there is an issue with the content of the object you attempted to assign a value to. This is not to be confused with a type error: a ValueError is raised when a function receives an argument of the correct type but an inappropriate value.
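
A quick illustrative contrast between the two (added here for clarity, not from the original post):

"3" + 3          # TypeError: can only concatenate str (not "int") to str -> wrong type entirely
float("three")   # ValueError: could not convert string to float: 'three' -> right type, wrong value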

The post [Solved] ValueError: Setting an Array Element With A Sequence Easily appeared first on Python Pool.

May 04, 2021 05:14 AM UTC


Podcast.__init__

Data Exploration and Visualization Made Effortless with Lux

Summary

Data exploration is an important step in any analysis or machine learning project. Visualizing the data that you are working with makes that exploration faster and more effective, but having to remember and write all of the code to build a scatter plot or histogram is tedious and time consuming. In order to eliminate that friction Doris Lee helped create the Lux project, which wraps your Pandas data frame and automatically generates a set of visualizations without you having to lift a finger. In this episode she explains how Lux works under the hood, what inspired her to create it in the first place, and how it can help you create a better end result. The Lux project is a valuable addition to the toolbox of anyone who is doing data wrangling with Pandas.

Announcements

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • We’ve all been asked to help with an ad-hoc request for data by the sales and marketing team. Then it becomes a critical report that they need updated every week or every day. Then what do you do? Send a CSV via email? Write some Python scripts to automate it? But what about incremental sync, API quotas, error handling, and all of the other details that eat up your time? Today, there is a better way. With Census, just write SQL or plug in your dbt models and start syncing your cloud warehouse to SaaS applications like Salesforce, Marketo, Hubspot, and many more. Go to pythonpodcast.com/census today to get a free 14-day trial.
  • Are you bored with writing scripts to move data into SaaS tools like Salesforce, Marketo, or Facebook Ads? Hightouch is the easiest way to sync data into the platforms that your business teams rely on. The data you’re looking for is already in your data warehouse and BI tools. Connect your warehouse to Hightouch, paste a SQL query, and use their visual mapper to specify how data should appear in your SaaS systems. No more scripts, just SQL. Supercharge your business teams with customer data using Hightouch for Reverse ETL today. Get started for free at pythonpodcast.com/hightouch.
  • Your host as usual is Tobias Macey and today I’m interviewing Doris Lee about Lux, a Python library that facilitates fast and easy data exploration by automating the visualization and data analysis process

Interview

  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what Lux is and how the project got started?
  • What is the role of visualization in a data science workflow?
    • What are the challenges that data scientists face in the exploratory phase of their analysis?
  • There are a wide variety of data visualization tools in the Python ecosystem with differing areas of focus. What is the role of Lux in that ecosystem?
    • How does Lux compare to tools such as scikit-yb?
  • What is the workflow for someone using Lux in their analysis and what problems does it solve for them?
  • Can you talk through how Lux is architected?
    • How have the goals and design of Lux changed or evolved since you first began working on it?
  • Data visualization is a broad field. How do you determine which kinds of charts or plots are best suited to a particular data set or exploration?
  • What are some of the capabilities of Lux that are often overlooked or underutilized?
  • How has Lux impacted your own work in data analysis/data science?
  • What are some of the other gaps that you see in the available tooling for data science?
  • What are some of the most interesting, innovative, or unexpected ways that you have seen Lux used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on and with Lux?
  • When is Lux the wrong choice?
  • What do you have planned for the future of the project?

Keep In Touch

Picks

Closing Announcements

  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat

Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

May 04, 2021 12:54 AM UTC


Read the Docs

Read the Docs newsletter - May 2021

Welcome to a new edition of our monthly newsletter, where we openly share the most relevant updates around Read the Docs, offer a summary of new features we shipped during the previous month, and share what we’ll be focusing on in the near future.

Company highlights

  • The team keeps growing! Ra will join us next week to do account management for EthicalAds.
  • We reworked our deploy procedure to almost eliminate the need to interrupt builds. As a result, you should see fewer build retries during our deploy timeframe, normally Tuesday morning Pacific time.
  • After some careful deliberation, we started the process to deprecate recommonmark in favor of MyST, a better maintained alternative. We are excited to see that some projects have already migrated without effort, and we are looking forward to helping MyST-Parser thrive! Thanks to the Executable Books Project folks for creating this project.
  • Our frontend developer position is still open! We are actively looking for candidates, if you know people that could potentially be interested feel free to forward them the link above.

New features

You can always see the latest changes to our platforms in our Read the Docs Changelog and Ethical Ad Server Changelog.

Current focus & known issues

  • We are working with the Sphinx maintainers to help test the upcoming Sphinx 4.0.0 release and decrease the risk of breakage. It is an ongoing effort that requires finding a balance between stability and maintainability, and we hope we can do what is best for our users and the Sphinx community. Read the release notes to check for any breaking changes, and if in doubt, pin your declared dependency to <4.
  • Cloudflare has changed how their SSL works, so we’re figuring out how that might impact users of custom domains on our Community site. It will likely only impact projects that are proxying to us, not domains that follow our recommended custom domain configuration.

Upcoming features

  • Anthony will keep working on releasing sphinx_rtd_theme 1.0, start getting users to test our new user interface, and iron out our Data Processing Agreements.
  • On the EthicalAds side, David will be improving the KPI reporting for our publishers, and onboarding Ra on the team along with Eric.
  • Eric is focused on onboarding our new hire on the EthicalAds team, finishing a CZI grant proposal for funding in 2021-2022, and figuring out how the team can handle growing from 5 to 8 folks in 2021!
  • Juan Luis will work with Eric on the new CZI proposal, continue discussing with the Sphinx community about the new tutorial, and have more Customer Development calls with existing users.
  • Manuel will continue improving our operations and deployment procedures, make single sign-on discoverable by users, and release a new version of sphinx-hoverxref compatible with newer versions of Sphinx and MathJax.
  • And finally, Santos will wrap up our new infrastructure and configuration changes allowing users to install custom apt packages.

Considering using Read the Docs for your next Sphinx or MkDocs project? Check out our documentation to get started!

May 04, 2021 12:00 AM UTC

May 03, 2021


Mike Driscoll

An Overview of Image Processing with Python and Pillow (Video)

Learn how to edit and enhance photos using Pillow and the Python programming language.

What you’ll learn in this video:

The post An Overview of Image Processing with Python and Pillow (Video) appeared first on Mouse Vs Python.

May 03, 2021 11:00 PM UTC


Python Insider

Python 3.8.10, 3.9.5, and 3.10.0b1 are now available

This has been a very busy day for releases and on behalf of the Python development community we’re happy to announce the availability of three new Python releases.

Python 3.10 is now in Beta

Get it here: https://www.python.org/downloads/release/python-3100b1/

Python 3.10 is still in development. 3.10.0b1 is the first of four planned beta release previews. Beta release previews are intended to give the wider community the opportunity to test new features and bug fixes and to prepare their projects to support the new feature release.

We strongly encourage maintainers of third-party Python projects to test with 3.10 during the beta phase and report issues found to the Python bug tracker as soon as possible. While the release is planned to be feature complete entering the beta phase, it is possible that features may be modified or, in rare cases, deleted up until the start of the release candidate phase (Monday, 2021-08-02). Our goal is to have no ABI changes after beta 4 and as few code changes as possible after 3.10.0rc1, the first release candidate. To achieve that, it will be extremely important to get as much exposure for 3.10 as possible during the beta phase.

Please keep in mind that this is a preview release and its use is not recommended for production environments.

The next pre-release, the second beta release of Python 3.10, will be 3.10.0b2. It is currently scheduled for 2021-05-25. Please see PEP 619 for details.

Development Begins on Python 3.11

With Python 3.10 moving to beta, it received its own 3.10 branch in the repository. All new features are now targeting Python 3.11, to be released in October 2022.

Using the opportunity with the creation of the 3.10 branch, we renamed the master branch of the repository to main. It’s been a bit rocky but looks like we’re open for business. Please rename the main branch of your personal fork using the guide GitHub will give you when you go to your fork’s main page. In case of any outstanding issues, please contact the 3.11 RM.

Python 3.9.5

Get it here: https://www.python.org/downloads/release/python-395/

Python 3.9.5 is the newest stable maintenance release of the Python programming language, and it contains many bug fixes and optimizations. There have been 111 commits since 3.9.4, which is a similar amount compared to 3.8 at the same stage of the release cycle. See the change log for details.

On macOS, we encourage you to use the universal2 binary installer variant whenever possible. The legacy 10.9+ Intel-only variant will not be provided for Python 3.10 and the universal2 variant will become the default download for future 3.9.x releases. You may need to upgrade third-party components, like pip, to later versions once they are released. You may experience differences in behavior in IDLE and other Tk-based applications due to using the newer version of Tk. As always, if you encounter problems when using this installer variant, please check https://bugs.python.org for existing reports and for opening new issues.

The next Python 3.9 maintenance release will be 3.9.6, currently scheduled for 2021-06-28.

The Last Regular Bugfix Release of Python 3.8

Get it here: https://www.python.org/downloads/release/python-3810/

According to the release calendar specified in PEP 569, Python 3.8.10 is the final regular maintenance release. Starting now, the 3.8 branch will only accept security fixes and releases of those will be made in source-only form until October 2024. To keep receiving regular bug fixes, please upgrade to Python 3.9.

Compared to the 3.7 series, this last regular bugfix release is relatively dormant at 92 commits since 3.8.9. Version 3.7.8, the final regular bugfix release of Python 3.7, included 187 commits. But there’s a bunch of important updates here regardless, the biggest being macOS Big Sur and Apple Silicon build support. This work would not have been possible without the effort of Ronald Oussoren, Ned Deily, Maxime Bélanger, and Lawrence D’Anna from Apple. Thank you!

Take a look at the change log for details.

We hope you enjoy the new releases!

Your friendly release team,
Ned Deily @nad
Steve Dower @steve.dower
Pablo Galindo Salgado @pablogsal
Łukasz Langa @ambv

May 03, 2021 07:31 PM UTC