Python and Visualizing Data 2018

Ahoy Python Workshop’eneers! Time to head to uncharted waters with our data and attempt to graph it!

We will start by installing Seaborn through our command line/terminal:

pip install seaborn
pip install git+

Doing so will install Seaborn along with all of its requirements, notably “pandas” and “numpy.” Python’s Pandas is actually not pronounced like the cuddly herbivorous bear:

[Image: a panda and a python]

But instead:

[Image: pan+DAS logo]

The name comes from “panel data,” though the library has a wide range of other uses. We will also install “matplotlib,” which is the plotting library for Python that Seaborn is built on:

pip install matplotlib

We could forgo Seaborn and use those three packages on their own (matplotlib, pandas, and numpy), but Seaborn provides us with prettier graphs and a more streamlined way to interact with our data.

With the installations out of the way, we will now start scripting!

Let’s start by bringing in Pandas for handling the data. For these imports we will be using an alias, so that we can type pd instead of “pandas” each time we want to use it.

# use pandas for data frame
import pandas as pd

Now we’ll bring in matplotlib to customize our graphs:

from matplotlib import pyplot as plt

Our last import will be the Seaborn module:

import seaborn as sns

Now let’s bring in our data set:

df = pd.read_csv('average_temperature_2017_to_2017.csv')

Here we created a variable called df and used read_csv to open the “average_temperature_2017_to_2017.csv” CSV file from our previous session.
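
If you don't have the workshop file handy, here is a minimal sketch of read_csv reading from an in-memory string instead. The column names Year-Month and Avg are the ones used later in this tutorial; the three rows of values and the name sample_df are made up purely for illustration.

```python
import io

import pandas as pd

# Made-up sample rows; only the column names come from the tutorial
csv_text = """Year-Month,Avg
2017-01,31.5
2017-02,35.2
2017-03,42.8
"""

# read_csv accepts a file path or any file-like object, such as StringIO
sample_df = pd.read_csv(io.StringIO(csv_text))

# shape is (rows, columns)
print(sample_df.shape)
```

The first line of the CSV becomes the column names, and every following line becomes one row of the data frame.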

Let’s look at our data using the head() command:

result = df.head()
print("{}".format(result))

What this does is pull the first 5 records of our dataset so we can see how it looks.
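
As a small sketch of how head() behaves, the example below builds a made-up data frame (the twelve values and the name sample_df are assumptions for illustration, not the workshop data) and asks for different numbers of rows.

```python
import pandas as pd

# Hypothetical data: twelve made-up monthly averages
sample_df = pd.DataFrame({
    "Year-Month": ["2017-{:02d}".format(m) for m in range(1, 13)],
    "Avg": [30.0 + m for m in range(1, 13)],
})

first_five = sample_df.head()    # default: the first 5 rows
first_three = sample_df.head(3)  # pass a number to get that many rows
last_two = sample_df.tail(2)     # tail() works from the other end

print(first_three)
```

Passing a number to head() is handy when 5 rows is too many or too few to get a feel for the data.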

Using Pandas we can also do neat stuff like look at descriptive statistics of the data:

result = df.describe()
print("{}".format(result))
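
To make the output of describe() less mysterious, here is a minimal sketch on four made-up numbers (the values and the name sample_df are assumptions for illustration). It shows that the result is itself a data frame, so individual statistics can be pulled out by name.

```python
import pandas as pd

# Made-up values chosen so the statistics are easy to check by hand
sample_df = pd.DataFrame({"Avg": [10.0, 20.0, 30.0, 40.0]})

# describe() returns a data frame indexed by statistic name
stats = sample_df.describe()

print(stats.loc["count", "Avg"])  # 4.0
print(stats.loc["mean", "Avg"])   # 25.0
print(stats.loc["max", "Avg"])    # 40.0
```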

Part 2: Time to Chart Our Data!

Now that we have a good grasp of how our data looks, we are going to create a neat little chart representing it.

We will first sort the data to go from largest to smallest by setting ascending to “False” on the field called “Avg”:

result = df.sort_values(by='Avg', ascending=False)
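
Here is a small sketch of the same sort on made-up data (the three rows and the name sample_df are assumptions for illustration). It also shows that sort_values returns a new, sorted data frame and leaves the original untouched, which is why we store the result in its own variable.

```python
import pandas as pd

# Made-up rows, deliberately out of order
sample_df = pd.DataFrame({
    "Year-Month": ["2017-01", "2017-02", "2017-03"],
    "Avg": [31.5, 42.8, 35.2],
})

# Largest to smallest on the Avg column
sorted_df = sample_df.sort_values(by="Avg", ascending=False)

print(sorted_df["Avg"].tolist())  # [42.8, 35.2, 31.5]
print(sample_df["Avg"].tolist())  # [31.5, 42.8, 35.2] (original order kept)
```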

Now we will finally use Seaborn to graph the data:

sns.lineplot(x='Year-Month', y='Avg', data=result)

The syntax is pretty straightforward, where sns is Seaborn and lineplot is the chart type.

x= names the column for the X-axis, y= names the column for the Y-axis, and data=result selects the data frame to plot.

If you run your code now… nothing will happen (unless you are using a Jupyter notebook).

What we need to add is a call to show() from pyplot, which we imported earlier as plt:

plt.show()

And there you arrrrrre! Your chart should appear:

If you got lost along the way, here is how your code should look:

import pandas as pd
from matplotlib import pyplot as plt
import seaborn as sns

df = pd.read_csv('average_temperature_2017_to_2017.csv')

result = df.sort_values(by='Avg', ascending=False)

sns.lineplot(x='Year-Month', y='Avg', data=result)

plt.show()
