Python: A Quick Visualization Reference

Just some simple plots for reference

A quick guide to building simple charts with matplotlib. I intend to update it later with other Python visualization packages such as seaborn.

Visualizing Data

A quick guide to creating plots in Python

Line Chart

In [1]:
from matplotlib import pyplot as plt

years = [1950, 1960, 1970, 1980, 1990, 2000, 2010]
gdp = [300.2, 543.3, 1075.9, 2862.5, 5979.6, 10289.7, 14958.3]

### Create a simple line chart
plt.plot(years, gdp, c="g", marker="o", linestyle="solid")

### Add a title
plt.title("Nominal GDP")

### Add a label to the y-axis
plt.ylabel("Billions of $")
plt.show()
In [2]:
### Create a variance list in which the variance doubles each time
variance = [1]
x = 1
while x < 255:
    x = x * 2
    variance.append(x)

### Create a bias-squared list in which the value halves each time
bias_squared = []
while x > 0:
    bias_squared.append(x)
    x = x // 2

total_error = [x + y for x, y in zip(variance, bias_squared)]
xs = [i for i, _ in enumerate(variance)]

plt.plot(xs, variance, 'g-', label='variance') # green solid line
plt.plot(xs, bias_squared, 'r-.', label='bias^2') # red dot-dashed line
plt.plot(xs, total_error, 'b:', label='total error') # blue dotted line
plt.legend(loc=9) # 9 = top center
plt.xlabel("Model Complexity")
plt.title("The Bias-Variance Trade-Off")
plt.show()
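
The two while loops above build powers of two, ascending for the variance and descending for the bias squared. As a minimal sketch, the same lists could also be written with a comprehension and a reversed slice. The constant `n_points` is a name introduced here for illustration and is not part of the original notebook.

```python
# Hypothetical constant (not in the original): number of points on each curve
n_points = 9

# Variance doubles at each step: 1, 2, 4, ..., 256
variance = [2 ** i for i in range(n_points)]

# Bias squared is the same sequence in reverse: 256, 128, ..., 1
bias_squared = variance[::-1]

# Total error is the element-wise sum of the two curves
total_error = [v + b for v, b in zip(variance, bias_squared)]

# The x-coordinates are simply the list indices
xs = list(range(n_points))

print(total_error)
```

These lists can be passed to the same `plt.plot` calls as before. The total error is lowest in the middle of the range, which is the trade-off the chart illustrates.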

Bar Chart

In [3]:
movies = ["Annie Hall", "Ben-Hur", "Casablanca", "Gandhi", "West Side Story"] 
num_oscars = [5, 11, 3, 8, 10]

### Bars have a default width of 0.8 and are centered on their x-coordinates
### by default in Matplotlib 2.0 and later, so no manual offset is needed
xs = [i for i, _ in enumerate(movies)]

### Plot the bars with x-coordinates [xs], heights [num_oscars]
plt.bar(xs, num_oscars)

plt.ylabel("Number of Oscars")
plt.title("Movies")

### Label X axis with movie names
plt.xticks([i for i, _ in enumerate(movies)], movies)

plt.show()

Histograms

In [4]:
from collections import Counter

grades = [83,95,91,87,70,0,85,82,100,67,73,77,0]
### A lambda function is a small anonymous function.
### The following function could also be written as
### def decile(grade):
###     return grade // 10 * 10
decile = lambda grade: grade // 10 * 10

### Create the buckets: decile() takes a grade (e.g. 84), uses integer
### division '//' to drop the last digit (84 // 10 = 8), and multiplies by 10
### to get the bucket (8 * 10 = 80). Counter then counts the grades in each bucket.
histogram = Counter(decile(grade) for grade in grades)

### Plot the histogram
plt.bar(list(histogram.keys()), list(histogram.values()), width=8)

### Add axis values
plt.axis([-5, 105, 0, 5])
plt.xticks([10 * i for i in range(11)])
plt.xlabel("Decile")
plt.ylabel("# of Students")
plt.title("Distribution of Student Grades")
plt.show()
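
To see what the bar chart above is drawing, it can help to print the bucket counts before plotting. The following is a small sketch that reuses the same `grades` list and writes `decile` as a named function instead of a lambda. It needs only the standard library.

```python
from collections import Counter

grades = [83, 95, 91, 87, 70, 0, 85, 82, 100, 67, 73, 77, 0]

def decile(grade):
    """Round a grade down to the nearest multiple of 10 (e.g. 84 -> 80)."""
    return grade // 10 * 10

# Count how many grades fall into each bucket
histogram = Counter(decile(grade) for grade in grades)

# Print the buckets in ascending order with their counts
for bucket in sorted(histogram):
    print(bucket, histogram[bucket])
```

The keys of `histogram` become the x-coordinates of the bars and the values become their heights.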

Scatterplots

In [5]:
friends = [ 70,  65,  72,  63,  71,  64,  60,  64,  67]
minutes = [175, 170, 205, 120, 220, 130, 105, 145, 190]
labels =  ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']

plt.scatter(friends, minutes)

### label each point
for label, friend_count, minute_count in zip(labels, friends, minutes):
    plt.annotate(label,
                xy=(friend_count, minute_count),
                xytext=(5, -5),
                textcoords='offset points')

plt.title("Daily Minutes vs Number of Friends")
plt.xlabel("# of friends")
plt.ylabel("daily minutes spent on the site")
plt.show()
In [6]:
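### Left to itself, Matplotlib chooses the scale of each axis independently,
### which can make the two axes hard to compare.
### Calling plt.axis("equal") gives both axes the same scale.

test_1_grades = [ 99, 90, 85, 97, 80]
test_2_grades = [100, 85, 60, 90, 70]

plt.scatter(test_1_grades, test_2_grades)
plt.title("Axes Are Comparable")
plt.xlabel("test 1 grade")
plt.ylabel("test 2 grade")
plt.axis("equal")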
plt.show()
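
The choice of axis scaling matters for this data because the two tests cover quite different ranges. As a rough sketch, the spread of each list can be checked directly. The helper name `spread` is introduced here for illustration and is not part of the original notebook.

```python
test_1_grades = [99, 90, 85, 97, 80]
test_2_grades = [100, 85, 60, 90, 70]

def spread(values):
    """Hypothetical helper: distance between the largest and smallest value."""
    return max(values) - min(values)

print("test 1 spread:", spread(test_1_grades))
print("test 2 spread:", spread(test_2_grades))
```

Test 2 spans roughly twice the range of test 1. When each axis is scaled independently, that difference is hidden. With `plt.axis("equal")` it stays visible.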