matplotlib - Create a histogram/bar chart for ratings/full numbers

In my continued work with matplotlib I wanted to plot a histogram (or bar chart) for a bunch of star ratings to see how they were distributed.

Before we do anything let’s import matplotlib as well as pandas:

import random
import pandas as pd

import matplotlib
matplotlib.use('TkAgg')
import matplotlib.pyplot as plt

plt.style.use('fivethirtyeight')

Next we’ll create a pandas Series of 100 randomly chosen star ratings between 1 and 5 (inclusive):

stars = pd.Series([random.randint(1, 5) for _ in range(0, 100)])

We want to plot a histogram showing how the ratings are distributed. The following code plots one with five bins, one per rating, and stores it in an SVG file:

_, ax1 = plt.subplots()
ax1.hist(stars, bins=5)  # five equal-width bins
plt.tight_layout()
plt.savefig("/tmp/hist.svg")
plt.close()

This is what the chart looks like:

[Image: hist.svg, the histogram produced by the code above]
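To see where each bar starts and ends, the bin edges can be computed directly. This is a minimal sketch, assuming numpy is installed (pandas and matplotlib both depend on it); the `sample_stars` values are made up for illustration and are not the random data generated earlier:

```python
import numpy as np

# Illustrative ratings (not the random data from the post); they include
# both 1 and 5 so the bins span the full 1-5 range.
sample_stars = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]

# np.histogram uses the same default binning as ax.hist(stars, 5):
# five equal-width bins between the smallest and largest value.
counts, edges = np.histogram(sample_stars, bins=5)

for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:.1f} to {right:.1f}: {count}")
```

With these values the edges come out as 1.0, 1.8, 2.6, 3.4, 4.2 and 5.0, so every bar is 0.8 wide and none of them is centred on a whole number.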

This is OK, but the labels on the x axis are a bit weird: the value for each rating doesn’t line up with its bar. I came across this StackOverflow post, which shows how to solve this problem by using a bar chart instead. I ended up with this code:

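_, ax2 = plt.subplots()

# Count how often each rating occurs, ordered by rating (1 to 5)
stars_histogram = stars.value_counts().sort_index()
# Convert the counts to percentages of all ratings
stars_histogram /= float(stars_histogram.sum())
stars_histogram *= 100

# One bar per rating, drawn on the axes created above
stars_histogram.plot(kind="bar", width=1.0, ax=ax2)
plt.tight_layout()
plt.savefig("/tmp/bar.svg")
plt.close()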

This is what the chart looks like now:

[Image: bar.svg, the bar chart produced by the code above]

Much better!
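If you would rather stay with a histogram, one option is to hand `hist` explicit bin edges at the half-way points, so that each bar is centred on a whole number. This is only a sketch and not what the StackOverflow post suggests; the Agg backend, the fixed seed, the use of `weights` to get percentages and the output file name are all assumptions made for illustration:

```python
import os
import random
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, enough for saving files
import matplotlib.pyplot as plt
import numpy as np

random.seed(42)  # assumption: fixed seed so the sketch is repeatable
stars = [random.randint(1, 5) for _ in range(100)]

# Bin edges at 0.5, 1.5, ..., 5.5 so each bin is centred on a whole rating
edges = np.arange(0.5, 6.5, 1.0)

# Weight every rating equally so the bar heights are percentages summing to 100
weights = np.full(len(stars), 100.0 / len(stars))

fig, ax = plt.subplots()
heights, used_edges, _ = ax.hist(stars, bins=edges, weights=weights)
ax.set_xticks(range(1, 6))  # one tick per rating, in the middle of its bar

fig.tight_layout()
# Hypothetical output file name, written to the system temp directory
out_path = os.path.join(tempfile.gettempdir(), "hist_aligned.svg")
fig.savefig(out_path)
plt.close(fig)
```

The bar chart version is still shorter, but this keeps everything in matplotlib and would also work for ratings that arrive as a plain list rather than a pandas Series.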
