Pandas: Create matplotlib plot with x-axis label not index
I’ve been using matplotlib a bit recently, and wanted to share a lesson I learnt about choosing the label of the x-axis. Let’s first import the libraries we’ll use in this post:
import pandas as pd
import matplotlib.pyplot as plt
And now we’ll create a DataFrame of values that we want to chart:
df = pd.DataFrame({
"name": ["Mark", "Arya", "Praveena"],
"age": [34, 1, 31]
})
df
This is what our DataFrame looks like:
       name  age
0      Mark   34
1      Arya    1
2  Praveena   31
Now let’s plot a bar chart of that DataFrame:
df.plot.bar()
plt.tight_layout()
plt.show()
If we run that code we’ll see this chart:
The chart itself looks fine, but the labels on the x-axis are a bit weird. They’re 0, 1, and 2, whereas we want them to use the values in the name column of our DataFrame.
I was a bit confused at first, but eventually realised that they were the index values of our rows. We can see that by executing the following code:
>>> df.index.values
array([0, 1, 2])
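If we want to confirm that the tick labels really do come from the index, one option is to read them back from the Axes object that df.plot.bar() returns. This is only a minimal sketch: the Agg backend line is my assumption so that it runs without a display, and the labels variable is just an illustrative name.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "name": ["Mark", "Arya", "Praveena"],
    "age": [34, 1, 31]
})

# df.plot.bar() returns the matplotlib Axes it drew on
ax = df.plot.bar()

# Read the text of each x-axis tick label
labels = [tick.get_text() for tick in ax.get_xticklabels()]
print(labels)  # ['0', '1', '2'], the default index rendered as strings
```

The labels match df.index.values, which is why the names never show up on the chart.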
There are a couple of ways that we can fix our chart.
The first is to use the name column as our index, an approach I learnt from Josh Devlin’s blog post.
We can set the index by running the following code:
df.set_index("name", drop=True, inplace=True)
Let’s check the index values:
>>> df.index.values
array(['Mark', 'Arya', 'Praveena'], dtype=object)
Ah, much better! Now we can plot our chart again:
df.plot.bar()
plt.tight_layout()
plt.show()
If we run that code we’ll see this chart:
The x-axis labels now show the names from our DataFrame.
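If you’d rather not modify the DataFrame in place, a variation on this approach is to plot from the copy that set_index returns. This is a sketch of that alternative rather than the approach used above; the indexed and labels names are ones I’ve made up for illustration, and the Agg backend line is an assumption so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "name": ["Mark", "Arya", "Praveena"],
    "age": [34, 1, 31]
})

# Without inplace=True, set_index returns a new DataFrame and leaves df alone
indexed = df.set_index("name")

ax = indexed.plot.bar()
labels = [tick.get_text() for tick in ax.get_xticklabels()]
print(labels)            # ['Mark', 'Arya', 'Praveena']
print(list(df.columns))  # ['name', 'age'], the original still has both columns

plt.tight_layout()
```

This keeps name available as a regular column in df for anything else we want to do with it.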
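We can also achieve the same outcome without touching the index by specifying the x parameter when we call the bar method. We recreate the DataFrame first, because name is no longer a column after the previous step: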
df = pd.DataFrame({
"name": ["Mark", "Arya", "Praveena"],
"age": [34, 1, 31]
})
df.plot.bar(x="name")
plt.tight_layout()
plt.show()
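As a quick check on this second approach, the sketch below reads the tick labels back and confirms that the DataFrame keeps its default index. Passing y="age" is optional here and is my addition to make the plotted column explicit; the labels variable and the Agg backend line are likewise assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "name": ["Mark", "Arya", "Praveena"],
    "age": [34, 1, 31]
})

# x picks the column used for the tick labels, y the column used for bar heights
ax = df.plot.bar(x="name", y="age")

labels = [tick.get_text() for tick in ax.get_xticklabels()]
print(labels)          # ['Mark', 'Arya', 'Praveena']
print(list(df.index))  # [0, 1, 2], the DataFrame itself is unchanged

plt.tight_layout()
```

The difference from the set_index approach is that the choice of labels lives in the plotting call, so the DataFrame stays exactly as we created it.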
About the author
I'm currently working on short form content at ClickHouse. I publish short 5 minute videos showing how to solve data problems on YouTube @LearnDataWithMark. I previously worked on graph analytics at Neo4j, where I also co-authored the O'Reilly Graph Algorithms Book with Amy Hodler.