This tutorial will show how you can plot bar charts using Python with detailed examples.
Similar to line charts, bar charts show the relationship between X (on the x-axis) and Y (on the y-axis). I will first use the same data as in the line chart tutorial to illustrate how to plot bar charts. Then, I will use another data set to show how the use cases of line charts and bar charts differ.
Example 1: A simple example
The following two parts of code show how to plot a bar chart in Python. The first part uses NumPy to generate the data for X and Y, with the relationship Y = X², and prints the data as a pandas DataFrame. The second part plots the same data with Matplotlib's plt.bar().
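import numpy as np
import pandas as pd
# Generate 10 evenly spaced X values between 0 and 20
x_simple = np.linspace(0, 20, 10)
# Y is the square of X
y_simple = x_simple * x_simple
# Put X and Y into a DataFrame and print it as a table
d = {'x_simple': x_simple, 'y_simple': y_simple}
pd_df = pd.DataFrame(data=d)
print(pd_df)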
    x_simple    y_simple
0   0.000000    0.000000
1   2.222222    4.938272
2   4.444444   19.753086
3   6.666667   44.444444
4   8.888889   79.012346
5  11.111111  123.456790
6  13.333333  177.777778
7  15.555556  241.975309
8  17.777778  316.049383
9  20.000000  400.000000
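import numpy as np
import matplotlib.pyplot as plt
# Recreate the same data so that this part runs on its own
x_simple = np.linspace(0, 20, 10)
y_simple = x_simple * x_simple
# Draw one bar per (X, Y) pair and label the axes
plt.bar(x_simple, y_simple)
plt.xlabel("X")
plt.ylabel("Y")
plt.show()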
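The bars produced by the code above can look thin because plt.bar() uses a default bar width of 0.8 (in x-axis units), while the X values here are about 2.22 apart. As a minimal sketch, assuming you want each bar to fill 80% of the gap between neighboring X values, the width can be derived from the spacing. The name spacing and the 0.8 factor are illustrative choices, not part of the original example.

```python
import matplotlib
matplotlib.use("Agg")  # draw off-screen; remove this line to get an interactive window
import matplotlib.pyplot as plt
import numpy as np

# Same data as in Example 1
x_simple = np.linspace(0, 20, 10)
y_simple = x_simple * x_simple

# Distance between two neighboring X values (illustrative helper variable)
spacing = x_simple[1] - x_simple[0]

fig, ax = plt.subplots()
# Scale the bar width to the spacing instead of relying on the default of 0.8
bars = ax.bar(x_simple, y_simple, width=0.8 * spacing)
ax.set_xlabel("X")
ax.set_ylabel("Y")

# In an interactive session, drop the Agg line above and call plt.show() here.
```

With the default width, the same call without the width argument would leave wide gaps between the bars; whether that is desirable depends on how dense you want the chart to look.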
Example 2: How to plot stock fundamentals using bar charts
In this section, I will show you how to use Python for stock fundamental analysis.
The data set below includes three expense columns: RD Expenses, Sales and Marketing, and General Admin Expenses. We can plot them in the same bar chart, with all three on the y-axis and the Quarter column on the x-axis. The following code downloads the data from GitHub and prints it out.
import pandas as pd
MSFT_data=pd.read_csv("https://raw.githubusercontent.com/TidyPython/data_visualization/main/data_MSFT_T.csv")
print(MSFT_data)
    Quarter  RD Expenses  Sales and Marketing  General Admin Expenses
 0   2017Q1         3355                 3879                    1202
 1   2017Q2         3514                 4356                    1355
 2   2017Q3         3574                 3812                    1166
 3   2017Q4         3504                 4562                    1109
 4   2018Q1         3715                 4335                    1208
 5   2018Q2         3933                 4760                    1271
 6   2018Q3         3977                 4098                    1149
 7   2018Q4         4070                 4588                    1132
 8   2019Q1         4316                 4565                    1179
 9   2019Q2         4513                 4962                    1425
10   2019Q3         4565                 4337                    1061
11   2019Q4         4603                 4933                    1121
12   2020Q1         4887                 4911                    1273
13   2020Q2         5214                 5417                    1656
14   2020Q3         4926                 4231                    1119
15   2020Q4         4899                 4947                    1139
16   2021Q1         5204                 5082                    1327
17   2021Q2         5687                 5857                    1522
18   2021Q3         5599                 4547                    1287
19   2021Q4         5758                 5379                    1384
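import matplotlib.pyplot as plt
# Plot RD Expenses (y-axis) against Quarter (x-axis); the column names are looked up in MSFT_data
plt.bar('Quarter', 'RD Expenses', data=MSFT_data)
# Place a tick at every third quarter only, to keep the x-axis readable
plt.gca().xaxis.set_major_locator(plt.MultipleLocator(3))
plt.xlabel("Quarter")
plt.ylabel("RD Expenses")
plt.show()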
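The data= keyword in plt.bar() lets you pass column names as strings instead of the columns themselves. The sketch below types in the first four quarters from the table above by hand, so it runs without downloading anything, and draws the same bars with both call styles. The names sample, ax_left, and ax_right are made up for this illustration.

```python
import matplotlib
matplotlib.use("Agg")  # draw off-screen; remove this line to get an interactive window
import matplotlib.pyplot as plt
import pandas as pd

# First four quarters, copied by hand from the printed table
sample = pd.DataFrame({
    "Quarter": ["2017Q1", "2017Q2", "2017Q3", "2017Q4"],
    "RD Expenses": [3355, 3514, 3574, 3504],
})

fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(8, 3), sharey=True)

# Style 1: pass column names and let Matplotlib look them up in the DataFrame
bars_by_name = ax_left.bar("Quarter", "RD Expenses", data=sample)
ax_left.set_title("data= keyword")

# Style 2: pass the columns directly
bars_by_column = ax_right.bar(sample["Quarter"], sample["RD Expenses"])
ax_right.set_title("columns passed directly")

# Collect the bar heights from both charts for comparison
heights_by_name = [float(bar.get_height()) for bar in bars_by_name]
heights_by_column = [float(bar.get_height()) for bar in bars_by_column]
print(heights_by_name == heights_by_column)

# In an interactive session, drop the Agg line above and call plt.show() here.
```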
Besides R&D, there are two other columns: Sales and Marketing, and General Admin Expenses. We can plot them in the same chart as well.
The following is the complete Python code and figure output. Note that the plotting line below does not call plt, the pyplot interface of Matplotlib, directly. Instead, it uses the pandas method pandas.DataFrame.plot, which is built on top of Matplotlib. Because the figure is still a Matplotlib figure, you need to call plt.show() at the end to display it.
import pandas as pd
import matplotlib.pyplot as plt
MSFT_data = pd.read_csv("https://raw.githubusercontent.com/TidyPython/data_visualization/main/data_MSFT_T.csv")
# Grouped (side-by-side) bars: one bar per expense column for each quarter
MSFT_data.plot(x='Quarter', kind='bar', stacked=False)
plt.show()
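Setting stacked=False places the three expense bars side by side for each quarter. For comparison, here is a small sketch with stacked=True, which piles the three expenses on top of each other so that the total bar height is their sum for the quarter. It uses the first four quarters typed in by hand from the table above, and the name sample is made up for this illustration.

```python
import matplotlib
matplotlib.use("Agg")  # draw off-screen; remove this line to get an interactive window
import matplotlib.pyplot as plt
import pandas as pd

# First four quarters, copied by hand from the printed table
sample = pd.DataFrame({
    "Quarter": ["2017Q1", "2017Q2", "2017Q3", "2017Q4"],
    "RD Expenses": [3355, 3514, 3574, 3504],
    "Sales and Marketing": [3879, 4356, 3812, 4562],
    "General Admin Expenses": [1202, 1355, 1166, 1109],
})

# stacked=True piles the three columns on top of each other
ax = sample.plot(x="Quarter", kind="bar", stacked=True)
ax.set_ylabel("Expenses")

# Each container holds the bars of one column, in column order
rd_bars, sales_bars, admin_bars = ax.containers

# Top of the stacked bar for 2017Q1 is the sum of the three expenses
total_2017q1 = admin_bars[0].get_y() + admin_bars[0].get_height()
print(total_2017q1)

# In an interactive session, drop the Agg line above and call plt.show() here.
```

A stacked chart makes the total per quarter easy to read, while the side-by-side version makes it easier to compare the individual expense categories.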