Histogram with Python
A histogram displays the distribution of data over a continuous interval or a specific time period. The range of values is split into intervals (bins), and the height of each bar indicates the frequency of data points that fall within that bin. It is a useful tool for seeing where values are concentrated and whether the dataset contains extreme values or gaps.
More about: Histogram
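To make the idea of bins and frequencies concrete, here is a minimal sketch that counts values per bin with NumPy before any plotting is involved. The `ages` array is invented purely for illustration and is not part of the UNHCR dataset.

```python
import numpy as np

# Hypothetical ages, made up for this illustration only
ages = np.array([2, 5, 7, 12, 18, 21, 25, 33, 34, 41, 58, 76])

# Four equal-width bins covering 0 to 80; each bin includes its left edge,
# and only the last bin also includes its right edge
counts, edges = np.histogram(ages, bins=4, range=(0, 80))

# Print one line per bin: the interval and how many values fall inside it
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:.0f}-{right:.0f}: {count}")
```

Each count is what a histogram would draw as the height of the corresponding bar.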
Histogram
# import libraries
import matplotlib.pyplot as plt
import pandas as pd
# apply the UNHCR chart style (assumes the unhcrpyplotstyle package is installed)
plt.style.use(['unhcrpyplotstyle','histogram'])
# load the data set
df = pd.read_csv('https://raw.githubusercontent.com/GDS-ODSSS/unhcr-dataviz-platform/master/data/distribution/histogram.csv')
# select the column to plot and set the number of bins
x = df['poc_age']
num_bins = 25
# plot the chart
fig, ax = plt.subplots()
histo = ax.hist(x, bins=num_bins)
# set x and y axis limits
ax.set_xlim(0, 100)
ax.set_ylim(0, 35)
#set chart title
ax.set_title('Age distribution | 2020')
#set axis label
ax.set_ylabel('Number of people')
ax.set_xlabel('Age')
#set chart source and copyright
plt.annotate('Source: UNHCR Refugee Data Finder', (0,0), (0, -25), xycoords='axes fraction', textcoords='offset points', va='top', color = '#666666', fontsize=9)
plt.annotate('©UNHCR, The UN Refugee Agency', (0,0), (0, -35), xycoords='axes fraction', textcoords='offset points', va='top', color = '#666666', fontsize=9)
#adjust chart margin and layout
fig.tight_layout()
#show chart
plt.show()
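The script above fixes the number of bins at 25. As a rough sketch of what that choice implies, the snippet below computes the resulting bin width over the 0 to 100 age range and compares it with a data-driven alternative (the Freedman-Diaconis rule, available in NumPy as `bins="fd"`). The ages here are synthetic random values, an assumption made so the example runs without downloading the CSV.

```python
import numpy as np

# Synthetic ages between 0 and 99, generated only for illustration
rng = np.random.default_rng(0)
ages = rng.integers(0, 100, size=500)

# Fixed choice: 25 equal-width bins across the 0-100 range used for the x axis
fixed_edges = np.histogram_bin_edges(ages, bins=25, range=(0, 100))
fixed_width = fixed_edges[1] - fixed_edges[0]
print(f"25 bins over 0-100 -> each bin spans {fixed_width:.0f} years")

# Data-driven choice: let the Freedman-Diaconis rule pick the bin edges
fd_edges = np.histogram_bin_edges(ages, bins="fd")
print(f"Freedman-Diaconis rule -> {len(fd_edges) - 1} bins")
```

Fewer, wider bins smooth out the shape of the distribution, while more, narrower bins reveal detail but can look noisy, so it is worth trying a couple of values before settling on one.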