Estimated reading time: 4 minutes
Taking it further with a python excel chart.
We have created some video content here using Python with Excel that illustrates the different ways you can leverage Python to
- Data cleanse
- Find unwanted characters
- And see if the file is empty before import!
What this video is giving as output
Here we are looking to introduce charts in Excel, and how how to use Python to easily work with your data and export to an excel sheet.
Below is the final output of our two charts, this is for illustrative purposes, and is taken from the Irish Government’s website as at 1st May 2020, importing the cell ranges associated with them.
How we went about it
Step 1 – Importing the data
import pandas as pd
import openpyxl
importfile = openpyxl.load_workbook("C://dataanalyticsireland/Python Code/youtube/codeused/files/Covid19CountyStatisticsHPSCIreland01052020.xlsx")
#Telling the program to go to each individual sheet
Dublin = importfile["Dublin"]
Cork = importfile["Cork"]
Limerick = importfile["Limerick"]
Galway = importfile["Galway"]
#Telling the program for each sheet , create a dataframe.
dfDublin = pd.DataFrame(Dublin.values)
dfCork = pd.DataFrame(Cork.values)
dfLimerick = pd.DataFrame(Limerick.values)
dfGalway = pd.DataFrame(Galway.values)
Step 2 – Splitting the different sheets out into different data frames
### Showing to break down here, further on I cocatenate the dataframes ###
#This brings in all the values from the sheet into a variable so they can be read.
datadublin = Dublin.values
datacork = Cork.values
datalimerick = Limerick.values
datagalway = Galway.values
#This tells the program that the headers are on row 0.
columnsdublin = next(datadublin)[0:]
columnscork = next(datacork)[0:]
columnslimerick = next(datalimerick)[0:]
columnsgalway = next(datagalway)[0:]
# The imported file has the column headings on row 0, moving them to be the actual column headings instead of numerical values.
dfDublin = pd.DataFrame(datadublin, columns=columnsdublin)
dfCork = pd.DataFrame(datacork, columns=columnscork)
dfLimerick = pd.DataFrame(datalimerick, columns=columnslimerick)
dfGalway = pd.DataFrame(datagalway, columns=columnsgalway)
Step 3 – Exports all three data frames to a separate tab each to the below file
# This starts the process of where to save the output by creating a writer variable
writer = pd.ExcelWriter('C://dataanalyticsireland/Python Code/youtube/codeused/files/Covid19output01052020.xlsx')
# Save the dataframe "Dublin" to Dublin sheet etc...
dfDublin.to_excel(writer, 'Dublin')
dfCork.to_excel(writer, 'Cork')
dfLimerick.to_excel(writer, 'Limerick')
dfGalway.to_excel(writer, 'Galway')
# save the excel file
writer.save()
Step 4 – Merging all the data to allow the charts to be created easily
mergedPDList = [dfDublin,dfCork,dfLimerick,dfGalway] # Creating a new merged list of the dataframes, easier when doing charts below
mergeddf = pd.concat(mergedPDList) # physical merging of the dataframes
Step 5 – Creating a bar chart
import matplotlib.pyplot as plt
import matplotlib.pyplot as pltline
import seaborn as sns
#from openpyxl.drawing.image import Image
#This makes the seaborn library over ride the matplotlib properites for styling
sns.set()
plt.figure(figsize =(9,6))
plt.bar(x = mergeddf['CountyName'],
height = mergeddf['ConfirmedCovidCases']
)
plt.xticks(rotation = 45)
plt.title("No of Confirmed Covid Cases in the Republic of Ireland", fontsize=25, fontweight='bold')
plt.xlabel('Area')
plt.ylabel('No of cases')
plt.width = 35
img = plt.savefig('C://dataanalyticsireland/Python Code/youtube/codeused/files/barchart.png',bbox_inches='tight')
plt.show()
Step 6 – Creating a line chart
x = dfDublin['ConfirmedCovidCases']
x1 = dfCork['ConfirmedCovidCases']
x2 = dfLimerick['ConfirmedCovidCases']
x3 = dfGalway['ConfirmedCovidCases']
#Sets the size of the output before it is created
pltline.figure(figsize=(15,15))
pltline.plot(x)
pltline.plot(x1)
pltline.plot(x2)
pltline.plot(x3)
pltline.legend(['Dublin', 'Cork','Limerick','Galway'], fontsize="x-large")
pltline.ylabel('No of cases')
pltline.xlabel('No of days elapsed')
img1 = plt.savefig('C://dataanalyticsireland/Python Code/youtube/codeused/files/linegraph.png')
pltline.show()
Step 7 – Exporting charts to the XLSX file
from openpyxl import load_workbook
#from openpyxl.drawing.image import Image
import openpyxl
#This step opens the existing file.
wsnew = load_workbook('C://dataanalyticsireland/Python Code/youtube/codeused/files/Covid19output01052020.xlsx')
#This step creates a new worksheet called Barchart
newsheet = wsnew.create_sheet("Bar Chart",4) #Creates a new sheet in fourth position
newsheet1 = wsnew.create_sheet("Line Graph",5) #Creates a new sheet in fifth position
#Adding the image to Newsheet which is called "Barchart" in the excel file
img = openpyxl.drawing.image.Image('C://dataanalyticsireland/Python Code/youtube/codeused/files/barchart.png')
img1 = openpyxl.drawing.image.Image('C://dataanalyticsireland/Python Code/youtube/codeused/files/linegraph.png')
img.anchor = 'B2' # Tells what cell to put the imahe in
img1.anchor = 'B2' # Tells what cell to put the imahe in
newsheet.add_image(img) # Adds the image
newsheet1.add_image(img1) #Adds the image
wsnew.save('C://dataanalyticsireland/Python Code/youtube/codeused/files/Covid19output01052020.xlsx')
wsnew.close()
In the below video on creating a python excel chart, we have approached this as follows:
- Created four separate data frames, they are the four regions that will feed into the creation of the graphs.
- Separate to this, we merged the four data frames into one, to use with the bar chart.
And to finish off
- We created matplotlib charts and
- We exported the file to Excel using openpyxl.
If you like what this video has explained, please click to see our YouTube channel for more informative videos.
Data Analytics Ireland