How to Create Multiple XML Files From Excel With Python

So you have an Excel file that has hundreds or maybe thousands of records, but you want to convert it to an XML file, and have each row as a separate file?

This can be achieved very easily using Python, in three easy steps.

First of all, what is XML?

XML stands for “extensible markup language”, and is used extensively with websites, Webservices and where companies want to transfer data between each other in a predefined format that follows a recognised standard.

To understand more about what is xml, follow the link. One thing to note is that it may reference the financial services industry, but XML is used widely across multiple industries too, as we are now digitising a lot of our processes, the transfer of data needs to follow a standard that everyone can work with seamlessly.

Step 1 – Let’s look at our raw data

Below is an excel file with three columns and eight rows. The objective is to create eight separate XML files for each row.

raw data to populate multiple excel files

Step 2 – Import the data and create a master XML file

Below is the logic I used in How to Create an XML File from Excel using Python, I have attached comments where necessary to explain what each line is doing.

The purpose at this point is to create one file with all the data from excel called output.xml. We will then use this to create the separate files we need and then delete it.

The next steps will take you through that.

import pandas as pd
from lxml import etree as et
import os

# STEP 1 - This step creates a master XML file

raw_data = pd.read_excel(r'\Users\haugh\OneDrive\dataanalyticsireland\YOUTUBE\how_to_read_each_excel_row_into_a_separate_XML_file\excel_raw_data_city.xlsx')

root = et.Element('root')

for row in raw_data.iterrows(): #==> This is a loop that takes runs through each record and populates for each tag.
    root_tags = et.SubElement(root, 'ExportData') #=== > Root name
# These are the tag names for each row (SECTION 1)
    Column_heading_1 = et.SubElement(root_tags, 'City')
    Column_heading_2 = et.SubElement(root_tags, 'Area')
    Column_heading_3 = et.SubElement(root_tags, 'Population')

###These are the values that will be populated for each row above
# The values inside the [] are the raw file column headings.(SECTION 2)
    Column_heading_1.text = str(row[1]['City'])
    Column_heading_2.text = str(row[1]['Area'])
    Column_heading_3.text = str(row[1]['Population'])

# This Section outputs the data to an xml file
# Unless you tell it otherwise it saves it to the same folder as the script.
tree = et.ElementTree(root) #==> The variable tree is to hold all the values of "root"
et.indent(tree, space="\t", level=0) #===> This just formats in a way that the XML is readable
tree.write('output.xml', encoding="utf-8") #==> The data is saved to an XML file

Step 3 – Loop through each row and create the separate files

In the below code, it opens the output.xml file and using the loop looks for all the data between the tags “ExportData”.

Then it copies them and then creates a new file.

Once one is done, it moves to the next set of tags and repeats the process, till it reaches the end of the output.xml file.

master_file = et.iterparse('output.xml', events=('end', ))
index = 0
for i, j in master_file:
    if j.tag == 'ExportData':
        index += 1
        filename = format(str(index) + ".xml")
        with open(filename, 'wb') as f:

Step 4 – Delete the output.xml file – purely optional

The objective here was to create separate XML files, based on the input of the original Excel file.

We have also created an additional file called output.xml to get us to the end goal.

If we have no need for this file then it can be removed with the below code, be sure that it is at the very end of your logic.

The import statement for os.unlink is in step two.


How to Create an XML file from Excel using Python

Estimated reading time: 3 minutes

Are you working on a data analytics project where you need to feed your data to a location that is able to process an XML file?

The ability to get your data into a structured format like XML has many benefits:

(A) You can transfer the data to a web service for processing.

(B) Multiple different formats of your raw data can be standardised, enabling quick conversion and processing.

(C) XML files can be read by multiple different programs, as long as you deliver them in the correct format.

(D) The receiver of data can easily read the XML file and store it on their database.

The ability to use this method to read and transfer data is a very powerful way to help a data analyst process large data sets.

In fact, if you are using cloud-based applications to analyse the information you are storing, this will quickly enable you to deliver the data.

What packages do I need in Python?

The first step is to import the following:

import pandas as pd
from lxml import etree as et

Next we want to read in the source data

In this instance, we are reading an excel file

raw_data = pd.read_excel(r'Link to where your data is stored including the full file name')

Now we want to start building the XML structure

The FIRST STEP is to define the root

root = et.Element('root')

The root is the parent of all the data items (tags) contained in the XML file and is needed as part of the structure

The SECOND STEP is to define the tag names that will store each row of the source data

for row in raw_data.iterrows(): ==> This is a loop that takes runs through each record and populates for each tag.
    root_tags = et.SubElement(root, 'ExportData') #=== > Root name
# These are the tag names for each row (SECTION 1)
    Column_heading_1 = et.SubElement(root_tags, 'Name')
    Column_heading_2 = et.SubElement(root_tags, 'Area')
    Column_heading_3 = et.SubElement(root_tags, 'NoPurchases')
    Column_heading_4 = et.SubElement(root_tags, 'Active')

###These are the values that will be populated for each row above
# The values inside the [] are the raw file column headings.(SECTION 2)
    Column_heading_1.text = str(row[1]['Name'])
    Column_heading_2.text = str(row[1]['Area'])
    Column_heading_3.text = str(row[1]['No Purchases'])
    Column_heading_4.text = str(row[1]['Active'])

The raw file looks like this:

NameAreaNo PurchasesActive
Raw file data that will be imported into the XML file

The THIRD STEP is to create the XML file and populate it with the data from the source file

# This Section outputs the data to an xml file
# Unless you tell it otherwise it saves it to the same folder as the script.
tree = et.ElementTree(root) ==> The variable tree is to hold all the values of "root"
et.indent(tree, space="\t", level=0) ===> This just formats in a way that the XML is readable
tree.write('output.xml', encoding="utf-8") ==> The data is saved to an XML file

The XML output should look like the below


Additional XML data can be added

  1. Add more rows – All you need to do is add onto the source file and save. When you rerun the logic it will read in the extra information.
  2. Add more columns – All you need to do is go to the second step above add in a tag name to SECTION 1. Seperately you will need to add an additional column with data to the source file, and then add that column name to SECTION 2 as well

How to Add Formulas to Excel using Python

Estimated reading time: 3 minutes

You may be working on automating some exports to Excel using Python to compare files or just pure and simple adding formulas to an Excel file before you open it up.

Here we explain adding formulas to your Excel output using Numpy or adding the calculations to specific cells in the output.

Adding formulas to specific cells

First of all, let’s look at the normal spreadsheet with some calculations, these have the formulas typed in. The ultimate objective is to have the Python code do this for us, one less step.

As can be seen, the cells have the formulas in them, but this would be a very time-consuming process if you had to do it multiple times, in multiple spreadsheets.

To get around this we can write the Python logic as follows:

  1. Create three lists and three dataframes as follows.
datasetA_list = np.array([1,2,3,4,5,6,7,8,9,10])

datasetB_list = np.array([9,8,65,43,3,21,3,2,1,7])

dataset_list = ('sum','average','median','standard deviation','count','correlation')

datasetA = pd.DataFrame(datasetA_list,columns=['ValueA'])
datasetB = pd.DataFrame(datasetB_list,columns=['ValueB'])
dataset_list_calcs = pd.DataFrame(dataset_list, columns=['Calcs'])

2. Next create a path to where you are going to store the data as follows:

path = 'output.xlsx'

3. In this next step create the workbook and location where the data will be stored. This will load the headings created in step 1 to a particular location on the spreadsheet.

workbook = pd.ExcelWriter(path, engine='openpyxl') = load_workbook(path)
workbook.sheets = dict((ws.title,ws) for ws in

datasetA.to_excel(workbook,sheet_name="Sheet1", startrow=1,index=False, header=True,)
datasetB.to_excel(workbook,sheet_name="Sheet1", startrow=1, startcol=2,index=False, header=True)
dataset_list_calcs.to_excel(workbook,sheet_name="Sheet1", startrow=1, startcol=4,index=False, header=True)

4. Load the formulas into cells besides their relevant headings. This should line post these formulas beside the relevant heading created in step 1.

###Creating calculations for datasetA

sheet = workbook.sheets['Sheet1']
sheet['E2'] = 'CalcsA'
sheet['F3'] = '=SUM(A3:A12)'
sheet['F4'] = '=AVERAGE(A3:A12)'
sheet['F5'] = '=MEDIAN(A3:A12)'
sheet['F6'] = '=STDEV(A3:A12)'
sheet['F7'] = '=COUNT(A3:A12)'
sheet['F8'] = '=CORREL(A3:A12,C3:C12)'

###Creating calculations for datasetB

sheet = workbook.sheets['Sheet1']
sheet['H2'] = 'CalcsB'
sheet['H3'] = '=SUM(C3:C12)'
sheet['H4'] = '=AVERAGE(C3:C12)'
sheet['H5'] = '=MEDIAN(C3:C12)'
sheet['H6'] = '=STDEV(C3:C12)'
sheet['H7'] = '=COUNT(C3:C12)'
sheet['H8'] = '=CORREL(A3:A12,C3:C12)'

Use Numpy to create the calculations

a. Create the calculations that you will populate into the spreadsheet, using Numpy

a = np.sum(datasetA_list)
b = np.average(datasetA_list)
c = np.median(datasetA_list)
d = np.std(datasetA_list,ddof=1) ## Setting DDOF = 0 will give a differnt figure, this corrects to match the output.
f = np.count_nonzero(datasetA_list)
g = np.corrcoef(datasetA_list,datasetB_list)

b. Create the headings and assign them to particular cells

sheet['E14'] = 'Numpy Calculations'
sheet['E15'] = 'Sum'
sheet['E16'] = 'Average'
sheet['E17'] = 'Median'
sheet['E18'] = 'Standard Deviation'
sheet['E19'] = 'Count'
sheet['E20'] = 'Correlation'

c. Assign the variables in step a to a set of cells

sheet['F15'] = a
sheet['F16'] = b
sheet['F17'] = c
sheet['F18'] = d
sheet['F19'] = f
sheet['F20'] = str(g)

d. Save the workbook and close it – This step is important, and always include.

And the final output looks like…

How to count the no of rows and columns in a CSV file

So you are working on a number of different data analytics projects, and as part of some of them, you are bringing data in from a CSV file.

One area you may want to look at is How to Compare Column Headers in CSV to a List in Python, but that could be coupled with this outputs of this post.

As part of the process if you are manipulating this data, you need to ensure that all of it was loaded without failure.

With this in mind, we will look to help you with a possible automation task to ensure that:

(A) All rows and columns are totalled on loading of a CSV file.

(B) As part of the process, if the same dataset is exported, the total on the export can be counted.

(C) This ensures that all the required table rows and columns are always available.

Python Code that will help you with this

So in the below code, there are a number of things to look at.

Lets look at the CSV file we will read in:

In total there are ten rows with data. The top row is not included in the count as it is deemed a header row. There are also seven columns.

This first bit just reads in the data, and it automatically skips the header row.

import pandas as pd

df = pd.read_csv("csv_import.csv") #===> reads in all the rows, but skips the first one as it is a header.

Output with first line used:
Number of Rows: 10
Number of Columns: 7

Next it creates two variables that count the no of rows and columns and prints them out.

Note it used the df.axes to tell python to not look at the individual cells.

total_rows=len(df.axes[0]) #===> Axes of 0 is for a row
total_cols=len(df.axes[1]) #===> Axes of 1 is for a column
print("Number of Rows: "+str(total_rows))
print("Number of Columns: "+str(total_cols))

And bringing it all together

import pandas as pd

df = pd.read_csv("csv_import.csv") #===> reads in all the rows, but skips the first one as it is a header.

total_rows=len(df.axes[0]) #===> Axes of 0 is for a row
total_cols=len(df.axes[1]) #===> Axes of 0 is for a column
print("Number of Rows: "+str(total_rows))
print("Number of Columns: "+str(total_cols))

Number of Rows: 10
Number of Columns: 7

In summary, this would be very useful if you are trying to reduce the amount of manual effort in checking the population of a file.

As a result it would help with:

(A) Scripts that process data doesn’t remove rows or columns unnecessarily.

(B) Batch runs who know the size of a dataset in advance of processing can make sure they have the data they need.

(C) Control logs – databases can store this data to show that what was processed is correct.

(D) Where an automated run has to be paused, this can help with identifying the problem and then quickly fixing.

(E) Finally if you are receiving agreed data from a third party it can be used to alert them of too much or too little information was received.

Here is another post you should read!

How to change the headers on a CSV file

how to copy/paste special a range of cells with xlwings

Are you using Microsoft Excel in conjunction with Python for your data analytics projects, but have a need to automate certain tasks?

In this blog post we will take you through how to remove formulas in a cell , and replace them with their returned values.

This is achieved through using xlwings, a very powerful library that can be used with Python.

So what we want to do is remove the formulas in an excel sheet, normally this is achieved through “copy and paste special values” in excel.

Below is a screenshot of the before:

In order to remove the formulas we use the following code:

This code basically loads the file( input) and looks for the range F2:F5.

Then using the the xlwings functionality, it makes the old file range values equal to the new range values.

The difference is that it looks at what the vlookup returned value to the cell and not the formula.

from openpyxl import load_workbook
import xlwings as xlfile

filepath_input = r'your file path here'
filepath_output = r'your file path here'

input_workbook = load_workbook(filepath_input)
output_workbook = load_workbook(filepath_output)

ws = input_workbook['Sheet1']

### Removing formulas in the spreadsheet

oldlist = xlfile.Book(filepath_input)
newlist = xlfile.Book(filepath_output)

my_values = oldlist.sheets['Sheet1'].range('F2:F5').options(ndim=2).value

my_values1 = newlist.sheets['Sheet1'].range('F2:F5').options(ndim=2).value

newlist.sheets['Sheet1'].range('F2:F5').value = my_values1

The output is a new file , with the formulas removed!

And there you go, there are other options though.

Theoretically you don’t have to create a new sheet like I did above, that was done to show the before and after, otherwise the input file is overwritten, and if that is what you need then your problem is solved!

In rolling out this solution, there are other options out there as well, I found this the simplest to implement.

Openpyxl can be used and it was the most common suggestion , but I found its implementation not as straight forward.

Python Tutorial: How to create charts in Excel

Estimated reading time: 4 minutes

Taking it further with a python excel chart.

We have created some video content here using Python with Excel that illustrates the different ways you can leverage Python to

  • Data cleanse
  • Find unwanted characters
  • And see if the file is empty before import!

What this video is giving as output

Here we are looking to introduce charts in Excel, and how how to use Python to easily work with your data and export to an excel sheet.

Below is the final output of our two charts, this is for illustrative purposes, and is taken from the Irish Government’s website as at 1st May 2020, importing the cell ranges associated with them.

A barchart of covid cases bullt in Python
a line chart of covid cases built in Python

How we went about it

Step 1 – Importing the data

import pandas as pd
import openpyxl

importfile = openpyxl.load_workbook("C://dataanalyticsireland/Python Code/youtube/codeused/files/Covid19CountyStatisticsHPSCIreland01052020.xlsx")

#Telling the program to go to each individual sheet
Dublin = importfile["Dublin"]
Cork = importfile["Cork"]
Limerick = importfile["Limerick"]
Galway = importfile["Galway"]

#Telling the program for each sheet , create a dataframe.
dfDublin = pd.DataFrame(Dublin.values)
dfCork = pd.DataFrame(Cork.values)
dfLimerick = pd.DataFrame(Limerick.values)
dfGalway = pd.DataFrame(Galway.values)

Step 2 – Splitting the different sheets out into different data frames

### Showing to break down here, further on I cocatenate the dataframes  ###

#This brings in all the values from the sheet into a variable so they can be read.
datadublin = Dublin.values
datacork = Cork.values
datalimerick = Limerick.values
datagalway = Galway.values
#This tells the program that the headers are on row 0.
columnsdublin = next(datadublin)[0:]
columnscork = next(datacork)[0:]
columnslimerick = next(datalimerick)[0:]
columnsgalway = next(datagalway)[0:]
# The imported file has the column headings on row 0, moving them to be the actual column headings instead of numerical values.
dfDublin = pd.DataFrame(datadublin, columns=columnsdublin)
dfCork = pd.DataFrame(datacork, columns=columnscork)
dfLimerick = pd.DataFrame(datalimerick, columns=columnslimerick)
dfGalway = pd.DataFrame(datagalway, columns=columnsgalway)

Step 3 – Exports all three data frames to a separate tab each to the below file

# This starts the process of where to save the output by creating a writer variable
writer = pd.ExcelWriter('C://dataanalyticsireland/Python Code/youtube/codeused/files/Covid19output01052020.xlsx')
# Save the dataframe "Dublin" to Dublin sheet etc...
dfDublin.to_excel(writer, 'Dublin')
dfCork.to_excel(writer, 'Cork')
dfLimerick.to_excel(writer, 'Limerick')
dfGalway.to_excel(writer, 'Galway')
# save the excel file

Step 4 – Merging all the data to allow the charts to be created easily

mergedPDList = [dfDublin,dfCork,dfLimerick,dfGalway]  # Creating a new merged list of the dataframes, easier when doing charts below
mergeddf = pd.concat(mergedPDList) # physical merging of the dataframes

Step 5 – Creating a bar chart

import matplotlib.pyplot as plt
import matplotlib.pyplot as pltline
import seaborn as sns
#from openpyxl.drawing.image import Image
#This makes the seaborn library over ride the matplotlib properites for styling

plt.figure(figsize =(9,6)) = mergeddf['CountyName'],
       height = mergeddf['ConfirmedCovidCases']
plt.xticks(rotation = 45)
plt.title("No of Confirmed Covid Cases in the Republic of Ireland", fontsize=25, fontweight='bold')
plt.ylabel('No of cases')
plt.width = 35
img = plt.savefig('C://dataanalyticsireland/Python Code/youtube/codeused/files/barchart.png',bbox_inches='tight')

Step 6 – Creating a line chart

x = dfDublin['ConfirmedCovidCases']
x1 = dfCork['ConfirmedCovidCases']
x2 = dfLimerick['ConfirmedCovidCases']
x3 = dfGalway['ConfirmedCovidCases']
#Sets the size of the output before it is created
pltline.legend(['Dublin', 'Cork','Limerick','Galway'], fontsize="x-large")
pltline.ylabel('No of cases')
pltline.xlabel('No of days elapsed')
img1 = plt.savefig('C://dataanalyticsireland/Python Code/youtube/codeused/files/linegraph.png')

Step 7 – Exporting charts to the XLSX file

from openpyxl import load_workbook
#from openpyxl.drawing.image import Image
import openpyxl
#This step opens the existing file.
wsnew = load_workbook('C://dataanalyticsireland/Python Code/youtube/codeused/files/Covid19output01052020.xlsx')
#This step creates a new worksheet called Barchart
newsheet = wsnew.create_sheet("Bar Chart",4) #Creates a new sheet in fourth position
newsheet1 = wsnew.create_sheet("Line Graph",5) #Creates a new sheet in fifth position
#Adding the image to Newsheet which is called "Barchart" in the excel file
img = openpyxl.drawing.image.Image('C://dataanalyticsireland/Python Code/youtube/codeused/files/barchart.png')
img1 = openpyxl.drawing.image.Image('C://dataanalyticsireland/Python Code/youtube/codeused/files/linegraph.png')
img.anchor = 'B2' # Tells what cell to put the imahe in
img1.anchor = 'B2' # Tells what cell to put the imahe in
newsheet.add_image(img) # Adds the image
newsheet1.add_image(img1) #Adds the image'C://dataanalyticsireland/Python Code/youtube/codeused/files/Covid19output01052020.xlsx')

In the below video on creating a python excel chart, we have approached this as follows:

  • Created four separate data frames, they are the four regions that will feed into the creation of the graphs.
  • Separate to this, we merged the four data frames into one, to use with the bar chart.

And to finish off

If you like what this video has explained, please click to see our YouTube channel for more informative videos.

Data Analytics Ireland

How To Validate Cell Values In Excel

Estimated reading time: 2 minutes

Validating cells in Excel quickly – how to do it easily!
Are you working with large spreadsheets and looking to quickly at data validation exercise to save you time?

The aim would be to run your code and test it against some predefined rules you or your data analyst would have written to make sure it brings back the expected checks.

If you look at the below, this is the final output of this video, highlighting two cells that are over budget based on the companies predefined budget.

data validation example

The structure of this code can be broken down into the following steps:

  • Read in the excel file, see a previous example here How to import data into excel
  •  Run the first function,  checks if the spreadsheet cell value is over or under budget.
  •  Run the second function that takes the value from the first function and applies the colour red to the cell if it is over budget.



You can expand this code to incorporate more functionality, such as:

  • Change the colour of the cells, to have multiple colours returned.
  •  Update the two functions to include more business rules.
  •  You could check if the file is empty before processing as shown here How to check if a file is empty

Please subscribe to our YouTube channel, the button is the right-hand side of the page if you would like to see more like these.

Data Analytics Ireland

How to import data into excel

Estimated reading time: 4 minutes

This import will not cost you anything except running some code!
The need to productively have an all in one solution to manage your data as your code has become more critical as volumes of data become larger. Do you, as a data analyst, therefore, need to send your data into an excel file? Previously we would have posted a video How do I remove unwanted characters, and here we build on that theme, linking in with Excel.

Two techniques used here to achieve this are  XLSX writer explained, and Openpyxl explained.

The elements we cover off are:

  • Load data from a data frame and populate it into an excel file.
  •  Renaming of a sheet.
  • Creating a new sheet and giving it a name.
  • We look at properties, namely changing the colour of a tab to yellow.
  • You may need to put some text in a sheet cell, to act as a header or to show you have a total figure there.
  • And the final piece of functionality covered in this video is how to copy data from one sheet into another one.


There are several benefits to putting all the upfront work in Python:

  1. The benefits of cleansing the data or format will save time further down the road.
  2. After you receive the document, you can quickly review it without fixing errors in the data.
  3. If you are distributing the output to several people, it quickly gets them what they want, without manual intervention after the logic has completed.

I have certainly benefited from this data cleansing and importing it into excel exercise, as the two are combined now, makes it a more efficient process.

Please remember to subscribe to our channel if you like the work we are doing, thanks!

Data Analytics Ireland

The code used in this video is here:

########## - Previous video - How to check for invlaid characters in your dataset ###########


#######################This video - How to import data into Excel #########################################

#In Python 3, leading zeros(at the start of a sequence of numbers) are not allowed on numbers
#Numbers need to be put between "" to be looped through

import pandas as pd
#Create a dataset locally
data = {'Number':  ["00&00$000", '111$11111','2222€2222','3333333*33','444444444£','555/55555','666666@666666'],
       'Error' : ['0','0','0','0','0','0','0']}

#Create a dataframe and tell it where to get its values from
df = pd.DataFrame (data, columns = ['Number','Error'])

#Function to loop through the dataset and see if any of the list values below are found and replace them
def run(*l):
    #This line extracts all the special characters into a new column
    #Using regular expressions it finds values that should not appear
    df['Error'] = df['Number'].str.extract('(\D+|^(0)$)')   
    #This line removes anything with a null value
    df.dropna(subset=['Error'], inplace=True)
    #This line reads in the list and assigns it a value i, to each element of the list.
    #Each i value assigned also assigns an index number to the list value.
    #The index value is then used to check whether the value associated with it is in the df['Number'] column 
    #and then replaces if found
    for i in l:
        df['Fix']= df['Number'].str.replace(i[0],"").str.replace(i[1],"").str.replace(i[2],"").str.replace(i[3],"") \
        print("Error list to check against")

#This is the list of errors you want to check for
l = ['$','*','€','&','£','/','@']     

import xlsxwriter
#### This is the start of the code to import the data to excel

# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter(r"C://dataanalyticsireland/Python Code/youtube/codeused/files/exportxlsx.xlsx")

################ XlsxWriter can only create new files. It cannot read or modify existing files.#################

# Position the dataframes in the worksheet, and names the sheet Analysis
df.to_excel(writer, sheet_name='Analysis')  # Default position, cell A1.

# Close the Pandas Excel writer and output the Excel file.

### Modify the exportxlsx.xlsx file created above through openpyxl ###

#Load xlsx file, and rename tab "Analysis" to "NewAnalysis"
from openpyxl import load_workbook
importwb = load_workbook("C://dataanalyticsireland/Python Code/youtube/codeused/files/exportxlsx.xlsx")
existing_sheet = importwb['Analysis']
#existing_sheet.title = 'NewAnalysis'

#Create new sheet on existing excel sheet, 1 signifies that it is to be created after the first sheet
#The first sheet been value 0
newsheet = importwb.create_sheet("Newsheet", 1) # insert after the first sheet as we know it already exists.

#Setting the tab colour of the newsheet to yellow
newsheet.sheet_properties.tabColor ='FFFF00'

#Putting text into a particular cell
newsheet['B3'] = "This is some text"

#Tell the code to go to the sheet we want to copy
#Tell the code to copy the data to the newsheetcopy. It creates a new sheet and copies the data in.
#newsheetcopy=importwb.copy_worksheet(sourcesheet)"C://dataanalyticsireland/Python Code/youtube/codeused/files/exportxlsx.xlsx")