Estimated reading time: 5 minutes
So you are working with large amounts of data, and the data is having transformations applied to it or quite simply you need to see as you move data, has it changed in any way.
The question that arises is if I have lots of validations to complete, how can I quickly see if they pass or fail?
Here we will take you through unittest, a Python testing framework that will enable a quick way to validate data you are working with.
Step 1 – Define your test scenarios
Like in any testing, the first step is to define what you want to test and what is your expected outcome. Here we are testing data that was transformed into a different value.
The test then looks at the new values and compares them to a list of values that are expected in the output.
As can be seen, the above two rows will be what we will use to create our test cases, which will be outlined below.
While the below may seem very strict, they emphasise exactly what we are testing for. Either all data passes the test or it does not pass.
In reality, these rules can be amended/relaxed depending on what you need to test for, this is purely for illustrative purposes:
Step 2 – Create our python file that will pass in the transformed data
This is the first file we need, it will be the source file that will be used to transform the data and then pass to the testing module later on.
In this instance, it is called “automate_testing_python.py”
I have created a dictionary here, but you can read in a CSV file or any file that you wish. Just make sure that it is passed to the data frame.
import pandas as pd
data = {'Name_before':['Joe','Liz','Pat'],'Name_after':['Joseph','Elizabeth','Patrick']} #===> Creating a dictionary
df = pd.DataFrame(data) #===> Creating a dataframe
def main():
transfromations_applied = data['Name_after'] #===> Defines the column you want to test against
return transfromations_applied #===> Returns the values that will be used in the automation script
The function “main” above defines clearly what column holds the data to be passed to the automated test, in this instance ” transfromations_applied “
Step3 – Create a test script file
In the below there are a number of steps to follows:
- Install the package HTMLTestrunner in whatever solution you are using.
- Install the Unittest library to the same program. This will perform the tests asked and provide the result outputs.
- Next we are importing what was returned from “main” function in the file “automate_testing_python.py”
- The next step is the creation of the class, this serves the following purposes:
- Defines the exact tests we want to carry out.
- In the first method, it clearly defines what values are expected and what should be tested against.
- The second method is purely checking that all the values are alphabetic.
- The last line is running a file that will run the tests and then return the values to the output html file.
from HTMLTestRunner import HTMLTestRunner
import unittest
from automate_testing_python import main #===> Imports the data that was returned
#This is the code that performs the tests
#Each method is a specific test
class test_values(unittest.TestCase):
def test_ValueEquals(self):
ExpectedValues = ['Joseph','Elizabeth','Patrick']
values_received = main()
self.assertListEqual(ExpectedValues,values_received)
def test_Valueisalphabetic(self):
values_received = main()
for c in values_received:
self.assertTrue(c.isalpha())
if __name__ == "__main__":
HTMLTestRunner.main()
Step 4 – Create the File that will generate the output HTML file with the results.
As a result of the above steps we have created the data, brought it in to test against predefined test scenarios.
Now, all we have to do is present the findings in an HTML file.
In this instance, the output will be easier to understand, and quicker to digest.
In the below code, it follows the below steps:
(A) The first and second lines import the libraries we will need to run this bit of code.
(B) The third line is receiving the values that were generated from the class in step 3.
(C) All the next four lines do, is pull in the data, attach it to the myreport.html file
(D) The last two lines do the following:
- Tell the program to take the data received and then use the HTMLTestRunner package to process it, and create the template output.
from HTMLTestRunner import HTMLTestRunner
import unittest
from test_automate_testing_python import test_values
if __name__=='__main__':
suite=unittest.makeSuite(test_values)
filename='C:\\Users\haugh\OneDrive\dataanalyticsireland\YOUTUBE\how_to_automate_testing_with_python\myreport.html'
fp=open(filename,'w')
runner=HTMLTestRunner.HTMLTestRunner(fp,title=u'Data Analytics Ireland Test Output',description=u'This shows the testing output result')
runner.run(suite)
Step 5 – Run the file html_report.py
Running this file will also open and run the code in the other files referenced in this post.
As a result, when the program is finished the output will look like this:
Earlier in the program we created it a dictionary (STEP 1) like this:
data = {‘Name_before’:[‘Joe’,’Liz’,’Pat’],’Name_after’:[‘Joseph’,’Elizabeth’,’Patrick’]} #===> Creating a dictionary
and then in Step2 in our class “test_values” we defined the expected outputs.
def test_ValueEquals(self): ExpectedValues = [‘Joseph’,‘Elizabeth’,‘Patrick’]
In this instance, the values are the same, the above output will mark the two tests as a Pass.
But if the values passed in are:
data = {‘Name_before’:[‘Joe’,’Liz’,’Pat’],’Name_after’:[‘Joseph’,’Elizabeth’,‘Patrick1’]} #===> Creating a dictionary
and are compared to
def test_ValueEquals(self): ExpectedValues = [‘Joseph’,‘Elizabeth’,‘Patrick’]
The following output will appear, as the two tests will fail:
So we have created two test cases here, but we could add loads more and then have the automation quickly return the test results for us.