How to run Python validation from Javascript

Estimated reading time: 6 minutes

More and more, website developers are looking to blend different programming technologies to allow them to use functionality, that maybe is not available to them in the current programming language they utilise.

In previous posts how to pass python variables to Javascript and how to run python directly from javascript, we touched on how to use Python and Javascript interchangeably to pass data.

Here in this blog post, we are going to look at data validation using Python, with data captured by Javascript. The purpose here is to allow another way to validate data other than relying on javascript.

Python Validation run from Javascript

There are a number of reasons for this:

  1. You may want to use Python to process the data and provide output.
  2. The data could be passed from Python into a database, you need to run some checks before you proceed.
  3. Python is your core language, you only use Javascript to capture the data and pass it from a website.

There are four validation checks

  1. Validation if a username exists.
  2. Validation if a number has been entered on a username field.
  3. Validation of a password entered.
  4. A validation check to make sure that an email address entered is in the correct format.

We have two files with the requisite code in them as follows:

  1. app.py ===> As this application is built in Python Flask, this file holds the logic to open the website files, and it holds the Python validation checks.
  2. index.html ====> This file holds the HTML that creates the pages, labels, and input boxes that appear in the web browser. It also includes the Javascript that will capture the data to be passed to Python for validation.

Code Overview – App.PY

Below, I will look to explain what is going on as follows:

The variable Regex is used in the email validation to check if the email entered is correct.

def inputcheck ===> This is just checking the username passed over and performing some validations on it.

def inputvalidation ====> This also is checking the username but looking at if numbers only are entered, or if it is empty.

def passvalidation ====> In this piece of logic, it checks if the password is empty, less than five characters, is numbers only.

def emailvalidation ====> All this is doing is checking if the data received from the Javascript is in the correct email format. It references the regex variable above, which is used to confirm if the format is correct or otherwise.

Within each of the functions, there are if statements which are the core validation checks used by Python.

Also, all the popup menus use ctypes, which allows you to access the windows libraries that hold message boxes, customise them and call within your program.

import json
import ctypes
import re
from flask import request,redirect,url_for

from flask import Flask, render_template

app = Flask(__name__)

regex = '^[a-z0-9]+[\._]?[a-z0-9]+[@]\w+[.]\w{2,3}$'

@app.route('/')
def index():
    return render_template('index.html')


@app.route('/inputcheck', methods=['GET','POST'])
def inputcheck():
    output = request.get_json()
    result = json.loads(output) #this converts the json output to a python dictionary
    username_list = ['joe','john']
    if result['username'] in username_list:
        MessageBox = ctypes.windll.user32.MessageBoxW(0, 'Username ' + result['username'] + ' ' + 'exists', "Username check", 0x00001000)
        return render_template('various.html')
    elif result['username'] == '':
        MessageBox = ctypes.windll.user32.MessageBoxW(None, 'You cannot enter an empty value', 'Username check', 0x00001000)
    else:
        MessageBox = ctypes.windll.user32.MessageBoxW(None, 'Username ' + result['username'] + ' ' + 'does not exist', 'Username check', 0x00001000)

@app.route('/inputvalidation', methods=['GET','POST'])
def inputvalidation():
    output = request.get_json()
    result = json.loads(output) #this converts the json output to a python dictionary
    if result['username'].isdecimal():
        MessageBox = ctypes.windll.user32.MessageBoxW(None, 'Your username cannot be numbers',"Number check", 0x00001000)
    elif result['username'] == '':
        MessageBox = ctypes.windll.user32.MessageBoxW(None, 'The username cannot be empty', "Number check",0x00001000)
    else:
        MessageBox = ctypes.windll.user32.MessageBoxW(None, 'Your username looks ok', "Number check", 0x00001000)
    return render_template('index.html')

@app.route('/passvalidation', methods=['GET','POST'])
def passvalidation():
    output = request.get_json()
    result = json.loads(output) #this converts the json output to a python dictionary
    if result['password'].isdecimal():
        MessageBox = ctypes.windll.user32.MessageBoxW(None, 'Your password cannot be numbers',"Password check", 0x00001000)
    elif result['password'] == '':
        MessageBox = ctypes.windll.user32.MessageBoxW(None, 'The password cannot be empty', "Password empty check",0x00001000)
    elif len(result['password']) < 5:
        MessageBox = ctypes.windll.user32.MessageBoxW(None, 'Your username should be greater than five characters', "Password length check", 0x00001000)
    else:
        MessageBox = ctypes.windll.user32.MessageBoxW(None, 'Your password looks ok', "Number check", 0x00001000)
    return render_template('index.html')

@app.route('/emailvalidation', methods=['GET','POST'])
def emailvalidation():
    output = request.get_json()
    result = json.loads(output) #this converts the json output to a python dictionary
    if re.search(regex, result['email']):
        MessageBox = ctypes.windll.user32.MessageBoxW(None, 'Your email is in the correct format', "Email check", 0x00001000)
    else:
        MessageBox = ctypes.windll.user32.MessageBoxW(None, 'Your email is invalid, please correct', "Email check", 0x00001000)
    return render_template('index.html')



if __name__ == "__main__":
    app.run(debug=True)


Code Overview – Index.html

The below code is the page code, that allows the capturing of data through a web browser HTML page.

Between the <style></style> tags these are the CSS properties that format the labels and buttons on the page.

Within the <div></div> tags are where we create the labels and buttons to show on the screen. Also within these, are the references to the on click event function that should run once the buttons are clicked.

Within the <script></script> tags, this is where the javascript is written which captures the data entered into the input boxes within the <div> tags.

Also, you will see within this javascript, there are sections called $.ajax, which is where the data captured by the javascript is stored, and then passed onto the Python script file (app.py)

Note that in each ajax , url:”/APAGENAME” is referenced. These are the sections within the app.py that the data is passed to, and then the Python logic kicks in and validates the data.

<html lang="en">

<head>

    <title>Data Analytics Ireland</title></head>
<style>
.button1 {
    position: absolute;
    top: 50%;
    left: 55%;
    width: 900px;
    height: 300px;
    margin-left: -300px;
    margin-top: -80px;
}



.labels {
    position: absolute;
    top: 50%;
    left: 55%;
    width: 900px;
    height: 300px;
    margin-left: -300px;
    margin-top: -150px;
}

</style>

<body>



<div class="labels" id="mem1" >
<label  for="username">username:</label> <input type="text" id="username" name="username">
<label for="password">password:</label><input type="text" id="password" name="password">
    <label for="email">email:</label><input type="text" id="email" name="password">
</div>


<div class="button1" id="mem2" >
<button type="submit" onclick='myfunction();'>Click here to validate your username</button>
<button type="submit" onclick='myfunctionval();'>Click here to check if a number</button>
    <button type="submit" onclick='myfunctionpass();'>Click here to check password</button>
      <button type="submit" onclick='myfunctionemail();'>Click here to check your email</button>
</div>


<script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1.5/jquery.min.js"></script>

<script>
    function myfunction() {

        const username = document.getElementById("username").value;
        const password = document.getElementById("password").value;


        const dict_values = {username, password} //Pass the javascript variables to a dictionary.
        const s = JSON.stringify(dict_values); // Stringify converts a JavaScript object or value to JSON.
        console.log(s); // Prints the variables to console window, which are in the JSON format
        //window.alert(s)
        $.ajax({
            url:"/inputcheck",
            type:"POST",
            contentType: "application/json",
            data: JSON.stringify(s)});
}
</script>
<script>
    function myfunctionval() {

        const username = document.getElementById("username").value;
        const password = document.getElementById("password").value;


        const dict_values = {username, password} //Pass the javascript variables to a dictionary.
        const s = JSON.stringify(dict_values); // Stringify converts a JavaScript object or value to JSON.
        console.log(s); // Prints the variables to console window, which are in the JSON format
        //window.alert(s)
        $.ajax({
            url:"/inputvalidation",
            type:"POST",
            contentType: "application/json",
            data: JSON.stringify(s)});
}
</script>

<script>
    function myfunctionpass() {

        const username = document.getElementById("username").value;
        const password = document.getElementById("password").value;


        const dict_values = {username, password} //Pass the javascript variables to a dictionary.
        const s = JSON.stringify(dict_values); // Stringify converts a JavaScript object or value to JSON.
        console.log(s); // Prints the variables to console window, which are in the JSON format
        //window.alert(s)
        $.ajax({
            url:"/passvalidation",
            type:"POST",
            contentType: "application/json",
            data: JSON.stringify(s)});
}
</script>

<script>
    function myfunctionemail() {

        const email = document.getElementById("email").value;


        const dict_values = {email} //Pass the javascript variables to a dictionary.
        const s = JSON.stringify(dict_values); // Stringify converts a JavaScript object or value to JSON.
        console.log(s); // Prints the variables to console window, which are in the JSON format
        //window.alert(s)
        $.ajax({
            url:"/emailvalidation",
            type:"POST",
            contentType: "application/json",
            data: JSON.stringify(s)});
}
</script>




</body>
</html>

How to Pass Python Variables to Javascript

Estimated reading time: 4 minutes

In our recent blog posting How to Pass a Javascript Variable to Python using JSON, we demonstrated how to easily use AJAX to pass whatever data you wanted and then manipulate it with Python.

In this blog posting, we are going to show how to do this the other way around. The scenario is that you have an application and or website that wants to use data generated through Python, but let Javascript then use it within the application.

As Python can be connected to numerous databases and files ( txt, excel) etc, this piece of logic is very useful for the programmer looking to integrate both programming languages.

Let’s start looking at the code, and see how this can be achieved.

Step 1 – What Files are generated?

This program uses Python Flask to create a web page, that has a drop-down menu. The two files used to generate this are as follows:

(A) app.py – This is the python file that creates a website and loads a template HTML file as outlined below.

(B) Index.html – This is the template file that loads into the browser and runs all the javascript. The javascript loaded here also loads the python data passed over from app.py

Step 2 – APP.PY code overview

The Python library that enables webpage creation is called Flask, and as can be seen below it has to be imported.

In addition, we need to also import render_template which tells the program to go to the templates folder and load “Index.HTML”

The variable that is been passed to JavaScript is called name, and these are the values that you will see in the web browser when it is loaded.

from flask import Flask, render_template

app = Flask(__name__)

@app.route('/')
def index():
    name = ['Joe','John','Jim','Paul','Niall','Tom']
    return render_template('index.html', name=name)

if __name__ == "__main__":
    app.run(debug=True)

Step 3 – Index.HTML overview

Here is the template HTML file that runs in the browser. You can add CSS etc to this to make it look nicer and more user friendly.

As you can see it has the usual HTML tags appear as part of a website.

Well look at some of the code further:

In this bit <select id =’select’> </select>, this is the dropdown menu that will appear when Index.html is opened. It will store all the values passed from python. Note that its id is “select”, this will be used later on.

The main parts to focus on next is between <script></script>. This is what reads in the python data and populates it to the dropdown menu.

In step 2 we mentioned that there was a variable called “name”, with values to be passed over.

This is acheived on this line:

var select = document.getElementById(“select”), test = {{ name | tojson }};

Notice that name appears here, and this is referencing back to the exact same value that was discussed in step 2.

For the rest of the lines, I have explained with comments what each does.

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Pass Python variable to Javascript</title>
</head>
<body>

<select id ='selectvalue'>
</select>
<script>
    //name = ['Joe','John','Jim','Paul','Niall','Tom']

    var selectvalue = document.getElementById("selectvalue"), test = {{ name | tojson }};
    //The increment operator (++) increments (adds one to) its operand and returns a value.
    for(var i = 0; i < test.length; i++) // This line checks for the length of the data you feeding in i.e the no of items
         {
var selection = document.createElement("OPTION"), // This line creates a variable to store the different values fed in from the JSON object "TEST"
txt = document.createTextNode(test[i]); // This just reads each value from the test JSON variable above
selection.appendChild(txt); // This line appends each value as it is read.
selection.setAttribute("value",test[i]); // This line sets each value read in as a value for the drop down
selectvalue.insertBefore(selection,selectvalue.lastChild); //This reads eah value into the dropdown based on the order in the "TEST" above.
 }
</script>
</body>
</html>

Step 4 – What the output looks like!

From step 2, these are values we asked to be used in Javascript to populate a dropdown:

name = [‘Joe’,’John’,’Jim’,’Paul’,’Niall’,’Tom’]

Python variable passed to Javascript

How to pass multiple lists to a function and compare

Estimated reading time: 3 minutes

Have you ever been faced with a situation where you have multiple lists you want to compare, and need a function that will read them in and show you what is different between them all?

In this scenario, this can be handy where you have large data sets and need a quick way to find those differences and fix them where appropriate.

Comparing two lists

Let’s look and see what is going on below.

First of all, we have defined two lists. The only difference between the two is that one has a value of six, the other does not.

Next, we have the function “comparelists”. What this is doing is taking in the two lists as parameters (a,b), and then processing them.

The lists are passed as arguments to the parameters in this line ===> comparelists(list1,list2)

The parameters a are assigned to list1, and b is assigned to list2.

The main bit of the function is the list comprehension, and it is doing the following:

  1. x for x is basically creating a variable x, and starting a loop that goes through all the values of b.
  2. Each iteration of x is stored and compared with a.
  3. “If x not in a” completes the comparison and if true returns the value, otherwise moves to the next value.

As a result of this logic, it can be seen that six is the only value returned, and this is what we are expecting based on a visual inspection.

#Reading in two lists
list1 = [1,2,3,4,5] # parameter a below
list2 = [1,2,3,4,5,6] # parameter b below

def comparelists(a,b):
    z = [x for x in b if x not in a] #list comprehension
    print(z)

comparelists(list1,list2)

Output:

[6]

Comparing more than two lists

In the logic above, we saw that two lists can be compared. But what if you want to compare one list against two other lists?

That is easily done, with some tweaking to the code.

As before the three lists are defined, created as arguments in comparelists(list1,list2,list3), and then passed to the function parameters a,b,c.

The only difference in this logic is that the list comprehension is written a little different as follows:

  1. x for x is basically creating a variable x, and starting a loop that goes through all the values of b. ====> This is the same as above
  2. Each iteration of x is stored and compared with a and c ===> This line is different as comparing to two lists now.
  3. “if x not in a and x not in c” ====> Here we have two comparisons
    • Checking for the value x from b in a
    • Checking for the value x from b in c
  4. The value of six is correct as it is not in either a or c.
  5. Note the if statement is specific as to what to check for with the “and” statement. This can be changed to suit your needs.
list1 = [1,2,3,4,5] # parameter a below
list2 = [1,2,3,4,5,6] # parameter b below
list3 = [1,2,3,7,8,9] # parameter c below

def comparelists(a,b,c):
    z = [x for x in b
         if x not in a and x not in c] #list comprehension
    print(z)

comparelists(list1,list2,list3)

Output:
[6]

How to run Python directly from Javascript

Estimated reading time: 3 minutes

Are you developing a website and looking to execute some Python script directly from the website?

Here we take you through how to achieve this using Python Flask, the scenarios we demonstrate are as follows:

  1. Javascript on load event – execute Python script
  2. Javascript on click event – execute a Python script

In both scenarios above you will be presented with pop-up boxes like below, but the python code can be changed to whatever you like.

Page load message box
On button press, calls the python script to load a message box

The code you will need to run this is split into two. (A) The Index.HTML logic and (B) the app.py logic

The code for the INDEX.HTML page

This is the page that the user is presented with. As you can see the logic has two pieces of javascript that when run, go over to the python code and run some logic that it wishes the webpage to run.

In this instance, the app.py python script holds the commands that need to be executed.

Note the page load event goes to the route in app.py, namely ” @app.route(‘/’)”

Whereas the button click event is pointed directly at “@app.route(‘/test’ )” – essentially this will not run until the javascript asks it to, on the button click event.

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Sample Home Page</title>
</head>
<body>
<! -- Calls a function on the page load. -- >
<element onload="myfunction_onload">
<button type="submit" onclick='myfunction_clickevent()'>Run my Python!</button>
<script type="text/javascript" src="https://ajax.googleapis.com/ajax/libs/jquery/1.5/jquery.min.js"></script>
<! -- This script runs the python script on the page load -->
<script>
    function myfunction_onload(){
        $.ajax({
            url: "app.py",
             context: document.body
            })
        }
    </script>
<! -- This script runs the python script when the button is clicked -->
<script>
    function myfunction_clickevent(){
        $.ajax({
            url:"/test",
            context: document.body});}
</script>


</body>
</html>
How to run Python directly from Javascript

The code for the APP.PY Python script

Here we are linking to the index.html page and taking commands from it and executing those requests.

In this scenario, the message boxes are rendered by using the win32api package.

The app is using the flask package to run the website.

The powerful thing here is you can start customising this logic to do what you like, examples include:

(A) Return graphs to the webpage.

(B) Process data received and return statistics on the data.

(C) Validate data received in Python and return a response. An example here could be a user logging into a database.

import win32api
from flask import Flask, render_template

app = Flask(__name__)


#Using the below, the popup message appears on the page load of index.html
#0x00001000 - This makes the popup appear over the browser window
@app.route('/')
def index():
    win32api.MessageBox(0, 'You have just run a python script on the page load!', 'Running a Python Script via Javascript', 0x00001000)
    return render_template('index.html')

#Using the below, the popup message appears when the button is clicked on the webpage.
#0x00001000 - This makes the popup appear over the browser window
@app.route('/test')
def test():
    win32api.MessageBox(0, 'You have just run a python script on the button press!', 'Running a Python Script via Javascript', 0x00001000)
    return render_template('index.html')

if __name__ == "__main__":
    app.run(debug=True)

What is an array in Python?

Estimated reading time: 3 minutes

Python arrays are used to manage how data is analysed and have the data in a structured form for the data analyst or data scientist to use.

What is an Array?

An array has the following properties:

  1. The data in it is of the same data type.
  2. The data is stored as contiguous memory locations, meaning meaning the lowest value is at index 0 and the last value is at the highest index value.
  3. Arrays assign index values to each data point contained within it.
  4. You can append values to the array.
  5. You can delete values from the array, that is it is mutable.

What are the differences between arrays and lists?

For the most part arrays and lists are the same, but with one difference:

A list can store any data type you want e.g. strings, integers etc.

On the other hand, arrays can only store data that is of the same data type.

What are the different ways that I can create an array?

(A) Use Numpy

import numpy as np

a = np.array([1,2,3])

print(a)
print(type(a))

Output:
[1 2 3]
<class 'numpy.ndarray'>

(B) Use array

import array as test_array

a = test_array.array('i',[1,2,3])

print(a)

Output:
array('i', [1, 2, 3])
<class 'array.array'>

On the Python.org website, below are the list of values that can be populated into the above program, depending on what your need is:

When should I use arrays?

It really depends on the nature of your python program, but below are some examples that may help you make a decision:

(A) Many variables of the same type : There maybe a scenario where you have to create an array to store data that is of the same data type. For example you have a list of codes to look up against, which are all integers.

(B) Faster and more efficient : If speed is what you are looking for using arrays, will help improve performance of your computer program, using lists is much slower.

(C) Compactability and efficiency : If the nature of your program needs to store large amounts of data that needs to be accessed quickly , then this would be a good reason to use them.

(D) Ability to retrieve data quickly through indexing: As arrays have index values associated with them, they data can be easiy retrieved.

(E) Need to compute some mathematical values: Arrays are excellent for any numerical operations you need to complete, as the level of coding is minimal.

import array as test_array

a = test_array.array('i',[1,2,3])

mydivider = 2

mynewlist = [x / mydivider for x in a]
print(mynewlist)

result:
[0.5, 1.0, 1.5]

So in summary:

Speed , efficiency and ease of use are the main reasons to use an array.

We use arrays here in how to show percentage differences between files in python , why not go over and see it in action!

How to use parameters in Python

Estimated reading time: 5 minutes

Are you using a function and looking to pass values into it to be processed? Well if this is the case you are trying to pass arguments to that function’s parameters so they will return a value for you.

Parameters are just values that are associated with the function. For each function, you are going to have one or more parameters it is expecting to process.

As a result, there are a number of ways they are created, let’s explore how this can be achieved.

Passing arguments directly into a function parameters

In a lot of instances, the argument values to be processed are going to be passed into the function parameters, which then executes some logic and returns a value.

Typical ways this may happen:

  1. A password is entered ( the inputted argument), the function then confirms if it is valid or not.
  2. You may need to check if a value entered ( the inputted argument) is above or below a certain value. It is passed to a function parameter and then the function does the necessary checks, and confirms the result in the output.

The above are two examples, but let’s see this in action.

Below we have a function. The purpose is to take in two arguments passed to it and conduct a calculation on them.

The calculation happens in the variable number 1 or number 2.

Variable number 1 and variable number 2 get their values from parameters a and b respectively.

The line multiplenumber(1,2) is just calling the function and passing the arguments to the parameters a,b which then processes them and the output is printed.

def multiplenumber(a,b):
    number1 = a*2
    number2 = b*3
    print(number1,number2) #===> These values are different to the variables below as they happen as a result of the processing inside the function.

number1=1
number2=2
multiplenumber(number1,number2)
print(number1,number2) #===> These values are outside the function and do not have any relationship to the variables inside the function above, hence are different.

Result:
2 6
1 2

The result of the above piece of logic is called “passing by value“, which essentially means the data is passed from the variables ( number 1 and Number 2) to the function parameters, but when the function is completed processing, it does not impact the original variables outside the function.

This is because the original variables are immutable.

Passing arguments directly into function parameters from a list

In the above example, as outlined, there will be no impact on the variables that pass the arguments into the function, they remain the same.

On the other hand, if the original source of the arguments is a mutable object, it will have its values changed.

Let’s look at a situation where this may occur.

def addnumber(a):
    number=['3']
    a.append(number)
    print(a) #===> This prints the original value of a with the addition of the value 3

number1=[1,2]
addnumber(number1)
print(number1) #===> As number1 is passed to the function, and number1 is allowed to be changed then this will print out the new list value once the function has completed processing.

Result:
[1, 2, ['3']]
[1, 2, ['3']]

So in this instance, as number1 is a list, and this is mutable, its value will change once the function completes its processing.

This can also be called “passing by reference“.

Passing arguments between functions

In another video, we completed called how to pass data between functions this discusses how you can pass the data between two functions and print it out.

Effectively in the below the arguments are name, country, city, and these values are passed to functionb

def functiona():
    name = "Joseph"
    country = "Mars"
    city = "atlantis"
    #this captures the three variables above
    return name,country,city
functiona()

def functionb():
    myname,mycountry,mycity = functiona()
    print(myname)
    print(mycountry)
    print(mycity)

functionb()

Result:
Joseph
Mars
atlantis

Passing arguments using *args as the parameter

In order to understand this, before we look at some code we need to understand what are *args?

*args essentially allows you to pass multiple positional arguments to a function, and this can vary over time.

As a result, you don’t have to assign the positional arguments to more than one variable. The *args take in all the arguments, no matter how long or short they are and then return a value.

def addnumbers(*args):
    result = 2
    for x in args:
        result = result * x
    return result

print(addnumbers(1,2,3))

Result:
12

Passing arguments using **kwargs as the parameter

In contrast to the above, this will process a set of values that have a name attached to them.

As can be seen in the previous example the arguments passed had no name attached to them.

Let’s look closely at an example. In this scenario, the arguments are assigned a value of a,b,c.

def addnumbers(**kwargs):
    result = 2
    for x in kwargs.values():
        result = result * x
    return result

print(addnumbers(a=1,b=2,c=3))

Result:
12

Here it is clear that the logic takes in each value of a,b,c processes it and then returns the result required.

In essence, both *args and **kwargs give you the same answer, it just depends on other parts of your code and how it is processing your data.

For example, in the **kwargs it could be that the values of a,b,c could be populated by other variables within the program.

So in that scenario, the variable that has been passed on can change and not be static.

So in summary:

a. Arguments can be passed multiple ways to functions, the parameters then just process them.

b. Parameters also don’t have to be assigned to a variable, they can be read in through *args.

c. On the other hand you can have multiple arguments read in at once by using the **kwargs.

How to Automate Testing With Python | unittest

Estimated reading time: 5 minutes

So you are working with large amounts of data, and the data is having transformations applied to it or quite simply you need to see as you move data, has it changed in any way.

The question that arises is if I have lots of validations to complete, how can I quickly see if they pass or fail?

Here we will take you through unittest, a Python testing framework that will enable a quick way to validate data you are working with.

Step 1 – Define your test scenarios

Like in any testing, the first step is to define what you want to test and what is your expected outcome. Here we are testing data that was transformed into a different value.

The test then looks at the new values and compares them to a list of values that are expected in the output.

As can be seen, the above two rows will be what we will use to create our test cases, which will be outlined below.

While the below may seem very strict, they emphasise exactly what we are testing for. Either all data passes the test or it does not pass.

In reality, these rules can be amended/relaxed depending on what you need to test for, this is purely for illustrative purposes:

Step 2 – Create our python file that will pass in the transformed data

This is the first file we need, it will be the source file that will be used to transform the data and then pass to the testing module later on.

In this instance, it is called “automate_testing_python.py”

I have created a dictionary here, but you can read in a CSV file or any file that you wish. Just make sure that it is passed to the data frame.

import pandas as pd

data = {'Name_before':['Joe','Liz','Pat'],'Name_after':['Joseph','Elizabeth','Patrick']} #===> Creating a dictionary

df = pd.DataFrame(data) #===> Creating a dataframe


def main():

    transfromations_applied = data['Name_after'] #===> Defines the column you want to test against
    return transfromations_applied #===> Returns the values that will be used in the automation script

The function “main” above defines clearly what column holds the data to be passed to the automated test, in this instance ” transfromations_applied “

Step3 – Create a test script file

In the below there are a number of steps to follows:

  1. Install the package HTMLTestrunner in whatever solution you are using.
  2. Install the Unittest library to the same program. This will perform the tests asked and provide the result outputs.
  3. Next we are importing what was returned from “main” function in the file “automate_testing_python.py”
  4. The next step is the creation of the class, this serves the following purposes:
    1. Defines the exact tests we want to carry out.
    2. In the first method, it clearly defines what values are expected and what should be tested against.
    3. The second method is purely checking that all the values are alphabetic.
  5. The last line is running a file that will run the tests and then return the values to the output html file.
from HTMLTestRunner import HTMLTestRunner
import unittest

from automate_testing_python import main #===> Imports the data that was returned


#This is the code that performs the tests
#Each method is a specific test
class test_values(unittest.TestCase):


def test_ValueEquals(self):

ExpectedValues = ['Joseph','Elizabeth','Patrick']
values_received = main()
self.assertListEqual(ExpectedValues,values_received)

def test_Valueisalphabetic(self):

values_received = main()
for c in values_received:
self.assertTrue(c.isalpha())



if __name__ == "__main__":
HTMLTestRunner.main()

Step 4 – Create the File that will generate the output HTML file with the results.

As a result of the above steps we have created the data, brought it in to test against predefined test scenarios.

Now, all we have to do is present the findings in an HTML file.

In this instance, the output will be easier to understand, and quicker to digest.

In the below code, it follows the below steps:

(A) The first and second lines import the libraries we will need to run this bit of code.

(B) The third line is receiving the values that were generated from the class in step 3.

(C) All the next four lines do, is pull in the data, attach it to the myreport.html file

(D) The last two lines do the following:

  1. Tell the program to take the data received and then use the HTMLTestRunner package to process it, and create the template output.
from HTMLTestRunner import HTMLTestRunner
import unittest
from test_automate_testing_python import test_values

if __name__=='__main__':
    suite=unittest.makeSuite(test_values)
    filename='C:\\Users\haugh\OneDrive\dataanalyticsireland\YOUTUBE\how_to_automate_testing_with_python\myreport.html'
    fp=open(filename,'w')
    runner=HTMLTestRunner.HTMLTestRunner(fp,title=u'Data Analytics Ireland Test Output',description=u'This shows the testing output result')
    runner.run(suite)

Step 5 – Run the file html_report.py

Running this file will also open and run the code in the other files referenced in this post.

As a result, when the program is finished the output will look like this:

Earlier in the program we created it a dictionary (STEP 1) like this:

data = {‘Name_before’:[‘Joe’,’Liz’,’Pat’],’Name_after’:[‘Joseph’,’Elizabeth’,’Patrick’]} #===> Creating a dictionary

and then in Step2 in our class “test_values” we defined the expected outputs.

def test_ValueEquals(self): ExpectedValues = [‘Joseph’,‘Elizabeth’,‘Patrick’]

In this instance, the values are the same, the above output will mark the two tests as a Pass.

But if the values passed in are:

data = {‘Name_before’:[‘Joe’,’Liz’,’Pat’],’Name_after’:[‘Joseph’,’Elizabeth’,‘Patrick1’]} #===> Creating a dictionary

and are compared to

def test_ValueEquals(self): ExpectedValues = [‘Joseph’,‘Elizabeth’,‘Patrick’]

The following output will appear, as the two tests will fail:

So we have created two test cases here, but we could add loads more and then have the automation quickly return the test results for us.

ValueError: cannot convert float NaN to integer

Estimated reading time: 2 minutes

Sometimes in your data analytics project, you will be working with float data types and integers, but the value NaN may also appear, which will give you headaches if you don’t know how to fix the problem at hand.

A NaN is defined as “Not a Number” and represents missing values in the data. If you are familiar with SQL, it is a similar concept to NULLS.

So how does this error occur in Python?

Let’s look at some logic below:

NaN =float('NaN')
print(type(NaN))
print(NaN)

Result:
<class 'float'>
nan

As can be seen, we have a variable called ‘NaN’, and it is a data type ‘Float’

One of the characteristics of NaN is that it is a special floating-point value and cannot be converted to any other type than float; thus, when you look at the example below, it shows us this exactly and why you would get the error message we are trying to solve.

NaN =float('NaN')
print(type(NaN))
print(NaN)

a= int(NaN)

print(a)

Result:

Traceback (most recent call last):
  File "ValueError_cannot_convert_float_NaN_to_integer.py", line 5, in <module>
    a= int(NaN)
ValueError: cannot convert float NaN to integer

In the variable ‘a’ we are trying to make that an integer number from the NaN variable, which, as we know, is a floating-point value and cannot be converted to any other type than float.

How do we fix this problem?

The easiest way to fix this is to change the ‘NaN’ actual value to an integer as per the below:

NaN =float(1)
print(type(NaN))
print(NaN)

a= int(NaN)

print(a)
print(type(a))

Result:
<class 'float'>
1.0
1
<class 'int'>

So, in summary, if you come across this error:

  1. Check to see if you have any ‘Nan’ values in your data.
  2. If you do replace them with an integer value, or a value that you need for your project, that should solve your problem.

How to Create an XML file from Excel using Python

Estimated reading time: 3 minutes

Are you working on a data analytics project where you need to feed your data to a location that is able to process an XML file?

The ability to get your data into a structured format like XML has many benefits:

(A) You can transfer the data to a web service for processing.

(B) Multiple different formats of your raw data can be standardised, enabling quick conversion and processing.

(C) XML files can be read by multiple different programs, as long as you deliver them in the correct format.

(D) The receiver of data can easily read the XML file and store it on their database.

The ability to use this method to read and transfer data is a very powerful way to help a data analyst process large data sets.

In fact, if you are using cloud-based applications to analyse the information you are storing, this will quickly enable you to deliver the data.

What packages do I need in Python?

The first step is to import the following:

import pandas as pd
from lxml import etree as et

Next we want to read in the source data

In this instance, we are reading an excel file

raw_data = pd.read_excel(r'Link to where your data is stored including the full file name')

Now we want to start building the XML structure

The FIRST STEP is to define the root

root = et.Element('root')

The root is the parent of all the data items (tags) contained in the XML file and is needed as part of the structure

The SECOND STEP is to define the tag names that will store each row of the source data

for row in raw_data.iterrows(): ==> This is a loop that takes runs through each record and populates for each tag.
    root_tags = et.SubElement(root, 'ExportData') #=== > Root name
# These are the tag names for each row (SECTION 1)
    Column_heading_1 = et.SubElement(root_tags, 'Name')
    Column_heading_2 = et.SubElement(root_tags, 'Area')
    Column_heading_3 = et.SubElement(root_tags, 'NoPurchases')
    Column_heading_4 = et.SubElement(root_tags, 'Active')

###These are the values that will be populated for each row above
# The values inside the [] are the raw file column headings.(SECTION 2)
    Column_heading_1.text = str(row[1]['Name'])
    Column_heading_2.text = str(row[1]['Area'])
    Column_heading_3.text = str(row[1]['No Purchases'])
    Column_heading_4.text = str(row[1]['Active'])

The raw file looks like this:

NameAreaNo PurchasesActive
JohnDublin2Y
MaryGalway3N
JoeLimerick4N
JimmyKilkenny55Y
JenniferBelfast6N
SusanWaterford3Y
JakeCork1Y
BobbyDundalk11N
SarahSligo9N
CianEnnis8Y
Raw file data that will be imported into the XML file

The THIRD STEP is to create the XML file and populate it with the data from the source file

# This Section outputs the data to an xml file
# Unless you tell it otherwise it saves it to the same folder as the script.
tree = et.ElementTree(root) ==> The variable tree is to hold all the values of "root"
et.indent(tree, space="\t", level=0) ===> This just formats in a way that the XML is readable
tree.write('output.xml', encoding="utf-8") ==> The data is saved to an XML file

The XML output should look like the below

<root>
	<ExportData>
		<Name>John</Name>
		<Area>Dublin</Area>
		<NoPurchases>2</NoPurchases>
		<Active>Y</Active>
	</ExportData>
	<ExportData>
		<Name>Mary</Name>
		<Area>Galway</Area>
		<NoPurchases>3</NoPurchases>
		<Active>N</Active>
	</ExportData>
	<ExportData>
		<Name>Joe</Name>
		<Area>Limerick</Area>
		<NoPurchases>4</NoPurchases>
		<Active>N</Active>
	</ExportData>
	<ExportData>
		<Name>Jimmy</Name>
		<Area>Kilkenny</Area>
		<NoPurchases>55</NoPurchases>
		<Active>Y</Active>
	</ExportData>
	<ExportData>
		<Name>Jennifer</Name>
		<Area>Belfast</Area>
		<NoPurchases>6</NoPurchases>
		<Active>N</Active>
	</ExportData>
	<ExportData>
		<Name>Susan</Name>
		<Area>Waterford</Area>
		<NoPurchases>3</NoPurchases>
		<Active>Y</Active>
	</ExportData>
	<ExportData>
		<Name>Jake</Name>
		<Area>Cork</Area>
		<NoPurchases>1</NoPurchases>
		<Active>Y</Active>
	</ExportData>
	<ExportData>
		<Name>Bobby</Name>
		<Area>Dundalk</Area>
		<NoPurchases>11</NoPurchases>
		<Active>N</Active>
	</ExportData>
	<ExportData>
		<Name>Sarah</Name>
		<Area>Sligo</Area>
		<NoPurchases>9</NoPurchases>
		<Active>N</Active>
	</ExportData>
	<ExportData>
		<Name>Cian</Name>
		<Area>Ennis</Area>
		<NoPurchases>8</NoPurchases>
		<Active>Y</Active>
	</ExportData>
</root>

Additional XML data can be added

  1. Add more rows – All you need to do is add onto the source file and save. When you rerun the logic it will read in the extra information.
  2. Add more columns – All you need to do is go to the second step above add in a tag name to SECTION 1. Seperately you will need to add an additional column with data to the source file, and then add that column name to SECTION 2 as well

How to Add Formulas to Excel using Python

Estimated reading time: 3 minutes

You may be working on automating some exports to Excel using Python to compare files or just pure and simple adding formulas to an Excel file before you open it up.

Here we explain adding formulas to your Excel output using Numpy or adding the calculations to specific cells in the output.

Adding formulas to specific cells

First of all, let’s look at the normal spreadsheet with some calculations, these have the formulas typed in. The ultimate objective is to have the Python code do this for us, one less step.

As can be seen, the cells have the formulas in them, but this would be a very time-consuming process if you had to do it multiple times, in multiple spreadsheets.

To get around this we can write the Python logic as follows:

  1. Create three lists and three dataframes as follows.
datasetA_list = np.array([1,2,3,4,5,6,7,8,9,10])

datasetB_list = np.array([9,8,65,43,3,21,3,2,1,7])

dataset_list = ('sum','average','median','standard deviation','count','correlation')


datasetA = pd.DataFrame(datasetA_list,columns=['ValueA'])
datasetB = pd.DataFrame(datasetB_list,columns=['ValueB'])
dataset_list_calcs = pd.DataFrame(dataset_list, columns=['Calcs'])

2. Next create a path to where you are going to store the data as follows:

path = 'output.xlsx'

3. In this next step create the workbook and location where the data will be stored. This will load the headings created in step 1 to a particular location on the spreadsheet.

workbook = pd.ExcelWriter(path, engine='openpyxl')
workbook.book = load_workbook(path)
workbook.sheets = dict((ws.title,ws) for ws in workbook.book.worksheets)

datasetA.to_excel(workbook,sheet_name="Sheet1", startrow=1,index=False, header=True,)
datasetB.to_excel(workbook,sheet_name="Sheet1", startrow=1, startcol=2,index=False, header=True)
dataset_list_calcs.to_excel(workbook,sheet_name="Sheet1", startrow=1, startcol=4,index=False, header=True)

4. Load the formulas into cells besides their relevant headings. This should line post these formulas beside the relevant heading created in step 1.

###Creating calculations for datasetA

sheet = workbook.sheets['Sheet1']
sheet['E2'] = 'CalcsA'
sheet['F3'] = '=SUM(A3:A12)'
sheet['F4'] = '=AVERAGE(A3:A12)'
sheet['F5'] = '=MEDIAN(A3:A12)'
sheet['F6'] = '=STDEV(A3:A12)'
sheet['F7'] = '=COUNT(A3:A12)'
sheet['F8'] = '=CORREL(A3:A12,C3:C12)'

###Creating calculations for datasetB

sheet = workbook.sheets['Sheet1']
sheet['H2'] = 'CalcsB'
sheet['H3'] = '=SUM(C3:C12)'
sheet['H4'] = '=AVERAGE(C3:C12)'
sheet['H5'] = '=MEDIAN(C3:C12)'
sheet['H6'] = '=STDEV(C3:C12)'
sheet['H7'] = '=COUNT(C3:C12)'
sheet['H8'] = '=CORREL(A3:A12,C3:C12)'

Use Numpy to create the calculations

a. Create the calculations that you will populate into the spreadsheet, using Numpy

a = np.sum(datasetA_list)
b = np.average(datasetA_list)
c = np.median(datasetA_list)
d = np.std(datasetA_list,ddof=1) ## Setting DDOF = 0 will give a differnt figure, this corrects to match the output.
f = np.count_nonzero(datasetA_list)
g = np.corrcoef(datasetA_list,datasetB_list)

b. Create the headings and assign them to particular cells

sheet['E14'] = 'Numpy Calculations'
sheet['E15'] = 'Sum'
sheet['E16'] = 'Average'
sheet['E17'] = 'Median'
sheet['E18'] = 'Standard Deviation'
sheet['E19'] = 'Count'
sheet['E20'] = 'Correlation'

c. Assign the variables in step a to a set of cells

sheet['F15'] = a
sheet['F16'] = b
sheet['F17'] = c
sheet['F18'] = d
sheet['F19'] = f
sheet['F20'] = str(g)

d. Save the workbook and close it – This step is important, and always include.

workbook.save()
workbook.close()

And the final output looks like…