Regular expressions python

Estimated reading time: 3 minutes

Regular expressions explained

A regular expression is a sequence of characters that defines a search pattern, used to find matching text in a piece of data.

The purpose is to define a pattern once and reuse it as many times as the user needs, without having to rebuild the search logic each time.

The patterns are similar to those that you would find in Perl.
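As a small illustration of the reuse idea above, the sketch below compiles a pattern once with Python's built-in re module and applies it to several strings. The pattern and the sample strings are made up for this example.

```python
import re

# Hypothetical pattern for illustration: one or more digits in a row.
NUMBER = re.compile(r"\d+")

# The compiled pattern can be reused on as many strings as needed.
print(NUMBER.findall("Order 66 shipped 3 boxes"))  # ['66', '3']
print(NUMBER.findall("no digits here"))            # []
print(NUMBER.search("room 101").group())           # 101
```

Compiling is optional, since functions such as re.findall accept the pattern as a string, but giving the pattern a name makes it easy to reuse.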

How are regular expressions built?

To start, in regular expressions, there are metacharacters, which are characters that have a special meaning. Their values are as follows:

. ^ $ * + ? { } [ ] \ | ( )

. = Matches any single character except a newline. For example, .e matches an “e” together with the one character before it, and ..e matches an “e” together with the two characters before it.

^ = Checks if a string starts with a particular pattern.

$ = Checks if a string ends with a particular pattern.

* = Matches zero or more repetitions of the preceding character or group, so ab* matches “a”, “ab” and “abb”.

+ = Matches one or more repetitions of the preceding character or group, so ab+ matches “ab” and “abb” but not “a”. If the character does not appear at least once, nothing is returned.

? = Makes the preceding character or group optional, i.e. it matches zero or one occurrence of it.

e.g. t?e is the search pattern and “The” is the string. The result is only “e”, because matching is case-sensitive and the lowercase “t” is optional. If the string is “te”, the result is “te”, as the “t” sits directly before the “e”.

{ } = Matches a set number of repetitions of the preceding character or group. E.g. da{2} checks whether a “d” has two “a” characters following it, so it matches “daa”.

[ ] = Matches any one of the characters listed inside the brackets. [abc] matches “a”, “b” or “c”, and [a-c] gives the same result. Change to uppercase, [A-C], to match only the uppercase letters.

\ = The backslash escapes a metacharacter so that it can be found in a string as a literal value. E.g. \$ finds a literal dollar sign.

| = The “or” operator: a|b matches either the pattern on the left or the pattern on the right.

( ) = Groups part of a pattern so that it can be treated as a single unit, for example to repeat it or to pull out the part of the match it captured.
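To see the metacharacters above in action, here is a short sketch using Python's re module. The sample strings are invented for illustration, and each result is collected in a dictionary so the outcomes can be printed side by side.

```python
import re

# Each entry pairs a pattern from the list above with what it finds.
results = {
    ".e":          re.findall(r".e", "the red pen"),            # ['he', 're', 'pe']
    "^The":        re.search(r"^The", "The end") is not None,   # True
    "end$":        re.search(r"end$", "The end") is not None,   # True
    "ab*":         re.findall(r"ab*", "a ab abb"),              # ['a', 'ab', 'abb']
    "ab+":         re.findall(r"ab+", "a ab abb"),              # ['ab', 'abb']
    "t?e (The)":   re.findall(r"t?e", "The"),                   # ['e']
    "t?e (te)":    re.findall(r"t?e", "te"),                    # ['te']
    "da{2}":       re.findall(r"da{2}", "da daa"),              # ['daa']
    "[a-c]":       re.findall(r"[a-c]", "Cabbage"),             # ['a', 'b', 'b', 'a']
    "\\$\\d+":     re.findall(r"\$\d+", "cost: $25"),           # ['$25']
    "cat|dog":     re.findall(r"cat|dog", "cat and dog"),       # ['cat', 'dog']
    "(\\d+)-(\\d+)": re.search(r"(\d+)-(\d+)", "pages 10-20").groups(),  # ('10', '20')
}

for pattern, outcome in results.items():
    print(f"{pattern:15} -> {outcome}")
```

Writing the patterns as raw strings (the r prefix) stops Python from interpreting the backslashes before the re module sees them.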

 

Special sequences, making it easier again

\A = Matches if the specified characters are at the start of the string being searched. (Note the uppercase A: in Python, lowercase \a stands for the bell character.)

\b = Matches if the specified characters are at the beginning or the end of a word.

\B = Matches if the specified characters are NOT at the beginning or the end of a word.

\d = Matches any digits 0-9.

\D = Matches any character that is not a digit.

\s = Matches where a string contains a whitespace character.

\S = Matches where a string contains a non-whitespace character.

\w = Matches any letter, digit or underscore (_).

\W = Matches any character that is not a letter, digit or underscore (_).

\Z = Matches if the specified characters are at the end of the string. (In Python, write it with an uppercase Z.)
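The special sequences can be tried out in the same way. The sketch below uses Python's re module with made-up sample strings; note that Python writes the start-of-string and end-of-string sequences as \A and \Z.

```python
import re

words = "cat concat catalog"

checks = {
    "A": re.search(r"\Acat", words) is not None,  # True: string starts with "cat"
    "b": re.findall(r"\bcat", words),             # ['cat', 'cat']: "cat" at the start of a word
    "B": re.findall(r"\Bcat", words),             # ['cat']: the "cat" inside "concat"
    "d": re.findall(r"\d", "a1 b22"),             # ['1', '2', '2']
    "D": re.findall(r"\D", "a1 b"),               # ['a', ' ', 'b']
    "s": re.findall(r"\s", "a b\tc"),             # [' ', '\t']
    "S": re.findall(r"\S", "a b"),                # ['a', 'b']
    "w": re.findall(r"\w", "a_1!"),               # ['a', '_', '1']
    "W": re.findall(r"\W", "a_1!"),               # ['!']
    "Z": re.search(r"end\Z", "the end") is not None,  # True: string ends with "end"
}

for name, outcome in checks.items():
    print(f"\\{name} -> {outcome!r}")
```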

 

 

For further reference and reading material, please see the websites below. The last one is really useful for testing any regular expressions you would like to build:

See further reading material here: regular expression RE explained

Another complementary page to the link above regular expression REGEX explained

I found this link on the internet and would thoroughly recommend you bookmark it. It lets you play around with regular expressions and test them before you put them into your code: Testing regular expressions

 

Python tutorial: How to create a graphical user interface in Tkinter

How would you like to present your data analytics work better?

When starting your data analytics projects, one of the critical considerations is how to present your results quickly and understandably.

Undoubtedly this is true if you are only going to look at the results yourself.

If the work you do is a repeatable process, a more robust, longer-term solution needs to be applied. This is where Tkinter, a Python graphical user interface package, can help.

There are many applications for using Tkinter, such as:

  • Building calculators.
  • Showing graphs and bar charts.
  • Showing graphics on a screen.
  • Validating user input.

Where does this all fit in with data analytics?

Going through a set of data and getting some meaning from it can be challenging. The Python graphical user interface tutorial below can help you build screens that display the output of a repeatable process in a meaningful way.

Ultimately, you could do the following:

  • Build a screen that shows data analytics errors in a data set, e.g. the number of blank column values in a dataset.
  • Another application is to run your analytics to show the results on a screen that can be printed or exported.
  • Similarly, you could also have a screen where a user selects several parameters that are fed into the data analytics code and produces information for the user to analyse.
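As a rough sketch of the first idea in the list above, the code below counts blank values per column and shows the counts in a small Tkinter window. The function names, the sample rows and the definition of "blank" (None, empty or whitespace only) are all assumptions made for this example, not part of any existing project.

```python
def count_blank_values(rows):
    """Count blank values per column.

    `rows` is a list of dicts, one per record. A value counts as blank
    when it is None, empty, or contains only whitespace.
    """
    counts = {}
    for row in rows:
        for column, value in row.items():
            is_blank = value is None or str(value).strip() == ""
            counts[column] = counts.get(column, 0) + (1 if is_blank else 0)
    return counts


def show_counts(counts):
    """Display the counts in a simple Tkinter window (needs a display)."""
    import tkinter as tk  # imported here so the counting works without a GUI

    root = tk.Tk()
    root.title("Blank values per column")
    for column, blanks in counts.items():
        tk.Label(root, text=f"{column}: {blanks} blank").pack(anchor="w", padx=10, pady=2)
    root.mainloop()


# Hypothetical sample data for illustration.
sample_rows = [
    {"name": "Ann", "city": ""},
    {"name": "", "city": "Cork"},
    {"name": "Bob", "city": None},
]
blank_counts = count_blank_values(sample_rows)
print(blank_counts)  # {'name': 1, 'city': 2}
# show_counts(blank_counts)  # uncomment to open the window
```

Keeping the analytics function separate from the screen code means the same counts could later be exported or printed without touching the GUI.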

There are many more ways that you could do this, but one of the most important things is that data analytics can be built into a windows environment using Tkinter that the user would be used to seeing. As a result, this could help to distribute a solution across an enterprise to lots of different users.

The only thing that needs to happen is that the requirements the user needs are defined, and the developer then builds on those, with the data analytics code run in the background of this program with Tkinter and output into a user-friendly screen for review.

Tkinter python tutorial


Let’s make the introductions 🙂
Tkinter is a package that allows a programmer to build a graphical user interface (GUI), which a user can then open on a computer screen. There are many different types of GUI apps; examples include a calculator or a text editor that opens when you click it.

Tkinter would be the most commonly used GUI package in Python, due to its simplicity, but PySimpleGUI, PyQt and PySide are other alternatives. Ensure you research these before using them to make sure they are suitable for your needs.

Why use Tkinter?

  • Relatively simple and easy to learn, upskilling is quick.
  • A great introduction to the concepts and ideas for building GUI apps, you will get a good grounding in the techniques and approaches needed.
  • Very well documented, so a programmer should be able to find the answer to anything specific they need to understand.

Now we are introduced, let’s see how to utilise it:

Install Python as usual, and make sure that Tkinter is working and that you have the correct version. Note that the module is imported as tkinter (lowercase) in Python 3.x; in Python 2 it was imported as Tkinter (capital T).
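One way to check that Tkinter is working is sketched below. The helper name describe_tkinter is made up for this example; it simply tries the Python 3 import and reports the Tk version if it succeeds.

```python
import sys


def describe_tkinter():
    """Return a short message saying whether Tkinter can be imported."""
    try:
        import tkinter  # Python 3 name; Python 2 used "Tkinter"
    except ImportError as exc:
        return f"tkinter is not available: {exc}"
    return f"tkinter is available (Tk version {tkinter.TkVersion})"


print("Python", sys.version.split()[0])
print(describe_tkinter())
```

Running python -m tkinter from the command line is another quick check: it should open a small test window if everything is installed correctly.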

Please note that capitalization matters in Python code. This will sometimes leave you puzzled as to why some of your code does not work, but the interpreter will usually flag it for you.

While written English puts a capital at the start of a line, the programming language ignores that convention. An example is as follows:

letters = ['a', 'b', 'c']
Print(letters)
gives
NameError: name 'Print' is not defined
whereas
letters = ['a', 'b', 'c']
print(letters)
gives ['a', 'b', 'c']
No errors

When saving your Python script, DO NOT call it tkinter.py as I did: Python will then import your own file instead of the real tkinter module, and the import statement will not work. Call it something like tkinter_test.py.

At the start of the video below, the code will look like this. The first six lines create the Tkinter screen, set its size and add any buttons that will appear on it.

Note that all code should appear between line two and line six, so that the screen output works and looks correct.
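The exact code from the video is not reproduced here, so the following is only a minimal sketch of what such a starting skeleton might look like. The function names, window title and size are assumptions for illustration. The key point it shows is that widgets are added after the window is created and before mainloop() is called.

```python
def geometry_string(width, height):
    """Build the "WIDTHxHEIGHT" string that the geometry() method expects."""
    return f"{width}x{height}"


def build_window(title="My first window", width=400, height=300):
    """Create the main window and its widgets, without starting the event loop."""
    import tkinter as tk  # imported here so the helper above needs no display

    root = tk.Tk()                                 # create the main window
    root.title(title)                              # text in the title bar
    root.geometry(geometry_string(width, height))  # window size in pixels
    # Widgets go here: after the window is created, before mainloop() runs.
    tk.Button(root, text="Close", command=root.destroy).pack(pady=20)
    return root


# To show the window (needs a display):
# build_window().mainloop()
```

mainloop() hands control to Tkinter until the window is closed, which is why anything that should appear on the screen has to be set up before it is called.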

Added to this code in the following video:

  • Button – which will open our YouTube channel
  • An image
  • A clickable link – Which will bring you to our Home Page
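The three additions listed above could be sketched roughly as follows. This is not the code from the video: the URLs are placeholders, and the names make_opener and add_widgets are invented for this example. The standard webbrowser module is used to open pages in the default browser.

```python
import webbrowser

# Placeholder addresses: the real channel and home page URLs are not given in the text.
CHANNEL_URL = "https://example.com/channel"
HOME_URL = "https://example.com/"


def make_opener(url, open_func=webbrowser.open):
    """Return a callback that opens `url` in the browser."""
    def callback(event=None):  # bind() passes an event, Button passes nothing
        open_func(url)
    return callback


def add_widgets(root, image_path=None):
    """Add a button, an optional image and a clickable link to `root`."""
    import tkinter as tk

    tk.Button(root, text="Open YouTube channel",
              command=make_opener(CHANNEL_URL)).pack(pady=5)

    if image_path:  # PhotoImage reads GIF files, and PNG with Tk 8.6 or later
        photo = tk.PhotoImage(file=image_path)
        image_label = tk.Label(root, image=photo)
        image_label.image = photo  # keep a reference so the image is not garbage-collected
        image_label.pack(pady=5)

    link = tk.Label(root, text="Home page", fg="blue", cursor="hand2")
    link.bind("<Button-1>", make_opener(HOME_URL))  # left mouse click
    link.pack(pady=5)
```

To try it, create a window with tkinter.Tk(), pass it to add_widgets, and then call mainloop() on the window.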

A screenshot of the final output is as follows:

See a link to the Python documentation here Tkinter on python.org