How to check if a file is empty

Estimated reading time: 2 minutes

Ever wondered how to go about checking if a file is empty?

A common problem in data analytics arises when you import a file, as outlined in the post Python – How to import data from files: how do you know whether the file is empty before you import it?

In the world of data, there are several reasons to check:

  • You have an automated process that relies on the import not being empty.
  • A process that ran before you received the file did not work.
  • Investigating the problem after the fact takes undue time and effort.

The nuts and bolts of it all

Here we have a video that looks at different scenarios for bringing in files. The following functions appear in the video:

  • os.path.getsize – Returns the size of the file in bytes. * Please see the note below.
  • pd.read_csv
  • pd.read_excel
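
As a rough illustration of the size check, here is a minimal sketch that uses only the standard library. The helper name is_zero_byte_file and the temporary file names are our own assumptions for this example; they are not taken from the video.

```python
import os
import tempfile


def is_zero_byte_file(path):
    """Return True when the file at `path` has a size of exactly zero bytes."""
    return os.path.getsize(path) == 0


# Build two throwaway files so the sketch runs anywhere (hypothetical names).
with tempfile.TemporaryDirectory() as tmp_dir:
    empty_path = os.path.join(tmp_dir, "empty.txt")
    filled_path = os.path.join(tmp_dir, "filled.txt")

    open(empty_path, "w").close()  # creates a file with nothing in it
    with open(filled_path, "w") as handle:
        handle.write("some data\n")

    empty_result = is_zero_byte_file(empty_path)
    filled_result = is_zero_byte_file(filled_path)

print("empty.txt is empty:", empty_result)
print("filled.txt is empty:", filled_result)
```

In an automated process, a check like this could sit in front of the import so that a zero-byte file is skipped or reported rather than loaded.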

The add on bits

*One note about os.path.getsize, which we found:

  • The check shown only works if the size of the file is exactly zero bytes.
  • CSV and XLSX files, even though they were created empty, had a file size greater than zero when saved.
  • TXT files, when created empty and saved, had a file size of zero.
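
The note above suggests that a size check alone can miss a CSV file that holds a header row but no data. One possible way around this, sketched below, is to read the file and check whether any rows came back. This approach is our assumption rather than something shown in the video, and the helper name csv_has_no_rows is hypothetical. In-memory text stands in for real files so the sketch is self-contained.

```python
import io

import pandas as pd
from pandas.errors import EmptyDataError


def csv_has_no_rows(source):
    """Return True when `source` yields no data rows (nothing at all, or a header only)."""
    try:
        frame = pd.read_csv(source)
    except EmptyDataError:
        # pandas raises this when there are no columns to parse, e.g. a zero-byte file.
        return True
    # A header-only file parses into a frame with columns but no rows.
    return frame.empty


zero_bytes = csv_has_no_rows(io.StringIO(""))
header_only = csv_has_no_rows(io.StringIO("id,name\n"))
with_rows = csv_has_no_rows(io.StringIO("id,name\n1,Ann\n"))

print(zero_bytes, header_only, with_rows)
```

The same function accepts a file path in place of the in-memory text, since pd.read_csv takes either.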

 

We hope this video helps explain how empty files can be checked in Python before they are processed.

Thanks!

Data Analytics Ireland

How to remove characters from an imported CSV file

Estimated reading time: 2 minutes

Removing unwanted errors from your data, the easy way.
Data can be imported in many formats, but have you been looking for a video that shows how to do it? Even better, are you looking for a video that shows you how to import a CSV file and then cleanse it, removing any unwanted characters?

This is a follow-up to Python – how do I remove unwanted characters. That video focused on cleansing data created within the code; this video runs through several options to open a CSV file, find and remove the unwanted characters from the dataset, and then return the cleansed data.

How to get in amongst the problem data:

The approach here looked at three different scenarios:

(A) Assigning the result of the open function to a variable, and then reading that variable into a data frame.

(B) Using a “with statement” and the open function together, and returning the output to a data frame.

(C) Using read_csv to quickly and efficiently read in the CSV file and then cleanse the dataset.
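
To give a feel for scenario (B), here is a minimal sketch, not the code from the video. The set of unwanted characters, the helper name load_and_clean and the sample file are all assumptions made for illustration, and the file is assumed to have a header row.

```python
import csv
import io
import os
import re
import tempfile

import pandas as pd

# Hypothetical set of characters to strip; adjust to suit your own data.
UNWANTED = re.compile(r"[#@!$%]")


def load_and_clean(path):
    """Open a CSV with a with statement, strip unwanted characters, return a data frame."""
    with open(path, newline="", encoding="utf-8") as handle:
        cleaned_text = UNWANTED.sub("", handle.read())
    # Extra steps needed to turn the cleaned text into rows and then a data frame.
    rows = list(csv.reader(io.StringIO(cleaned_text)))
    header, data = rows[0], rows[1:]  # assumes the first row is the header
    return pd.DataFrame(data, columns=header)


# Write a small sample file so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "sample.csv")
    with open(sample_path, "w", newline="", encoding="utf-8") as handle:
        handle.write("id,name\n1,A#nn\n2,Bo@b!\n")
    cleaned_frame = load_and_clean(sample_path)

print(cleaned_frame)
```

Note that every value comes back as text here, because the rows are built by hand rather than parsed by pandas.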

Some minor bumps in the road that need some thought

There were some challenges with this that you will see in the video:

  • Options A and B needed additional code just to get the data frame the way we wanted.
  • The additional lines of code added complexity, making them harder to manage.

In the end, read_csv was probably the best way to approach this; it required less code and would have been easier to manage in the longer run.
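
For comparison, a read_csv version might look like the sketch below. The sample data, the column names and the list of unwanted characters are assumptions for illustration only, and in-memory text stands in for a real CSV file.

```python
import io

import pandas as pd

# In-memory stand-in for a CSV file containing some unwanted characters.
raw_csv = io.StringIO("id,name,city\n1,A#nn,Dub@lin\n2,Bo!b,Co$rk\n")

frame = pd.read_csv(raw_csv)

# Strip the (hypothetical) unwanted characters from each text column.
for column in ["name", "city"]:
    frame[column] = frame[column].str.replace(r"[#@!$%]", "", regex=True)

print(frame)
```

Here pandas handles the header and the column types, so the only extra code is the cleansing step itself.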

 

As always thanks for watching, please subscribe to our YouTube channel, and follow us on social media!

Data Analytics Ireland

 

YouTube channel lists – Python working with files

Have you ever wondered how to work with files and get access to them and look at their contents?

In today’s environment, there is a frequent need to work with files: import them, do some work on them, and then either save them somewhere or pass them on for further processing.

Working with files can happen in several ways:

  • You import them from a local or network drive.
  • The files are transmitted electronically to be stored on a destination server and then processed.
  • They could be received via email and need to be extracted and processed.

The videos in this data analytics series will hopefully help to explain the different concepts and ways of working with files that you may need to address, or write a programme to address.

https://www.youtube.com/embed?listType=playlist&list=PL2nlwZUZ5tFLTwTqKHi8cHY8ZH34HQSJM

We hope you like the videos. Please subscribe through the social media buttons on the page if you want to hear more from us!

Data Analytics Ireland

 

Python Tutorial: How to import data from files

Estimated reading time: 1 minute

Do you need to quickly open files and import the data into a data frame?

In this post and video on Python, we will look at several options for you to do this as well as some additional things to consider.

The import of files covered here is as follows:

  • Reading data from a CSV file.
  • Reading data from a TXT file.
  • Reading data from an XLSX file.

There are many things to consider when importing; here are a few:

(A) The file format

(B) How the data looks within the file.

(C) Special requirements to get the data looking correct when loaded.

This file-importing example covers tab delimiters, headers and sorting. If you are looking for alternative approaches, see CSV File Reading and Writing.
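
The three kinds of import could be sketched roughly as follows. The helper names, the column names and the sample data are assumptions for illustration, not the code from the video. The XLSX helper is defined but not run here, because pd.read_excel needs an Excel engine such as openpyxl to be installed.

```python
import io

import pandas as pd


def read_csv_file(source):
    """Read a comma-separated file, taking the first row as the header."""
    return pd.read_csv(source)


def read_tab_delimited(source, column_names):
    """Read a tab-delimited TXT file that has no header row of its own."""
    # header=None says there is no header row; names supplies the column names.
    return pd.read_csv(source, sep="\t", header=None, names=column_names)


def read_xlsx_file(path):
    """Read the first sheet of an XLSX workbook (requires an Excel engine)."""
    return pd.read_excel(path)


# In-memory stand-ins for real files keep the sketch self-contained.
csv_frame = read_csv_file(io.StringIO("id,name\n1,Ann\n2,Bob\n"))
txt_frame = read_tab_delimited(io.StringIO("2\tBob\n1\tAnn\n"), ["id", "name"])

# Sort the tab-delimited data by id once it is loaded.
sorted_frame = txt_frame.sort_values("id")

print(csv_frame)
print(sorted_frame)
```

Each helper also accepts a file path, so the same calls work on files stored on a local or network drive.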


Need to check if a file is empty? Have a look here: Python – How to check if a file is empty

Thanks!

Data Analytics Ireland