Removal of unwanted errors in your data, the easy way.
Data can be imported from many formats, but have you been looking for a video on how to do it? Better still, are you looking for a video that shows you how to import a CSV file and then cleanse it by removing any unwanted characters?
This is a follow-up to "Python – how do I remove unwanted characters", a video that focused on cleansing data created within the code. This video runs through several options to open a CSV file, find the unwanted characters, remove them from the dataset and then return the cleansed data.
How to get in amongst the problem data:
The approach here looked at three different scenarios:
(A) Using a variable that is equal to an open function, and then reading the variable to a dataframe.
(B) Using a “with statement” and an open function together, and returning the output to a dataframe.
(C) Using read_csv to quickly and efficiently read in the CSV file and then cleanse the dataset.
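As a rough illustration of scenarios (A) and (B), here is a minimal sketch rather than the exact code from the video. The sample data, the column names, the hypothetical cleanse helper and the rule for what counts as an "unwanted character" (anything that is not a letter, digit or space) are all assumptions made for this example.

```python
import csv
import os
import tempfile

import pandas as pd

# Hypothetical sample file: the name, columns and stray characters are assumptions.
path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(path, "w", newline="") as f:
    f.write("name,amount\nAl#ice,1$0\nB@ob,2!0\n")

# (A) Assign the result of open() to a variable, read from it, then close it by hand.
handle = open(path, newline="")
rows_a = list(csv.reader(handle))
handle.close()
# Extra step: the first row has to be split off and used as the column names.
df_a = pd.DataFrame(rows_a[1:], columns=rows_a[0])

# (B) The same read inside a with statement, which closes the file automatically.
with open(path, newline="") as handle_b:
    rows_b = list(csv.reader(handle_b))
df_b = pd.DataFrame(rows_b[1:], columns=rows_b[0])


def cleanse(df):
    """Hypothetical helper: drop every character that is not a letter, digit or space."""
    return df.apply(lambda col: col.str.replace(r"[^A-Za-z0-9 ]", "", regex=True))


clean_a = cleanse(df_a)
clean_b = cleanse(df_b)
print(clean_a)
```

Both routes end up with the same dataframe, but every value is still a string, so numeric columns such as amount would need a further conversion step before they could be used in calculations.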
Some minor bumps in the road that need some thought
There were some challenges with this that you will see in the video:
- Options A & B had to deploy additional code just to get the dataframe the way we wanted.
- The additional lines of code added complexity, which made the script harder to manage.
In the end, read_csv was probably the best way to approach this; it required less code and would have been easier to manage in the longer run.
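To give a feel for why read_csv needs less code, here is a minimal sketch, not the exact code from the video. The in-memory sample data, the column names and the definition of an unwanted character (anything that is not a letter, digit or space) are assumptions chosen for illustration.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV standing in for a file on disk.
raw = io.StringIO("name,city\nAl#ice,Dub!lin\nB@ob,Co$rk\n")

# read_csv parses the header and rows in one call; dtype=str keeps every value as text.
df = pd.read_csv(raw, dtype=str)

# Remove every character that is not a letter, digit or space, column by column.
clean = df.apply(lambda col: col.str.replace(r"[^A-Za-z0-9 ]", "", regex=True))

print(clean)
```

In practice you would pass a file path to read_csv instead of the StringIO object, and adjust the regular expression to match whichever characters count as unwanted in your own data.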
As always, thanks for watching. Please subscribe to our YouTube channel and follow us on social media!
Data Analytics Ireland