Changing a file is the natural step; tracking those changes is just as important.
Change is part and parcel of life, but in the technology world, with its complex and interdependent systems, not being able to track effectively what goes on leads to:
- Countless hours spent trying to figure out where it went wrong.
- Not understanding what needs fixing.
- Systems and processes that should work seamlessly slowing down unnecessarily.
In data analysis, the volumes can be so large that a person cannot feasibly review a set of data and find the underlying problem. Well, they can, but it would take so long that nothing else would get done. This Forbes article (predictions-about-data-in-2020-and-the-coming-decade) predicts that the consumption of data will only keep growing.
Let the script work, see the log of changes in the output.
The post "How do you remove characters from an imported CSV file" looked at some data cleansing techniques, but there was no way of knowing what had changed other than a visual inspection. Here we load the data set into a data frame, change some of the values and show the output on the screen. More importantly, as we progress through these steps, we save the changes as we go along.
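The steps described above could look something like the following. This is only a minimal sketch, assuming pandas is the data frame library in use; the sample data, the column names and the helper `clean_with_log` are all hypothetical and made up for illustration, not taken from the original script.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the imported CSV file.
RAW_CSV = "name,amount\nAl$ice,10\nBob,2#0\nCarol,30\n"


def clean_with_log(df, column, chars):
    """Strip each character in `chars` from `column`.

    Returns the cleaned copy and a data frame listing every value that changed.
    """
    cleaned = df.copy()
    log = []
    for idx, old in cleaned[column].items():
        new = old
        for ch in chars:
            new = new.replace(ch, "")
        if new != old:
            cleaned.at[idx, column] = new
            # Record the change instead of relying on a visual inspection.
            log.append({"row": idx, "column": column, "old": old, "new": new})
    return cleaned, pd.DataFrame(log, columns=["row", "column", "old", "new"])


# Read every column as text so the unwanted characters survive the import.
df = pd.read_csv(io.StringIO(RAW_CSV), dtype=str)

# Clean one column at a time, keeping the log from each step.
cleaned, name_log = clean_with_log(df, "name", "$#")
cleaned, amount_log = clean_with_log(cleaned, "amount", "$#")
change_log = pd.concat([name_log, amount_log], ignore_index=True)

print(cleaned)
print(change_log)
```

The original data frame is left untouched, so the before and after can still be compared. To keep the trail beyond the screen output, the log could be written out with something like `change_log.to_csv("change_log.csv", index=False)`, where the file name is again just an example.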
The reality is that in large corporate settings, visual inspections would take up too much time and resources. An IT solution that provides this vital information will reduce data errors and give a clearer view of how the company's data has changed over time.
Where does the trail lead to next?
- Changes made to data need a clear way of being tracked.
- How you capture those changes on your systems needs to be addressed.
- Implementing better systems will help you have confidence in your data changes.