Data Analytics Ireland

Data Analytics and Video Tutorials

Data cleansing in a business environment

Posted on April 16, 2020 / December 29, 2020 By admin

I was on LinkedIn recently and, looking at my profile, saw a post I had written about six years ago on data cleansing. One thing that struck me was that the topics I brought up then are as relevant now as they were then, and with Big Data now mainstream, many companies are wondering how to manage all this data in an ever-changing landscape. So I thought I would share it again.

The Business case

(A) Test that data meets industry requirements

In some industries, it is a legal requirement that all your data is displayed in the correct format and with the correct description. Any information that should not be included should be removed based on business rules. Today companies operate across multiple platforms (electronic, print, video, etc.), so a process needs to be in place to make sure the data is in sync.

(B) Check for unwanted words

Branding and reputation are critical, and businesses large and small need a mechanism to understand what is being written online in connection with any of their profiles. Data cleansing can be the first port of call for spotting unwanted words that could damage the brand.
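
As a rough illustration of this kind of check, the sketch below scans a few pieces of text for words on a block list. The word list, the sample posts, and the function name find_unwanted_words are all made up for this example; a real list would come from the business.

```python
import re

# Hypothetical block list; in practice this would be agreed with the business.
UNWANTED_WORDS = {"scam", "fraud", "fake"}


def find_unwanted_words(text, unwanted=UNWANTED_WORDS):
    """Return the unwanted words found in text, sorted alphabetically.

    Matching is case-insensitive and on whole words only.
    """
    words = set(re.findall(r"[a-z']+", text.lower()))
    return sorted(words & set(unwanted))


# Made-up sample posts mentioning a company profile.
posts = [
    "Great service from this company!",
    "This company is a SCAM, total fraud.",
]

# Map each post to the unwanted words it contains (empty list = nothing found).
flagged = {post: find_unwanted_words(post) for post in posts}
```

Any post with a non-empty list in flagged could then be passed on for a manual review.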

The Technical case

(A) Remove unwanted characters such as !"£$%^&*@';:#~?>< etc.

When you receive a set of data from another source, it may be in a raw format. If you then try to group the words or numbers in your dataset, stray characters can lead to incorrect groupings.
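
One possible way to strip such characters in Python is with a regular expression. This is a minimal sketch that assumes only unaccented letters, digits, and spaces should survive; the function name and sample values are invented for illustration, and a dataset with accented names would need a wider pattern.

```python
import re


def remove_unwanted_characters(value):
    """Keep letters, digits and spaces; drop everything else, then tidy whitespace."""
    cleaned = re.sub(r"[^A-Za-z0-9 ]", "", value)
    # Collapse runs of spaces and trim the ends.
    return re.sub(r"\s+", " ", cleaned).strip()


# Made-up raw values of the kind a third-party feed might contain.
raw = ["Facebook!", "  Twitter## ", "Lin£ked$In"]
clean = [remove_unwanted_characters(v) for v in raw]
```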

(B) Group data

Sometimes you will need to group specific names or numbers to see how often they appear. This can become problematic if an initial review of the data was not carried out, as in point A above. Say you want to check for Facebook in your dataset and there are six occurrences of it: 4 × "Facebook!" and 2 × "Facebook". Your grouping will treat these as two different values, giving you the wrong analysis.
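
The Facebook example can be reproduced with the standard library's Counter. The sketch below counts the values before and after a simple clean-up; the helper strip_symbols is a hypothetical name, and it assumes that only letters, digits, and spaces matter for grouping.

```python
from collections import Counter

# Six occurrences, four of them with a trailing exclamation mark.
values = ["Facebook!"] * 4 + ["Facebook"] * 2

# Grouping the raw values splits them into two groups.
raw_counts = Counter(values)


def strip_symbols(value):
    """Keep only letters, digits and spaces (an assumed cleansing rule)."""
    return "".join(ch for ch in value if ch.isalnum() or ch == " ").strip()


# Grouping after cleansing gives a single group with all six occurrences.
clean_counts = Counter(strip_symbols(v) for v in values)
```

Here raw_counts holds two separate keys, while clean_counts holds one key, "Facebook", with a count of 6.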

(C) Prepare data

The main reason for cleansing data is to have it ready to be processed. More often than not, cleansing is an initial step before further processing starts. The main processing will have built-in controls to make sure the data is in the correct format; if it is not, the processing will fail. This makes the cleansing step crucial in an automated process.
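
A pre-check like this might look as follows. The record layout (name, date, amount), the expected date format, and the function name is_valid_record are assumptions made for the example, not rules from any particular system.

```python
from datetime import datetime


def is_valid_record(record):
    """Check an assumed format: non-empty name, YYYY-MM-DD date, numeric amount."""
    try:
        datetime.strptime(record["date"], "%Y-%m-%d")
        float(record["amount"])
    except (KeyError, ValueError, TypeError):
        return False
    return bool(str(record.get("name", "")).strip())


# Made-up records; the second has the wrong date format, the third a bad amount.
records = [
    {"name": "Acme", "date": "2020-04-16", "amount": "100.50"},
    {"name": "Beta", "date": "16/04/2020", "amount": "75"},
    {"name": "Gamma", "date": "2020-04-17", "amount": "n/a"},
]

# Only records that pass go forward; the rest are held back for correction.
ready = [r for r in records if is_valid_record(r)]
rejected = [r for r in records if not is_valid_record(r)]
```

Splitting the records this way means the automated load only ever sees data that should pass its own controls.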

(D) Check for null values

Sometimes a system is set up to process data or to receive data from a third-party vendor. It might be imperative that specific fields are not empty, or are empty, depending on the business need. Identifying those values early, as part of the data cleansing process, should help to mitigate problems before the data gets loaded into systems that have strict controls on them.
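
As a sketch of such a check, the code below reports fields that break an "must not be empty" or "must be empty" rule. The field names, the two rule lists, and the sample rows are hypothetical; treating None and blank strings as empty is also an assumption.

```python
# Hypothetical business rules for a vendor file.
REQUIRED_FIELDS = ["customer_id", "email"]   # must not be empty
MUST_BE_EMPTY = ["internal_notes"]           # must be empty


def is_empty(value):
    """Treat None and blank strings as empty (an assumed definition)."""
    return value is None or str(value).strip() == ""


def null_check(record):
    """Return a list of rule violations for one record."""
    problems = []
    for field in REQUIRED_FIELDS:
        if is_empty(record.get(field)):
            problems.append(f"{field} is empty")
    for field in MUST_BE_EMPTY:
        if not is_empty(record.get(field)):
            problems.append(f"{field} should be empty")
    return problems


# Made-up rows: the first passes, the second breaks all three rules.
rows = [
    {"customer_id": "C001", "email": "a@example.com", "internal_notes": ""},
    {"customer_id": "", "email": None, "internal_notes": "draft"},
]

# Row index mapped to its list of problems (empty list = clean row).
report = {i: null_check(row) for i, row in enumerate(rows)}
```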

Articles Tags: Data, Data Analytics, Data Cleansing, Testing


Copyright © 2023 Data Analytics Ireland.
