Data Cleaning by Juicy Fish from Noun Project
Data cleaning, also known as data cleansing or data scrubbing, is the process of identifying and correcting or removing errors, inconsistencies, and inaccuracies in datasets. Data cleaning is an important step in the data analysis process, as it ensures that the data being analyzed is accurate, complete, and consistent.
Data cleaning involves several tasks, including:
- Removing duplicate data: Duplicate records can skew counts and other results, so it is important to identify and remove them.
- Handling missing data: Missing values can be handled either by removing the affected records or by imputing reasonable values estimated from other data.
- Correcting data errors: Values that are simply wrong, such as typos or misspelled entries, are identified and corrected.
- Handling inconsistent data: Values that conflict with the rest of the dataset, such as formatting errors or values that are out of range, are identified and corrected.
- Standardizing data: Data is represented in a consistent manner, such as using a common date format or ensuring that all values are in the same units.
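
The tasks above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the sample records, the field names (`name`, `joined`, `height_cm`), the two accepted date formats, and the 50 to 250 cm valid range are all hypothetical. Treating out-of-range values as missing and filling gaps with the mean is one possible choice among many, not a recommended default.

```python
from datetime import datetime

# Hypothetical sample records; the field names are illustrative only.
raw_records = [
    {"name": "Ada", "joined": "2021-03-05", "height_cm": "170"},
    {"name": "Ada", "joined": "2021-03-05", "height_cm": "170"},   # exact duplicate
    {"name": "Ben", "joined": "05/04/2021", "height_cm": ""},      # missing value, different date format
    {"name": "Cy", "joined": "2021-06-01", "height_cm": "1750"},   # out-of-range value
]

# Assumed input formats: ISO dates and day/month/year.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def standardize_date(text):
    """Return the date as YYYY-MM-DD, or None if no assumed format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def clean(records, min_height=50.0, max_height=250.0):
    seen = set()
    cleaned = []
    for rec in records:
        # Remove duplicates: skip any record whose fields were all seen before.
        key = tuple(sorted(rec.items()))
        if key in seen:
            continue
        seen.add(key)

        # Handle missing data: an empty string becomes None.
        text = rec["height_cm"].strip()
        height = float(text) if text else None

        # Handle inconsistent data: treat out-of-range values as missing.
        if height is not None and not (min_height <= height <= max_height):
            height = None

        cleaned.append({
            "name": rec["name"].strip(),
            "joined": standardize_date(rec["joined"]),  # standardize the date format
            "height_cm": height,
        })

    # Impute the remaining gaps with the mean of the valid values.
    known = [r["height_cm"] for r in cleaned if r["height_cm"] is not None]
    mean = sum(known) / len(known) if known else None
    for r in cleaned:
        if r["height_cm"] is None:
            r["height_cm"] = mean
    return cleaned


cleaned = clean(raw_records)
for row in cleaned:
    print(row)
```

In practice each of these decisions (what counts as a duplicate, which range is valid, whether to impute or drop) depends on the dataset and should be documented alongside the cleaned data.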
| Tool | Description | Cost | Platform | Categories |
| --- | --- | --- | --- | --- |
| Text Mechanic | Manipulate text in the browser to complete basic cleaning functions. | Free | Browser | Data Cleaning, Text Analysis |
| Salty | Create a messy dataset for your students to clean up through instruction. | Open Source | Linux, Mac, Windows | Data Cleaning, Learning |
| Open Refine | Clean and transform data on your desktop with this powerful, open tool. | Open Source | Linux, Mac, Windows | Data Cleaning |
| RegExr | RegExr is a browser-based tool that allows users to build, test and share regular expressions. | Free | Browser | Data Cleaning, Learning, Programming |
| Breve | View and manipulate "messy" tabular data in order to better identify errors and gaps. | Free | Mac, Browser | Data Cleaning, Visualizations |
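
Regular expressions of the kind RegExr helps build and test are often used for simple text cleaning. The sketch below is a hypothetical example using Python's standard `re` module: it collapses runs of whitespace and rewrites dates to `YYYY-MM-DD`. It assumes the input dates are written month/day/year, which may not hold for a given dataset, and the function name `tidy` is illustrative.

```python
import re

# Any run of whitespace (spaces, tabs, newlines).
WHITESPACE = re.compile(r"\s+")
# Dates such as 3/7/2021; assumed to be month/day/year.
US_DATE = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")


def tidy(text):
    """Collapse whitespace and rewrite month/day/year dates as YYYY-MM-DD."""
    text = WHITESPACE.sub(" ", text).strip()
    return US_DATE.sub(
        lambda m: f"{m.group(3)}-{int(m.group(1)):02d}-{int(m.group(2)):02d}",
        text,
    )


print(tidy("  Shipped   on 3/7/2021\n"))
```

Testing a pattern interactively against sample values before applying it to a whole column helps catch cases the pattern matches unintentionally.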