
Data cleaning is the process of preparing data for analysis by removing or modifying data that is incorrect, incomplete, irrelevant, duplicated, or improperly formatted.
Requirements
- Python
- pandas
- numpy
Description
According to IBM Data Analytics, you can expect to spend up to 75% of your time cleaning data. Using Python’s pandas library, we’ll walk through a range of data cleaning tasks, concentrating on what is perhaps the largest of them: handling missing values. Data cleaning may be performed interactively with data wrangling tools or as batch processing through scripts.
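
As a rough illustration of the scripted approach to missing values, here is a minimal sketch using pandas and numpy. The sample data and column names (name, age, city) are hypothetical and not taken from any dataset in this project; the choice of the median and the "unknown" placeholder is likewise an assumption, one of several reasonable strategies.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
df = pd.DataFrame({
    "name": ["Ann", "Bob", None, "Dee"],
    "age": [34, np.nan, 29, np.nan],
    "city": ["Oslo", "Lima", "Lima", None],
})

# Count the missing values in each column.
missing_counts = df.isna().sum()

# Option 1: drop every row that contains at least one missing value.
dropped = df.dropna()

# Option 2: fill the gaps instead of dropping rows.
# Assumption: the median suits the numeric column and a placeholder
# string suits the text columns.
filled = df.fillna({
    "age": df["age"].median(),
    "name": "unknown",
    "city": "unknown",
})

print(missing_counts)
print(dropped)
print(filled)
```

Dropping rows is the simplest option but can discard a lot of data (here only one of the four rows survives), while filling keeps every row at the cost of introducing estimated values. Which trade-off is acceptable depends on the analysis.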
