Data Cleansing Master Class in Python

Data preparation may be the most important part of a machine learning project. It is often the most time-consuming part, yet it is the least discussed. Data preparation, sometimes referred to as data preprocessing, is the act of transforming raw data into a form that is appropriate for modeling.

Machine learning algorithms require input data to be numeric, and most algorithm implementations enforce this expectation. Therefore, if your data contains data types and values that are not numbers, such as labels, you will need to convert them into numbers. Further, specific machine learning algorithms have expectations regarding the data types, scale, probability distribution, and relationships between input variables, and you may need to change the data to meet these expectations.
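As a rough illustration of turning labels into numbers, the sketch below encodes a categorical column in two common ways: as integers (ordinal encoding) and as binary indicator columns (one-hot encoding). The `colors` column is a hypothetical toy example, not data from the course, and the code uses only the Python standard library; in practice a machine learning library's encoder would normally do this work.

```python
# Hypothetical toy column of categorical labels (an assumption for illustration).
colors = ["red", "green", "blue", "green", "red"]

# Collect the distinct categories; sorting makes the mapping deterministic.
categories = sorted(set(colors))

# Ordinal encoding: map each distinct label to an integer.
label_to_int = {label: index for index, label in enumerate(categories)}
ordinal = [label_to_int[c] for c in colors]

# One-hot encoding: one binary column per category, exactly one 1 per row.
one_hot = [[1 if c == category else 0 for category in categories] for c in colors]

print(categories)
print(ordinal)
print(one_hot)
```

Ordinal encoding implies an order between categories, which may mislead some algorithms when no such order exists; one-hot encoding avoids that at the cost of more columns.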

In this course, you will learn data imputation and advanced data cleansing techniques, and how to apply them to real-world data. You will also learn how to prepare data in a way that avoids data leakage and, in turn, incorrect model evaluation.
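To make the idea of data leakage concrete, here is a minimal sketch, assuming a simple standardization step and a hypothetical toy dataset. The scaling statistics are computed from the training split only and then reused on the test split; computing them on the full dataset would let information about the test data influence the preparation step. The names `values`, `train`, `test`, and `standardize` are illustrative assumptions, not part of the course code.

```python
from statistics import mean, pstdev

# Hypothetical toy feature values (an assumption for illustration).
values = [2.0, 4.0, 6.0, 8.0, 100.0, 120.0]

# Split first, before any statistics are computed.
train, test = values[:4], values[4:]

# Fit the preparation step on the training split only.
mu = mean(train)
sigma = pstdev(train)


def standardize(xs, mu, sigma):
    """Scale values using statistics supplied by the caller."""
    return [(x - mu) / sigma for x in xs]


# Apply the same training statistics to both splits.
train_scaled = standardize(train, mu, sigma)
test_scaled = standardize(test, mu, sigma)

# Leaky alternative, shown only for contrast: statistics from all rows,
# including the test rows the model is supposed to be evaluated on.
leaky_mu = mean(values)

print(mu, leaky_mu)
```

The same fit-on-train, apply-to-test pattern applies to imputation, encoding, and feature selection, and it has to be repeated inside each fold when cross-validation is used.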

By the end of this course, you will be able to perform data preprocessing and will have mastered data cleaning skills.

The complete code bundle for this course is available at https://github.com/PacktPublishing/Data-Cleansing-Master-Class-in-Python

Type
video
Category
publication date
2021-12-17
what you will learn

Prepare data in a way that avoids data leakage
Identify and handle problems with messy data
Know which feature selection method to choose based on the data types
Transform the probability distribution of input variables
Identify and remove irrelevant and redundant input variables
Project variables into a lower-dimensional space

duration
213
key features
Learn how to apply real-world data cleansing techniques to your data
Learn advanced data cleansing techniques
Learn how to prepare data in a way that avoids data leakage and, in turn, incorrect model evaluation
approach
This course is a hands-on guide. It is a playbook and a workbook intended for you to learn by doing and then apply your new understanding to feature engineering in Python. To get the most out of the course, we recommend working through all the examples in each tutorial. If you watch this course like a movie, you will get little out of it.
audience
This course is for you if you are serious about becoming a machine learning engineer in the real world. You will need a solid foundation in Python and should understand the basics of machine learning. Also, you should have some expertise with machine learning libraries.
meta description
A complete step-by-step guide to becoming a machine learning engineer
short description
This course is a complete guide to data cleansing for machine learning engineers. In this course, you will learn data imputation and advanced data cleansing techniques.
subtitle
The Complete Guide to Data Cleansing for Machine Learning Engineers
keywords
Python, Data Preparation, Data Cleansing, Machine Learning
Product ISBN
9781803239040