Introduction to Data Science with Python – Preprocessing Dirty Data with Pandas

This tutorial shows how to use Pandas (a data manipulation and analysis library built on top of the Python programming language) efficiently for data preprocessing. We will look at how to think through and implement data cleaning tasks. We will formulate hypotheses about the data and justify why each one holds. The following important concepts in data preprocessing are discussed:
- Checking the data types of fields
- Dealing with missing values
- Splitting a single column into multiple independent fields
- Removing irrelevant columns
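
The four tasks above can be sketched roughly as follows. This is a minimal illustration, not the code from the video: the DataFrame, its column names (`record`, `notes`, `name`, `age`) and its values are all invented for the example, and dropping rows is only one of several ways to handle missing values.

```python
import pandas as pd

# Hypothetical dirty data: column names and values are invented for illustration.
raw = pd.DataFrame({
    "record": ["Alice: 34", "Bob: 27", None, "Dana: 41"],
    "notes": ["n/a", "n/a", "n/a", "n/a"],
})

# 1. Check the data types of the fields.
print(raw.dtypes)

# 2. Deal with missing values (here: drop rows where "record" is missing).
clean = raw.dropna(subset=["record"]).copy()

# 3. Split a single column into multiple independent fields, using the colon as delimiter.
clean[["name", "age"]] = clean["record"].str.split(":", expand=True)
clean["name"] = clean["name"].str.strip()
clean["age"] = clean["age"].str.strip().astype(int)

# 4. Remove columns that are no longer relevant.
clean = clean.drop(columns=["record", "notes"])
print(clean)
```

Converting `age` to an integer after the split is what makes the first step (checking data types) pay off: numeric operations on the new field then behave as expected.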

Improve your data preprocessing skills so that you can quickly get to the insights in your data. How error-free and insight-rich your analysis is depends on how well you preprocess the data.

Some Pandas Methods (Functions) Discussed

Regular Expression Techniques
- captured group
- last character ($ dollar sign)
- lookbehind assertion
- word character (backslash lower case w)
- space character (backslash lower case s)
- question mark quantifier (match zero or one time)
- asterisk quantifier (match zero or more times)
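
Each of the techniques listed above can be tried with Python's built-in `re` module. The snippet below is a small sketch; the sample strings are invented for illustration and are not taken from the tutorial's dataset.

```python
import re

# Hypothetical sample string, invented for illustration.
text = "Name: Ada Lovelace"

# Captured group: only the part inside the parentheses is returned by group(1).
label = re.search(r"(\w+):", text).group(1)

# Lookbehind assertion: match word characters that come right after ": ",
# without including ": " in the match itself.
first_name = re.search(r"(?<=: )\w+", text).group(0)

# \s (space character), \w (word character) and $ (end of the string):
# together they pick out the last word.
last_name = re.search(r"\s(\w+)$", text).group(1)

# Question mark quantifier: the "u" may appear zero or one time.
spellings = re.findall(r"colou?r", "color colour")

# Asterisk quantifier: the "b" may appear zero or more times.
runs = re.findall(r"ab*", "a ab abbb")

print(label, first_name, last_name, spellings, runs)
```

The same patterns can be passed to Pandas string methods such as `Series.str.extract`, which returns the captured groups as columns.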

Python Function
- dir – to get a glimpse of the objects in a module
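
As a quick sketch of how `dir` can be used, the example below lists the names in the standard-library `re` module; the same call works on an imported Pandas module (for example `dir(pd)`). The filtering of underscore-prefixed names is an optional convenience added here, not something the tutorial prescribes.

```python
import re

# dir() returns an alphabetically sorted list of the names defined in a module
# (it works on any other object as well).
names = dir(re)

# Optional: hide private and special names that start with an underscore.
public_names = [n for n in names if not n.startswith("_")]

print(public_names[:10])
```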

Timestamps
00:00 Intro
01:21 Jupyter and Import Pandas Library
02:18 Read Data into Pandas DataFrame
04:18 Count and Find Characters in a String
09:44 Split Column using Colon as Delimiter
12:16 Data Familiarization
13:30 Multiple Steps to Extract Substring – Regular Expression
16:50 Single Step to Extract Substring – Regular Expression
18:31 Split Column into Multiple Fields
21:50 Data Cleaning
25:10 Conclusion
