Data Science Project from Scratch - Part 3 (Data Cleaning)

This is part 3 of the Data Science Project from Scratch Series. In this video I go through how to clean up your data to make it usable for exploratory data analysis (EDA) and model building.

Data cleaning is an extremely important and often overlooked step in the data science lifecycle. Python has some handy functions that allow you to parse and replace data relatively easily. You can also use regular expressions to do this; however, those are a bit beyond the scope of this video. I mostly use lambda functions because I think that this is the simplest approach.
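As a rough illustration of the two options mentioned above (not the code from the video), the sketch below cleans the same strings once with a lambda built from plain string methods and once with a regular expression. The sample strings and the names `clean_with_methods` and `clean_with_regex` are made up for this example.

```python
import re

# Hypothetical messy values; the real data in the video looks different.
raw_values = [" 1,250 USD", "980 USD ", "2,000 USD"]

# Lambda built from plain string methods: drop the unit, the commas,
# and any surrounding whitespace, then convert to an integer.
clean_with_methods = lambda s: int(s.replace("USD", "").replace(",", "").strip())

# Regular-expression version: delete every non-digit character.
clean_with_regex = lambda s: int(re.sub(r"\D", "", s))

print(list(map(clean_with_methods, raw_values)))
print(list(map(clean_with_regex, raw_values)))
```

Both versions give the same numbers here. The string-method lambda is easier to read step by step, while the regex is shorter but would also silently strip characters you might have wanted to notice.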

The first thing that we clean is the data science salary. We need to make sure that it is numeric because we are using that as our dependent variable. We also want to go through and do some light feature engineering. We can get some info about the state of the job postings and the nature of the job postings themselves.
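A minimal sketch of what this step could look like, assuming the raw salary is text such as `$53K-$91K (Glassdoor est.)` and the location is text such as `Albuquerque, NM`. Both formats, the field names, and the helper names are assumptions for illustration, not taken from the video. It uses plain Python, and the same lambdas could be passed to a pandas column's `apply`.

```python
# Hypothetical raw rows; field names and text formats are assumptions.
postings = [
    {"salary": "$53K-$91K (Glassdoor est.)", "location": "Albuquerque, NM"},
    {"salary": "$80K-$120K (Employer est.)", "location": "New York, NY"},
]

# Drop the trailing "(... est.)" note, then the "$" and "K" symbols.
strip_note = lambda s: s.split("(")[0].strip()
strip_symbols = lambda s: s.replace("$", "").replace("K", "")


def parse_salary(raw):
    """Return (min, max, average) salary in thousands of dollars."""
    low, high = strip_symbols(strip_note(raw)).split("-")
    low, high = int(low), int(high)
    return low, high, (low + high) / 2


# Treat the text after the last comma as the state abbreviation.
get_state = lambda loc: loc.split(",")[-1].strip()

for row in postings:
    row["min_salary"], row["max_salary"], row["avg_salary"] = parse_salary(row["salary"])
    row["state"] = get_state(row["location"])

print(postings[0]["avg_salary"], postings[0]["state"])
```

The average of the range is one reasonable choice for a single numeric dependent variable; rows with missing or hourly salaries would need extra handling that this sketch leaves out.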

I went through and looked to see if the postings had python, r-studio, spark, aws, or excel listed and added those as features.
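One simple way to build such features, sketched under the assumption that each posting has a free-text description: lowercase the text and check whether each keyword appears in it. The `SKILLS` list mirrors the tools named above; the function name and sample description are hypothetical.

```python
# Keywords taken from the list above; matching is a plain substring check.
SKILLS = ["python", "r-studio", "spark", "aws", "excel"]

# 1 if the keyword appears anywhere in the lowercased text, else 0.
has_skill = lambda text, skill: 1 if skill in text.lower() else 0


def skill_flags(description):
    """Return a dict of 0/1 flags, one per keyword in SKILLS."""
    return {skill: has_skill(description, skill) for skill in SKILLS}


# Hypothetical job description used only to show the output shape.
sample = "Experience with Python, AWS and Excel required."
print(skill_flags(sample))
```

Substring matching is crude: `excel` would also match "excellent", and `aws` would match "laws", so a stricter word-boundary check may be worth adding if the flags look noisy.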

Again, this is an iterative approach that is rather messy. Please stay tuned for part 4 of the series: EDA!

#DataScience #KenJee #DataScienceProject

Comments

I really like that you don't edit things out. I think the process is much more informational than just the result. This is a good mini-series man, keep it up!

Gyninku

I used to hate lambda functions, but I got a really good understanding of them from this video.
Thank you so much!

samehsayed

I really shouldn’t have laughed at 30:16 when Ken’s like “yea sorry about the sirens. You know, tough times out there”. But it’s nice comic relief when you’re slogging through code like that. Ken, you do a GREAT job at explaining the process end to end! I love this mini-series.

christianscodecorner

It was nice to see near the end of the video that even you had to look something up! It further backed up your video saying that no real data scientist memorizes everything. We appreciate you for all that you're doing!

Mario-oxdm

This is outstanding content. Watching this series before I have started on a project will save me so much time.

gwbraders

I am absolutely LOVING this series! I've been studying for a while and always wondered how an actual data scientist would use all these tools on an actual project. Great work Ken! Just found your channel today and I have a feeling I'll go through your videos really fast

rayneto

This series is exactly what I wanted, and I highly recommend it to anyone who is entering data science.

chirag

Excellent work, Ken. Your 'soup to nuts' approach is extremely helpful to those like me who are brand new to data science. On top of everything, I like how you show us your use of Git and GitHub as well as all of the lambda functions. Keep it up!

MichaelCruz-rchb

Things every Data science beginner needed.
Thanks Man. Keep it up :)

salikmalik

First I went through a couple of minutes of the video where you discussed what had to be done, and started solving it on my own. Once I was done, I came back here to check out how you'd solved it. This was immensely helpful, as my code was not as efficient and I learnt better approaches to the same problem! Thanks Ken!

ashikka

I love your commitment towards teaching
replying and liking each and every comment is not so easy
but that's what makes you special
keep growing and keep sharing knowledge
I hope one day you will reach your expectations
thank you alooootttt

karthikc

Data cleaning process is a very important step and can be very tough at times depending on how messy our dataset is. Thanks for the detailed video Ken!

importdata

Hi Ken, I've been following your work quite some time now. The way you keep your online presence is inspirational. I would love it if you have more step-by-step project videos on Youtube. There are so many areas in which I don't even know if analytics are applied. Sport analytics is interesting for me for example. Also, projects with more practical implications like the regular churn model or HR analytics would add value to your channel, too. I would enjoy watching them on my end at least.

turquoisetravels

Hi Ken... This is really very informative for me as a newbie in DS. Glad to see that, as comfortable as you are using lambdas instead of RegEx to clean up data, you still had to google how to drop a column. :) I thought it was very cool that you didn't edit it out, and it makes me feel better. Very encouraging... keep it up! Thanks a lot for sharing.

limeyboo

I still don't have enough experience in using python but am amazed at all the "magic" that it can do! Great job, Ken! You're amazing! :)

miguelrosales

Alternate title: "Part 3 (Lots of lambda functions)" haha! Great video Ken, really enjoyed the on-the-spot feature engineering

Sambungus

You have ended the hate I have had for Lambda functions (just because I couldn't understand them) in the first 20 mins of the video. Thank you!

joseenrique

I'm a DS Major and this is very helpful. I can see your channel blowing up when all the software engineers are switching over to DS! lol

Itsdanielpeng

This is just amazing Ken ! now I know that's the kind of job I wanna do for a living :) thank you so much for sharing !

elyazidassade

A good series explaining the data science process which a lot of videos and articles ignore. This video in particular would be the time to mention using existing and/or creating a codebook for the dataset.

xA
