Complete Guide to Cross Validation

In this video Rob Mulla discusses an essential skill that every machine learning practitioner needs to know: cross validation. We go through examples of scikit-learn cross validation in Python code; scikit-learn has many cross-validation splitters built in. Without cross validation it's easy to overfit your model and overstate its predictive power. This video is a must-watch for anyone trying to learn machine learning.

Timeline:

00:00 Intro
01:37 Setup
03:41 The Dataset
06:56 The wrong way
10:20 Holdout check and baseline
12:50 Train/Test Split
15:25 Cross Validation
24:14 Applying Cross Validation
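
The holdout and cross-validation steps listed in the timeline can be sketched roughly as follows. This is a minimal illustration, not the code from the video: the synthetic data from `make_classification`, the `LogisticRegression` model, and all variable names are assumptions chosen only to keep the example self-contained.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, train_test_split

# Synthetic stand-in data (assumption); the video works with its own dataset.
X, y = make_classification(
    n_samples=1000, n_features=20, weights=[0.8, 0.2], random_state=0
)

# Set aside a holdout set that is not used while developing the model.
X_dev, X_hold, y_dev, y_hold = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# 5-fold stratified cross validation on the development data.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
oof_pred = np.zeros(len(y_dev))  # out-of-fold predictions
fold_auc = []

for train_idx, val_idx in skf.split(X_dev, y_dev):
    model = LogisticRegression(max_iter=1000)
    model.fit(X_dev[train_idx], y_dev[train_idx])
    # Each row is predicted by a model that never saw it during training.
    oof_pred[val_idx] = model.predict_proba(X_dev[val_idx])[:, 1]
    fold_auc.append(roc_auc_score(y_dev[val_idx], oof_pred[val_idx]))

oof_auc = roc_auc_score(y_dev, oof_pred)
print("AUC per fold:", np.round(fold_auc, 3))
print("Out-of-fold AUC:", round(oof_auc, 3))
```

The out-of-fold score uses every development row exactly once as validation data, which is why it is usually a more stable estimate than a single train/test split.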

#python #machinelearning #datascience
Comments

Hi Rob, thanks for the nice explanation of cross-validation. After you run the cv method, what do you use as your final model to predict new data? My 1st thought is to use the best performing fold, but that seems to defeat the purpose of cross-validation and would be prone to overfitting. Would you use all 5 folds then average the predictions similar to how you calculated the out of fold AUC score? Or perhaps just train on all your data since you have an idea that your model will perform at around 0.83 AUC with unseen data?

casualgamer
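
Two of the options raised in this question, averaging the predictions of the fold models and retraining once on all the training data, might be sketched like this. Both are common practice; which one the video's author prefers is not stated here. The data, the model, and every name below are assumptions for illustration only.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, train_test_split

# Synthetic stand-in data (assumption). X_new plays the role of unseen data.
X, y = make_classification(n_samples=600, n_features=10, random_state=1)
X_train, X_new, y_train, _ = train_test_split(
    X, y, test_size=0.25, random_state=1
)

base = LogisticRegression(max_iter=1000)
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

# Option A: keep one model per fold and average their predictions.
fold_models = [
    clone(base).fit(X_train[train_idx], y_train[train_idx])
    for train_idx, _ in skf.split(X_train, y_train)
]
avg_pred = np.mean(
    [m.predict_proba(X_new)[:, 1] for m in fold_models], axis=0
)

# Option B: use the CV score only as a performance estimate,
# then retrain a single model on all of the training data.
full_model = clone(base).fit(X_train, y_train)
full_pred = full_model.predict_proba(X_new)[:, 1]

print("Averaged fold predictions:", np.round(avg_pred[:5], 3))
print("Single retrained model:   ", np.round(full_pred[:5], 3))
```

Picking only the best-scoring fold model is not shown, for the reason the question itself gives: it selects on validation noise.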

The best explanation of CV on YouTube, Rob is an ML beast, thank you.

MOAMA

This is the only video I've found on the net that explains not only cross-validation itself but also how to divide the data into different sets. Most of the tutorials and articles I've found split the data into just two parts and evaluate only on the test set, which is wrong; here I've found a proper explanation of the whole process. Awesome video.
In the future I would love to know how to apply different classification metrics when we need high recall (for example, in cancer prediction) or high precision.
Again, thank you for the detailed explanation.

gauravmalik
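
As a rough sketch of the request above, scikit-learn's `cross_validate` accepts several scorers at once, so recall and precision can be tracked per fold alongside AUC. The imbalanced synthetic data and the `class_weight="balanced"` setting are assumptions for illustration; they are not taken from the video.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate

# Imbalanced synthetic data (assumption): about 10% positives.
X, y = make_classification(
    n_samples=1000, n_features=20, weights=[0.9, 0.1], random_state=0
)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# class_weight="balanced" is one illustrative way to favor recall
# on the rare class; threshold tuning is another.
model = LogisticRegression(max_iter=1000, class_weight="balanced")

scores = cross_validate(
    model, X, y, cv=cv, scoring=["recall", "precision", "roc_auc"]
)

for metric in ("recall", "precision", "roc_auc"):
    fold_scores = scores[f"test_{metric}"]
    print(f"{metric}: mean={fold_scores.mean():.3f}, std={fold_scores.std():.3f}")
```

Reporting several metrics per fold makes the trade-off visible: a change that raises recall will often lower precision, and the fold-to-fold spread shows how stable each number is.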

Wow, thanks. I'm going thru ML on the theoretical side rn and it's refreshing to see such applied content! It's a long road ahead, but I believe that if you keep posting vids of such a) relevancy and b) quality with a good c) frequency in a couple of months you will be much bigger. Thanks again!

davidm

Really great tutorial, so thorough and simple to understand. You're a natural tutor

kalianeeboodoo

This is the best CV explanation I've ever watched, and it finally cleared up my confusion. Thanks a lot, sir.

ye-ymjo

Man, I really love your coding performance and all your explanations. You helped me a lot.

robinsonrios

Rob, you are so great! I can come over to your videos many times and never get bored, I need more teachers like you lol, you are a guy who really improves every day, Thanks for supporting the community

eduardomanotas

this is pure gold! thanks for this awesome content !!

ronbzalen

This is amazing. It was so helpful. Thank you Rob!

김성원-jk

Excellent and elegant flow of concepts and implementation.

srimantamukherjee

❤❤❤ You are one of my best professors

АлексейЖеребцов-би

This is superb, I wish to be like Rob one day.

raheemnasirudeen

Finally found a video series on data science that I can understand. Thank you!

thespace

State of the art for cross validation.

oilgas

This is quite helpful. Thank you for this

dgallas

GroupTimeSeriesSplit!
It's implemented in the mlxtend and sklego libraries and appears in some Kaggle notebooks and Stack Overflow answers.

JordiRosell
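
The idea behind a group-aware time series split can be sketched with scikit-learn alone. This is a hand-rolled approximation built on `TimeSeriesSplit`, not the `GroupTimeSeriesSplit` API from mlxtend or sklego. It assumes each row carries a group label (for example, a month index) whose sorted order matches time order; the toy `groups` array is hypothetical.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Hypothetical data: 12 rows in 6 time-ordered groups, 2 rows per group.
groups = np.repeat(np.arange(6), 2)  # [0 0 1 1 2 2 3 3 4 4 5 5]
unique_groups = np.unique(groups)    # sorted; assumed to follow time order

# Split over the groups rather than the rows, so a group is never
# divided between train and test.
tscv = TimeSeriesSplit(n_splits=3)
splits = []
for train_g, test_g in tscv.split(unique_groups):
    train_idx = np.flatnonzero(np.isin(groups, unique_groups[train_g]))
    test_idx = np.flatnonzero(np.isin(groups, unique_groups[test_g]))
    splits.append((train_idx, test_idx))
    print("train groups:", unique_groups[train_g],
          "test groups:", unique_groups[test_g])
```

Every test fold lies strictly after its training data in time, and all rows of a group stay on the same side of the split, which are the two properties a plain shuffled K-fold does not guarantee.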

lol truly a master class, thanks Rob xD

ashraf_isb

Hey Rob, great explanation of the cross-fold technique. Could you do a further video on how to apply these techniques to a deep learning model for an image classification problem? It would be super helpful.

poojagoyal

Amazing video. You should do a time series one next.

koleshjr