298 - What is k-fold cross validation?


Cross validation (according to Wikipedia) is a resampling method that uses different portions of the data to test and train a model on different iterations.

In machine learning, cross validation is used to compare various models and their parameters.

K-fold is a specific data-splitting scheme used for cross validation: it partitions the data into groups for training and testing.

K refers to the number of groups the data gets split into.
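A minimal sketch of the idea described above, using scikit-learn (the dataset and model here are illustrative assumptions, not from the video):

```python
# 5-fold cross validation: the data is split into K = 5 groups; each group
# serves once as the test set while the other 4 are used for training.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

X, y = load_iris(return_X_y=True)

kf = KFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=kf)

print("Fold accuracies:", scores)   # one score per fold
print("Mean accuracy:", scores.mean())
```

The mean of the per-fold scores is what you would use to compare different models or parameter settings.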
Comments

That was very informative. You have both depth of knowledge and a gift for teaching. Thanks.

newcooldiscoveries

Thank you Sreeni for all of your great videos. I have a suggestion: since the start of the channel we have been learning the different ML/DL algorithms and their applications using images. Can you please consider making a series on how to apply them to biomedical signals? Thank you

ausialfrai

Super sir 👍 eagerly waiting for the coding section

vidyasvidhyalaya

Sir....can you please upload a separate video related to FEATURE EXTRACTION USING "SURF" Algorithm for image classification?

vidyasvidhyalaya

u the GOAT !!!.... up there with Tom Brady and MJ

trapbushali

Thanks Sreeni! I want to ask: should we remove outliers before the split, or after the split/CV?

DataAnalytics
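A sketch of the safe approach to the outlier question above: derive outlier bounds from the training split only, and never compute them on the full dataset before splitting (the synthetic data and IQR rule here are illustrative assumptions):

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1))
y = rng.integers(0, 2, size=200)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# IQR-based outlier bounds computed from the training data alone
q1, q3 = np.percentile(X_train, [25, 75])
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Drop outlier rows from the training set only; the test set stays intact,
# so the evaluation still reflects data the model will see in production.
mask = ((X_train >= low) & (X_train <= high)).ravel()
X_train, y_train = X_train[mask], y_train[mask]
```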

Please make a video on ensemble models in deep learning, where you combine the outputs of the base models to build the ensemble model

bitugmasamuel

Love your content. Please keep it coming. I have a few doubts and would appreciate @DigitalSreeni/communities thoughts.

1) Don't you think any preprocessing should happen within the CV loop (@ 13.35) to avoid data leakage? Essentially, in the loop (say for 5-fold CV), 4 folds are for training and the 5th fold is for testing. If you normalized/scaled the data outside the loop, that would constitute data leakage, right?

2) Where should we encode categorical features: before the split, after the split, or within the loop?

3) When we want the final model for production, we use the entire dataset (train + test). All the preprocessing done during CV will be executed on this entire dataset, right? I.e., if standardization was used during CV, then for the final model and for preprocessing future data, the mean and standard deviation will come from this (train + test) data, correct?

Nishant
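The leakage concern in the comment above can be addressed by putting preprocessing inside a scikit-learn `Pipeline`, so that on each fold the scaler is fit only on that fold's training data (dataset and model are illustrative assumptions):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Scaling is a pipeline step, not a one-off transform on the full dataset.
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# cross_val_score clones the pipeline for each fold, so the scaler's mean
# and std are computed from the training folds only -- no data leakage.
scores = cross_val_score(pipe, X, y, cv=5)
print("Mean CV accuracy:", scores.mean())
```

Scaling the whole dataset before splitting, by contrast, lets test-fold statistics leak into training, which tends to give optimistic CV scores.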

Really informative video, as I am now also learning more about cross validation. One question: after doing the cross validation, how should we develop/train the final model?

luthfanhabibi

Scaling before splitting... oops, I think I have made that mistake more than once. 😅

wg