K Fold Cross Validation Understanding with an Example || Lesson 43 || Machine Learning

#machinelearning #learningmonkey

In this class, we discuss K-fold cross-validation and understand how it works with an example.

As we discussed in our previous classes, the dataset is divided into three parts:

1) Training data

2) Testing data

3) Validation data

The training data is used as input to the model.

After training the model on the training data, we use the validation data to calculate the model's accuracy.

If the accuracy is not good, we make some modifications to the model, train it again on the training data, and recalculate the accuracy on the validation data.
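
A minimal sketch of this train/validate loop in Python with sklearn (the iris dataset, the decision-tree model, and the split sizes are illustrative assumptions, not from the lesson):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Illustrative dataset; any labelled dataset works the same way.
X, y = load_iris(return_X_y=True)

# Hold out 20% for testing, then carve a validation set out of the rest.
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=0)

model = DecisionTreeClassifier(max_depth=3)  # modify the model here if accuracy is poor
model.fit(X_train, y_train)                  # training data is the input to the model
print(accuracy_score(y_val, model.predict(X_val)))  # accuracy on the validation data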

The problem with this approach is that we are wasting the validation data: it is never used for training.

As we discussed previously, machine-learning algorithms need a lot of data to generalize well.

To overcome this problem, K-fold cross-validation uses both the training and validation data for training.

If k = 5, K-fold divides the entire training data into five parts.

Take the first part for validation and the remaining parts for training.

Train the model and check the accuracy on the validation data.

Next, take the second part for validation and the remaining parts for training.

Calculate the accuracy.

Repeat this until each of the five parts has been used for validation.

The average of all five accuracies is taken as the model's accuracy.

In this approach, all the data is involved in both training and validation.
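
A minimal sketch of 5-fold cross-validation, again with illustrative choices for the dataset and model:

import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
kf = KFold(n_splits=5, shuffle=True, random_state=0)

scores = []
for train_idx, val_idx in kf.split(X):
    model = DecisionTreeClassifier(max_depth=3)
    model.fit(X[train_idx], y[train_idx])            # remaining four parts for training
    pred = model.predict(X[val_idx])                 # one part for validation
    scores.append(accuracy_score(y[val_idx], pred))

print(np.mean(scores))  # average of the five fold accuracies

sklearn also provides cross_val_score, which performs this same loop in a single call.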

Which K is best?

After a lot of practical usage, the K value is taken between 5 and 10, based on the size of the data.

Ten is most commonly used for large datasets.

We should not make the parts very small or very large.

That is why ten is best in most cases.

GridSearchCV is a class in sklearn that we use extensively in machine learning.

All of this K-fold cross-validation is implemented inside that class.
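
A minimal sketch of GridSearchCV running 10-fold cross-validation internally (the parameter grid here is an illustrative assumption):

from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
param_grid = {"max_depth": [2, 3, 4, 5]}  # illustrative hyperparameters to search

# cv=10 means each candidate setting is scored by 10-fold cross-validation.
search = GridSearchCV(DecisionTreeClassifier(), param_grid, cv=10, scoring="accuracy")
search.fit(X, y)

print(search.best_params_, search.best_score_)  # best setting and its mean CV accuracy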

Link for playlists:

Comments

Thank you SO MUCH for this video!! This is the first resource I've found that explains how k-fold CV compares to the traditional three-way split of training/CV/test. You explain it so well and so clearly. Thanks for your efforts! Can't wait to watch more videos on your channel.

DM-qbjm

Helped me a lot. Please never stop doing what you do. It is so

sarvinasalohidinova

Please make more videos so that they will be useful during the lockdown.

ankushpawar

Bro, thank you for your video and explanation! I have one question: I have a dataset of more than 200,000 records, and if I want I can extend it, so I'm not limited. Does it make sense to use K-fold cross-validation in such a case?

mmm-mekk

Hi, it's a very good video. Could you please let me know if cross-validation is done on the train data or the total data?

sivakumarprasadchebiyyam

Thank you for your explanation. But I have a question. At the end, we have 5 trained models. So on which model should we evaluate our test data? On the best model? (For example, if I reached the best accuracy in the 3rd fold, should I use the 3rd model for testing?)

omerarslan

Hi, thank you for this video. Can you offer a reference for solving questions about machine learning? I really need a question pack for machine learning at university.

shokintouch

Why are we taking the average of all the validations? Why don't we choose the validation with the highest accuracy for testing?

BilalAhmad-cnme

I have used k-fold cross-validation with a decision tree for classification (k=10). Training data is 80% and testing data is 20%. But I'm getting a different accuracy each time I run it for each of the ten folds, and I cannot take the average accuracy. What may be the problem? Please help me.

deepthiraosonitha

I know accuracy is (TP + TN) / (TP + TN + FP + FN), but how is it calculated in this case (i.e., 86%, 88%, 91%, etc.)? Please explain.

pFazalSultan

How to calculate the accuracy? Please reply.

pujithagaddale