Machine Learning | Cross Validation | Random State in Train Test Split | ML | AI

Topic to be Covered - Importance of Random State in Train Test Split

Table of Contents
00:00 Introduction
00:14 Import pandas library
00:17 Import dataset using pandas read_csv function
00:39 Handle missing values
00:59 Extract features and labels
01:15 Label Encoding
01:37 Sampling - Train Test Split
02:10 Random State
03:20 Compare X_train value with the previous run when random_state remains the same
04:00 Change the value of random_state from 0 to 1
04:20 Compare X_train value with the previous run when random_state is changed from 0 to 1
06:29 random_state=None
07:28 Compare X_train value with the previous run when random_state=None

Code Starts Here
================
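import pandas as pd
import numpy as np
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split

# NOTE: this listing is partial. Loading the dataset into `df` with
# pd.read_csv and building `features` and `labels` are shown in the video
# but are not included here.

# Get the rows that contain NULL (NaN)

# Fill the NaN values for Occupation, Employment Status and Employment Type
col = ['Occupation', 'Employment Status', 'Employement Type']

# Fill the NaN values for Age and Salary with the column mean
df['Age'] = df['Age'].fillna(df['Age'].mean())
df['Salary'] = df['Salary'].fillna(df['Salary'].mean())

# col1 = ['Age', 'Salary']

# ------------------- L A B E L   E N C O D I N G -------------------

encode = LabelEncoder()

# ------------------- S A M P L I N G -------------------

# With random_state=None the split is different on every run
X_train2, X_test2, y_train2, y_test2 = train_test_split(features,
                                                        labels,
                                                        test_size=.25,
                                                        random_state=None)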
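
The comparison walked through in the video (same random_state, changed random_state, random_state=None) can be reproduced with a minimal sketch. The `features` and `labels` arrays below are synthetic stand-ins, not the dataset used in the video, and the helper name `split` is made up for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for the video's `features` and `labels` (assumption).
features = np.arange(40).reshape(20, 2)
labels = np.arange(20) % 2


def split(seed):
    """Return only X_train for a 75/25 split with the given random_state."""
    X_train, _, _, _ = train_test_split(
        features, labels, test_size=0.25, random_state=seed
    )
    return X_train


# Same seed twice: the same rows end up in X_train.
same_seed_matches = np.array_equal(split(0), split(0))

# Different seeds: the rows in X_train differ.
different_seed_matches = np.array_equal(split(0), split(1))

print("random_state=0 twice -> identical X_train:", same_seed_matches)
print("random_state=0 vs 1  -> identical X_train:", different_seed_matches)

# random_state=None draws a fresh seed on each call, so two runs
# will usually (not always) produce different training rows.
print("random_state=None twice -> identical X_train:",
      np.array_equal(split(None), split(None)))
```

The particular integer carries no meaning of its own; it only selects which shuffle is used, so any fixed value makes the split repeatable.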

All Playlists of This YouTube Channel
=====================================

1. Data Preprocessing in Machine Learning

2. Confusion Matrix in Machine Learning, ML, AI

3. Anaconda, Python Installation, Spyder, Jupyter Notebook, PyCharm, Graphviz

4. Cross Validation, Sampling, train test split in Machine Learning

5. Drop and Delete Operations in Python Pandas

6. Matrices and Vectors with python

7. Detect Outliers in Machine Learning

8. Time Series preprocessing in Machine Learning

9. Handling Missing Values in Machine Learning

10. Dummy Encoding in Machine Learning

11. Data Visualisation with Python, Seaborn, Matplotlib

12. Feature Scaling in Machine Learning

13. Python 3 basics for Beginner

14. Statistics with Python

15. Sklearn Scikit Learn Machine Learning

16. Python Pandas Dataframe Operations

17. Linear Regression, Supervised Machine Learning

18. Interview Questions on Machine Learning and Data Science

19. Jupyter Notebook Operations

Comments

Thanks for uploading. Can you please upload algorithm-wise examples, e.g. Decision Tree, KNN? (Your explanation is very good.)

venkataraokallagunta

Hi All,
Please note the following:
"from sklearn.cross_validation import train_test_split" is OBSOLETE now.

Please use the following to import train_test_split
from sklearn.model_selection import train_test_split

technologyCult

Sorry if I get it wrong, but you don't need to use numpy or remove the labels from the columns to use train_test_split?
I am doing the same thing: opening a dataframe with pandas and splitting it into x and y, with y = df['column_I_need'] (I don't need to preprocess my dataset because it has only numeric data, no NaN or strings).
As far as I can see in your video, I am doing the same thing as you, and my results are pretty good, but I am still not sure about this method because most people use Numpy to generate x_train, x_test, y_train and y_test.

joswrezende
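
As a rough check of the approach described in the comment above, the sketch below passes a pandas DataFrame and Series straight to train_test_split without converting to NumPy first. The column names and values are hypothetical; only `column_I_need` is taken from the comment.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical all-numeric dataset; feature column names are illustrative only.
df = pd.DataFrame({
    "feature_a": range(8),
    "feature_b": [v * 10 for v in range(8)],
    "column_I_need": [0, 1, 0, 1, 0, 1, 0, 1],
})

X = df.drop(columns="column_I_need")  # every column except the target
y = df["column_I_need"]               # the target column as a Series

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# The pandas types and row labels are preserved by the split.
print(type(X_train).__name__, X_train.shape)
print(type(y_train).__name__, y_train.shape)
print("row labels still aligned:", X_train.index.equals(y_train.index))
```

Because the row index survives the split, the training features and labels stay aligned without any manual bookkeeping.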

Suppose that in a model, with random_state=19, I am getting greater accuracy. Should I stick to that random state, i.e. should I deploy the model with that random_state? Or should my model perform well with all other random_state values?

antonyjoy
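
One way to explore the question above is to measure how much test accuracy moves when only the split seed changes. The sketch below is an illustration under assumptions: it uses scikit-learn's bundled iris dataset and a logistic regression model, neither of which appears in the video.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

# Stand-in dataset and model (assumptions, not taken from the video).
X, y = load_iris(return_X_y=True)

scores = []
for seed in range(10):
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    model = LogisticRegression(max_iter=1000)
    model.fit(X_train, y_train)
    scores.append(model.score(X_test, y_test))

print("per-seed accuracy:", np.round(scores, 3))
print("mean:", np.mean(scores), "std:", np.std(scores))

# Cross-validation averages over several splits instead of relying on one.
cv_scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print("5-fold CV accuracy:", cv_scores.mean())
```

If accuracy swings noticeably from seed to seed, a single high score most likely reflects a favourable split rather than a better model, so the spread or a cross-validated average is usually the safer number to report.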

What is the difference between random_state = 1 and random_state = 12 (or any other number)?

piyushjain

I'm building a decision tree model in Python, and I set random_state to a fixed number. But every time I run the code (random_state is fixed), I get a different version of the model. Why is that?

edgarpanganiban
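
A possible explanation for the behaviour described above, assuming scikit-learn's DecisionTreeClassifier is the model in question: the estimator has its own random_state parameter, separate from the one passed to train_test_split, and the tree can vary if only one of the two is fixed. The sketch below fixes both and compares two fits; the iris dataset and the helper name `fit_tree` are stand-ins.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Stand-in dataset (assumption, not taken from the video).
X, y = load_iris(return_X_y=True)

# Seed 1 of 2: fixes which rows go into the training set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)


def fit_tree(seed):
    """Fit a tree whose internal randomness is controlled by `seed`."""
    # Seed 2 of 2: fixes the order in which the tree considers features.
    tree = DecisionTreeClassifier(random_state=seed)
    return tree.fit(X_train, y_train)


first = fit_tree(0)
second = fit_tree(0)

# Compare the split feature and threshold stored at every node.
same_structure = (
    np.array_equal(first.tree_.feature, second.tree_.feature)
    and np.array_equal(first.tree_.threshold, second.tree_.threshold)
)
print("identical trees with both seeds fixed:", same_structure)
```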

What is the use of random_state if we are going to shuffle the data with the shuffle=True parameter?

nishadt
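
The relationship between the two parameters in the question above can be sketched as follows: shuffle decides whether rows are reordered before splitting, and random_state seeds that reordering so it can be repeated. The toy array is made up for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

data = np.arange(8)  # toy data, illustrative only

# shuffle=True is the default; random_state seeds that shuffle,
# so the same seed reproduces the same shuffled split.
shuffled_a, _ = train_test_split(data, test_size=0.25, random_state=0)
shuffled_b, _ = train_test_split(data, test_size=0.25, random_state=0)
print("same seed, same shuffle:", np.array_equal(shuffled_a, shuffled_b))

# shuffle=False keeps the original row order, so no seed is involved:
# the first 75% become the training set and the last 25% the test set.
ordered_train, ordered_test = train_test_split(
    data, test_size=0.25, shuffle=False
)
print(ordered_train, ordered_test)  # [0 1 2 3 4 5] [6 7]
```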

Hi, how do I determine the random state for a dataset?

madhurjyadeka

So basically random_state must be a fixed integer, not None. Am I understanding it right?

akanshmishra
