Splitting Training and Test Data for Machine Learning Using Python and Scikit Learn tutorial

Welcome to the video series Introduction to Machine Learning with Scikit Learn and Python. This is Chapter 7, and in this chapter we will talk about how to judge the performance of our machine learning algorithm.

This video series is a Scikit Learn tutorial. In this series I talk about using Scikit Learn for our machine learning implementations.

Machine learning algorithm selection faces a catch-22: you have data to train on, but you need unseen (new) data to test the algorithm, and that data is available only in production.

To get around this and understand the performance of the selected machine learning algorithm, we need to generate a test dataset from the available dataset.

We can do this by splitting the available dataset into a training dataset and a testing dataset. Scikit Learn provides a utility function called train_test_split that helps us achieve this.

This video explains the usage of the train_test_split function and how we can generate training and testing datasets.
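
As a rough illustration of the split described above, here is a minimal sketch. The iris dataset, the LogisticRegression classifier, the 80/20 ratio and the random_state value are all assumptions chosen for the example; they are not taken from the video.

```python
# Illustrative sketch only: the dataset, model and parameter values below are
# assumptions for demonstration, not the ones used in the video.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load a small built-in dataset: 150 samples, 4 features each.
X, y = load_iris(return_X_y=True)

# Hold out 20% of the rows as the test set.
# random_state makes the shuffle repeatable; stratify=y keeps the class
# proportions the same in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Fit on the training part only.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Score on rows the model has never seen during training.
test_accuracy = model.score(X_test, y_test)

print(X_train.shape, X_test.shape)  # (120, 4) (30, 4)
print(f"Accuracy on held-out data: {test_accuracy:.2f}")
```

The accuracy printed at the end is measured only on the held-out rows, which is what makes it a stand-in for the unseen production data mentioned earlier.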

#python #Machinelearning #scikitlearn #ArtificialIntelligence #softwaredevelopment #programming #pandas #datascience #dataanalytics

Hi, I am Deepak k Gupta (nickname Daksh, which I prefer). This channel is for budding as well as experienced software developers who want to explore the awesome world of programming.

Here is a brief list of things you can find on my YouTube channel:

1. C++ programming (latest specifications, C++17 and C++20): create high-performance system applications.
2. Create microservices designed for multiple CPU cores using my golang tutorial.
3. Create web applications as well as backend applications using my JavaScript and Node.js tutorials.
4. Create cross-platform mobile apps using my Flutter tutorial.
5. Learn Python programming, a language in demand, and learn effective ways of doing data science and machine learning. My Python tutorials include, but are not limited to, supervised and unsupervised learning, logistic regression and gradient descent. You will also be able to create neural networks using my PyTorch tutorial.
6. Learn source control with my git tutorial; git is one of the most widely used distributed version control systems. Learn how to create a branch using git branch, merge changes using git merge, check out a branch using git checkout and commit your changes using git commit.
7. Learn about persistent NoSQL databases like MongoDB using my MongoDB tutorial, as well as in-memory NoSQL databases like Redis using my Redis tutorial. You'll also learn about using Redis with Node.js.
8. Understand the concepts of handling large data using my big data tutorial and processing engines like Apache Spark.
9. Learn about graph theory and graph databases, and how to make use of graph databases like Neo4j.
Comments

Wow this is a great video! Thank you so much! I like the step-by-step explanation of the parameters of the function too!

tymothylim

x_train is the training data and y_train is the labels for the training data; similarly, x_test is the test data and y_test is the labels for the test data.

santoshkumarthapa

Thanks a lot for the clear explanation.

dhritisundar

Why should I split the data for training and testing? Shouldn't the test be run on the data the model was already trained on, without splitting it into two sets, since both sets come from the same source anyway? Am I right?

mohammedkareem

How can I split the train and test datasets for speech signals (in another language)?

raimaadhikary

Hello, thank you for the video. It was very clear and easy to understand!
I need a little help: I've created a recommender algorithm and I don't know how to evaluate it. I've seen many people evaluate their models using predefined recommender algorithms from libraries, but not with their own algorithms. I'd appreciate it if you could help. Thank you!

Sprakintov

How do I split my own image dataset into x_train and y_train?

PadminiMansingh

After I train a model, how do I test it with a different dataset? I have two datasets, one for training and the other for testing.

hasnain-khan

Hello sir, can you please tell me how to evaluate final data on test dataset?

puspitachatterjee

Sir, how do I split the data in a loan prediction dataset?
What parameters can we use?
In the train dataset the columns are: Loan Id, Gender, Married, Dependents, Education, Self Employed, ApplicantIncome, CoapplicantIncome, LoanAmount, LoanAmountTerm, Credit History, Property Area, Loan Status

X_train, X_test, y_train, y_test = train_test_split()

Plz help me

shamli

Sir, what if we have image data?
How can we split it?

nooribrahim