One-Hot, Label, Target and K-Fold Target Encoding, Clearly Explained!!!

In theory, discrete variables, or features, are easy to use with machine learning algorithms. However, in practice, it's not always so easy and we often have to transform discrete values, like favorite colors, into numbers. There are lots of ways to do this, and this video walks you through 3 of the most popular methods.
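
To make the idea of turning favorite colors into numbers concrete, here is a minimal sketch of label encoding and one-hot encoding in plain Python. The color list is hypothetical toy data, not taken from the video, and the integer assigned to each color is arbitrary (here, alphabetical order).

```python
# Hypothetical toy data: one favorite color per person.
colors = ["blue", "red", "green", "blue", "green"]

# Label encoding: map each distinct category to an arbitrary integer.
categories = sorted(set(colors))  # ['blue', 'green', 'red']
label_map = {cat: i for i, cat in enumerate(categories)}
label_encoded = [label_map[c] for c in colors]

# One-hot encoding: one 0/1 column per category, in the order of `categories`.
one_hot_encoded = [[1 if c == cat else 0 for cat in categories] for c in colors]

print(label_encoded)    # [0, 2, 1, 0, 1]
print(one_hot_encoded)  # first row is [1, 0, 0] for "blue"
```

Label encoding gives the colors an order they do not really have, which is one reason one-hot encoding is often preferred when there are only a few categories.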

English

Spanish

Portuguese

If you'd like to support StatQuest, please consider...
...or...

...buying my book, a study guide, a t-shirt or hoodie, or a song from the StatQuest store...

...or just donating to StatQuest!

Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on twitter:

0:00 Awesome song and introduction
1:24 One-Hot Encoding
3:25 Label Encoding
4:39 Target Encoding
6:27 Target Encoding with a Weighted Mean, or Bayesian Target Encoding
9:56 K-Fold Target Encoding
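
As a rough illustration of the weighted-mean chapter, the sketch below blends each category's mean target with the overall mean. It assumes the commonly used smoothing formula (n * category_mean + m * overall_mean) / (n + m). The function name, the weight m, and the toy data are all hypothetical and are not taken from the video.

```python
from collections import defaultdict

def weighted_target_encoding(categories, targets, m=2.0):
    """Return {category: smoothed mean of the target}.

    m is a user-chosen weight (an assumption here): larger values pull
    categories with few rows more strongly toward the overall mean.
    """
    overall_mean = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, target in zip(categories, targets):
        sums[cat] += target
        counts[cat] += 1
    # (n * category_mean + m * overall_mean) / (n + m),
    # where n * category_mean is simply the sum of targets for the category.
    return {
        cat: (sums[cat] + m * overall_mean) / (counts[cat] + m)
        for cat in counts
    }

# Hypothetical toy data: favorite color and a 0/1 target.
colors = ["blue", "blue", "blue", "red", "green", "green"]
targets = [1, 0, 1, 1, 0, 0]
encoding = weighted_target_encoding(colors, targets, m=2.0)
print(encoding)
```

With m = 0 this reduces to plain target encoding (the raw per-category mean), so a category seen only once is encoded entirely by its single target value.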

#StatQuest #DubbedWithAloud

Comments

It's mind-boggling how much better Josh is at explaining complicated topics than anyone else.

matthewmechtly

This is literally the best explanation of statistics and traditional ML models. I am so lucky to have found your videos just as my data science journey started.

sungminson

Thank you so much for your videos, this is by far the best educational Machine Learning channel I’ve ever come across

bridgetelly

Much better explanation than what I had at class!

Li-dvlr

Hi Josh, I wanted to thank you for your content. I'm finishing your stats playlist and it's very good. Statsquatch has become my friend. Big hug straight from Brazil!

LuizHenrique-qrlt

I really loved your explanation and your sense of humor. I really did!

moazhendy

Hi Josh. Just came across your channel. Your method of explaining is so concise, clear and appealing. Definitely I would learn a lot from this channel.

myfoodfeast

Thank you, your explanations are always simple and clear.

amirrezaabedini

Totally great explanation, congratulations

MilenkoCurcin

Love the dry humour in your videos 🤣. Great content too!

marcom

Hey Josh, great job. Thank you a lot!

hasandaaboul

How do you use k-fold target encoding for a test data set, since blue now has several distinct numeric values as a predictor in the training set?

monkeystoot

But what happens at inference? Say you trained a great model and now you are predicting on new data: do you use the mean of the old data or the mean of the new data? If you use target encoding, the new data has no target, so what now?

SnipeSniperNEW

Great video. 👍🏽
I find it less confusing, however, to say categorical or qualitative data instead of discrete data, since numeric data can also be discrete (integers).

lbognini

Hey Josh,

Love the videos. I'm left with one question: is there anything we can do when we are doing multiclass classification and need to transform our predicted variable so that the algorithm isn't working with string data?

bkleinman

Great work, as always! What should we do when target encoding results in the same number for two labels?

lutzsommer

Hi Josh, a heartfelt thank you for sharing these encoding techniques.

I have one question; it may look stupid, but I just want to clarify it with you.
At 13:40, the encoding of the green colour with target value 1 is 0.42, and below that, the green colour with target value 1 is 0.67.
So when the encoding transforms new data, will the system change the green colour to 0.42 or 0.67?

lakshmanbharath

Hi Josh, thanks for the explanation. But I want to know how to transform unseen data using k-fold target encoding. Is it okay to use the mean value of the transformed category? Thanks in advance.

dvergnordicalfar

Thanks for a great video! I am trying to apply k-fold target encoding to my train and test data. I target encoded my train data using k-fold target encoding just like in the video, but how should I encode my test data? If the feature is BLUE, should I take the mean of BLUE (target encoded) in the train data and use it for the test data? Or should I just use the whole train data to compute new target encoding values for the test data?

speedtent
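
Several comments above ask how to encode test data after k-fold target encoding. One common convention (an assumption here, not something the video description states) is to use k-fold encoding only on the training rows, and to encode new rows with a single mapping fit on the whole training set. New rows have no target, so they cannot leak one. The sketch below follows that convention. All function names and the toy data are hypothetical, and folds are assigned by row index modulo k for simplicity rather than by shuffling.

```python
from collections import defaultdict

def fit_encoding(categories, targets, m=2.0):
    """Fit a weighted-mean target encoding; return (mapping, overall_mean)."""
    overall = sum(targets) / len(targets)
    sums, counts = defaultdict(float), defaultdict(int)
    for cat, target in zip(categories, targets):
        sums[cat] += target
        counts[cat] += 1
    mapping = {c: (sums[c] + m * overall) / (counts[c] + m) for c in counts}
    return mapping, overall

def kfold_target_encode(categories, targets, k=2, m=2.0):
    """Encode each training row using only rows from the other folds.

    Assumes k >= 2 and at least k rows, so the other folds are never empty.
    """
    n = len(categories)
    encoded = [None] * n
    for fold in range(k):
        rest = [i for i in range(n) if i % k != fold]
        mapping, overall = fit_encoding(
            [categories[i] for i in rest], [targets[i] for i in rest], m
        )
        for i in range(fold, n, k):  # rows held out in this fold
            # Categories unseen in the other folds fall back to their mean.
            encoded[i] = mapping.get(categories[i], overall)
    return encoded

def encode_new_data(train_categories, train_targets, new_categories, m=2.0):
    """Encode test rows with one mapping fit on the whole training set."""
    mapping, overall = fit_encoding(train_categories, train_targets, m)
    return [mapping.get(c, overall) for c in new_categories]

# Hypothetical toy data.
train_colors = ["blue", "blue", "red", "red"]
train_targets = [1, 0, 1, 1]
train_encoded = kfold_target_encode(train_colors, train_targets, k=2, m=2.0)
test_encoded = encode_new_data(train_colors, train_targets, ["blue", "green"])
print(train_encoded, test_encoded)
```

Under this convention a color can have several different values in the training rows (one per fold) but exactly one value in the test rows.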

Hi Josh! Great video. Are you planning to add to these videos how to apply them in Python?
Thanks!

lautarocisterna
