Encode categorical features using OneHotEncoder or OrdinalEncoder

Two common ways to encode categorical features:
- OneHotEncoder for unordered (nominal) data
- OrdinalEncoder for ordered (ordinal) data

P.S. LabelEncoder is for labels, not features!
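As a minimal sketch of the two encoders described above: the toy DataFrame and its column names ('Shape', 'Class') are illustrative assumptions, not necessarily the data used in the video.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# Hypothetical toy data: 'Shape' is nominal (no order), 'Class' is ordinal
X = pd.DataFrame({
    'Shape': ['square', 'square', 'oval', 'circle'],
    'Class': ['third', 'first', 'second', 'third'],
})

# One column per category; categories are sorted alphabetically by default
ohe = OneHotEncoder()
shape_encoded = ohe.fit_transform(X[['Shape']]).toarray()

# A single integer column; the order is stated explicitly via 'categories'
oe = OrdinalEncoder(categories=[['first', 'second', 'third']])
class_encoded = oe.fit_transform(X[['Class']])

print(ohe.categories_)
print(shape_encoded)
print(class_encoded)
```

Passing the category order explicitly matters for OrdinalEncoder: without it, the categories are sorted alphabetically, which only matches the intended ranking by coincidence.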

👉 New tips every TUESDAY and THURSDAY! 👈

Comments
Author

Thanks for watching! Let me know if you have any questions about OneHotEncoder, OrdinalEncoder, or LabelEncoder! 👇

dataschool
Author

Thanks, you do a great job explaining those crucial concepts for the ML flow and being concise in this playlist. Great work.

-hedredo
Author

What great content, thanks for sharing it.
As always something very good.
I'm your fan.

DevMadeEasy
Author

Short and precise! Thank you so much.

nesrinehadjamar
Author

How would you encode data that has an ordering but is cyclical? An example might be a "Time of Day" feature with entries such as 'morning', 'noon', 'evening', 'night'. There is an ordering to this, but it's cyclical, so there is no lowest or highest value.

Thanks! I very much enjoy your explanations and videos.

mathgeek
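One possible approach to the cyclical question above, which is not covered in the video description, is to map each category to an angle on a circle and use its sine and cosine as two features. This is only a sketch: the `order` list and the column names `time_sin` and `time_cos` are assumptions for illustration, and it treats the categories as evenly spaced.

```python
import numpy as np
import pandas as pd

# Assumed cyclical order of the categories (evenly spaced around a circle)
order = ['morning', 'noon', 'evening', 'night']

times = pd.Series(['morning', 'noon', 'evening', 'night', 'morning'])

# Position of each value in the cycle: 0, 1, 2, 3
position = times.map({name: i for i, name in enumerate(order)})

# Convert the position to an angle, then to a point on the unit circle
angle = 2 * np.pi * position / len(order)
encoded = pd.DataFrame({
    'time_sin': np.sin(angle),
    'time_cos': np.cos(angle),
})

print(encoded.round(3))
```

With this representation, 'night' is as close to 'morning' as 'morning' is to 'noon', so there is no artificial lowest or highest value.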
Author

Thanks for these educational lectures. I have a question: if I have a pipeline with both OneHotEncoder and OrdinalEncoder, using the same data you used in the example, how will the pipeline know which columns to one-hot encode and which to ordinal encode?

Thanks

abdulraheemabdul
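One common way to handle the question above in scikit-learn is a ColumnTransformer, which applies each encoder only to the columns listed for it. The sketch below assumes a toy DataFrame with a nominal 'Shape' column and an ordinal 'Class' column; the data is hypothetical.

```python
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# Hypothetical toy data
X = pd.DataFrame({
    'Shape': ['square', 'square', 'oval', 'circle'],
    'Class': ['third', 'first', 'second', 'third'],
})

# Each tuple pairs an encoder with the columns it should be applied to
ct = make_column_transformer(
    (OneHotEncoder(), ['Shape']),
    (OrdinalEncoder(categories=[['first', 'second', 'third']]), ['Class']),
    sparse_threshold=0,  # always return a dense array
)

encoded = ct.fit_transform(X)
print(encoded)
```

The resulting transformer can be used as the first step of a Pipeline, followed by a model. Columns not listed are dropped by default; passing remainder='passthrough' keeps them unchanged.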
Author

Phenomenal presentation as always <3

apostolosmavropoulos
Author

I'm just wondering: when do we encode using dummy variables? I mean, when is it necessary?

nesrinehadjamar
Author

Amazing video, subscribed! Just to clarify on the plane problem: I'd order the categories as [third, second, first] to make first class the highest rank during training, right? Thanks!

TheWisePhotographer
Author

Thanks for the explanation, sir. However, I have several questions. Does LabelEncoder also work for nominal (unordered) data? How do we find the mode of nominal data so that it shows the pre-encoded values (not the numbers)? How do we find the mode and median (Q1, Q2, Q3) of ordinal data so that it shows the pre-encoded values (not the numbers)? Thanks.

-arielnicholascaryndrasd
Author

Great video! I have a dataset with categorical and numerical values, and my question is: should I encode just the categorical values and then merge the result with the dataset, and then drop the unencoded values?

jorgesisco
Author

What would be a good way to encode, when there are lots of ordinal features with multiple labels? Is manually defining the categories the only way?

sasidharansathiyamoorthy
Author

What if I had a column such as a city name and I wanted to give each city a unique ID? Assume the model I am using is something like a decision tree or random forest. Here I cannot use OrdinalEncoder, as there is no specific order, and using OneHotEncoder might generate a lot of columns if there are many unique city names. Can I use LabelEncoder on that column then?

MayurGarg
Author

Hi, I want to know: for an ordinal column, after the data is split, should I just use transform on the validation data (like test data), or should both the train and validation data use fit_transform?

getchethanbr
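The usual practice for the question above is to fit the encoder on the training data only and then reuse it, with transform, on validation and test data. A minimal sketch, assuming hypothetical `X_train` and `X_valid` DataFrames with a single ordinal 'Class' column:

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical split of an ordinal column
X_train = pd.DataFrame({'Class': ['third', 'first', 'second']})
X_valid = pd.DataFrame({'Class': ['second', 'first']})

oe = OrdinalEncoder(categories=[['first', 'second', 'third']])

# Learn the mapping from the training data and encode it
train_encoded = oe.fit_transform(X_train)

# Reuse the same mapping on the validation data (no refitting)
valid_encoded = oe.transform(X_valid)

print(train_encoded.ravel())
print(valid_encoded.ravel())
```

Refitting on the validation data could produce a different mapping than the one the model was trained on, so the same category might end up with a different number.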
Author

Hi, so I'm trying to work with data that has two entries: Column X: X1, X2, X3, X4, X1 and column Y: Y3, Y4, Y2, Y1, Y1. Is there a way to get a matrix of shape rows = X1, .., X4 and columns = Y1, .., Y4 (all unique) with 1s where a relationship exists and zero otherwise? For example the first row would be X1: Y1(1), Y2(0), Y3(1), Y4(0), similar for X2, X3 and X4 rows? Thanks

krishln
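For the matrix described in the question above, one option outside of scikit-learn's encoders is pandas' crosstab, which counts co-occurrences of the two columns. The sketch below uses the values from the comment; the column names 'X' and 'Y' are assumptions.

```python
import pandas as pd

# Values taken from the comment above
df = pd.DataFrame({
    'X': ['X1', 'X2', 'X3', 'X4', 'X1'],
    'Y': ['Y3', 'Y4', 'Y2', 'Y1', 'Y1'],
})

# Count how often each (X, Y) pair occurs, then reduce the counts to 0/1
matrix = (pd.crosstab(df['X'], df['Y']) > 0).astype(int)

print(matrix)
```

The X1 row comes out as Y1=1, Y2=0, Y3=1, Y4=0, matching the example in the comment.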
Author

I do not understand why OneHotEncoder.fit_transform() took X[['Shape']] as argument rather than X['Shape'].

EDIT: Now I think I understand. Passing a list of columns when subsetting a DataFrame will produce a new DataFrame (rather than a Series) even if the list has length 1.

filosofiadetalhista
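The difference described in the edit above can be checked directly. In this sketch the DataFrame contents are hypothetical; the point is only the shape of each selection. scikit-learn encoders expect 2D input (samples by features), which is why the one-column DataFrame is the form that works.

```python
import pandas as pd

# Hypothetical data with a single 'Shape' column
X = pd.DataFrame({'Shape': ['square', 'square', 'oval', 'circle']})

as_series = X['Shape']    # single label: 1D Series
as_frame = X[['Shape']]   # list of labels: 2D DataFrame with one column

print(type(as_series).__name__, as_series.shape)  # Series (4,)
print(type(as_frame).__name__, as_frame.shape)    # DataFrame (4, 1)
```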
Author

Hi, I have a question: do you have to first convert the column to 'categorical' prior to encoding? In some examples I see the column being converted, and in others I do not. I performed a test and the results were the same whether I converted it or not. Could you please shed some light?

floyddsouza
Author

Can you specify the differences between labels and features?

VarunKumar-pzsi
Author

If we use OrdinalEncoder, would the algorithm consider the relationship between predictor and response to be additive?

mikhaeldito