Should You Scale Your Data ??? : Data Science Concepts

When should you scale your data ???

Comments

All your videos which I've seen are super clear, thanks.
I'd add that scaling will affect the results of linear regression if we use regularization. If we have two features x1 and x2 which are equally important and both measured in centimetres, then the weights w1 and w2 are expected to be equal as well. If x2 is instead measured in metres, its values become 100 times smaller, so the weight w2 would have to be 100 times higher. With regularization, that large weight is penalized, so we may obtain a different result.
One more case where scaling is useful is when we use gradient descent for optimization: scaling the data typically helps gradient descent find the optimum faster.
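
The regularization point can be checked with a small sketch. It assumes ridge (L2) regression with no intercept, solved in closed form for two features; the data values and the helper name `ridge_2d` are made up for illustration and are not from the video.

```python
def ridge_2d(x1, x2, y, lam):
    """Closed-form ridge regression for two features, no intercept.

    Solves (X^T X + lam * I) w = X^T y for the 2x2 case.
    """
    a = sum(v * v for v in x1) + lam
    b = sum(u * v for u, v in zip(x1, x2))
    d = sum(v * v for v in x2) + lam
    r1 = sum(u * v for u, v in zip(x1, y))
    r2 = sum(u * v for u, v in zip(x2, y))
    det = a * d - b * b
    w1 = (d * r1 - b * r2) / det
    w2 = (a * r2 - b * r1) / det
    return w1, w2


# Hypothetical data: both features in centimetres, equally important.
x1_cm = [1.0, 2.0, 3.0, 4.0]
x2_cm = [2.0, 1.0, 4.0, 3.0]
y = [u + v for u, v in zip(x1_cm, x2_cm)]

# Same second feature, expressed in metres instead.
x2_m = [v / 100.0 for v in x2_cm]

# Without regularization the unit change is absorbed by the weight.
plain_cm = ridge_2d(x1_cm, x2_cm, y, lam=0.0)
plain_m = ridge_2d(x1_cm, x2_m, y, lam=0.0)

# With regularization the large weight on the metre feature is penalized.
ridge_cm = ridge_2d(x1_cm, x2_cm, y, lam=1.0)
ridge_m = ridge_2d(x1_cm, x2_m, y, lam=1.0)

print("no penalty, cm:", plain_cm)
print("no penalty, m: ", plain_m)
print("ridge, cm:     ", ridge_cm)
print("ridge, m:      ", ridge_m)
```

In this toy setup the unpenalized weight on the metre feature is 100 times the centimetre one, while with the penalty it stays far below that, so the two ridge fits give different predictions.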

greenbean

For a value with a fixed interval, such as RGB color or pH, we normalize.
For an unbounded value, such as temperature or a dimension, we standardize.
One reason why we scale data: nearly all regularizers are sensitive to feature scale.
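
A minimal sketch of the two options mentioned above, assuming "normalize" means min-max scaling to [0, 1] with known bounds and "standardize" means the z-score. The function names and sample values are illustrative only.

```python
from statistics import mean, pstdev


def min_max(values, lo, hi):
    """Normalize to [0, 1] using the known bounds lo and hi."""
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Z-score: subtract the mean, divide by the population std deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


# Bounded feature: an 8-bit red channel always lies in [0, 255].
red_channel = [0, 64, 128, 255]
red_scaled = min_max(red_channel, 0, 255)

# Unbounded feature: temperatures in Celsius (hypothetical readings).
temps_c = [-5.0, 12.0, 18.5, 31.0]
temps_z = standardize(temps_c)

print(red_scaled)
print(temps_z)
```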

ccuuttww

Thank you for such a clear explanation of the concepts. Love your teaching skills.

sachinrathi

How would you recommend handling categorical variables: OHE (one-hot encoding) or impact encoding? Would scaling make sense for them, say in a NN, where scaling speeds things up?

Nishant

I have a doubt. Please help me understand this: if my features are not all in the same unit, say one input variable is daily step count, another is daily average touch events, and so on, does scaling make sense?
Or another example: say I have two input variables, one is distance in km and the other is weight in grams. Would scaling be the right approach? I don't think so.

alihindustani

Please use YAML for ML and do a video.

JainmiahSk

Uniformly scaling data for KNN is not always (or even often) correct. It is often wrong to assume that every feature is equally important.
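
To illustrate, here is a rough sketch of a nearest-neighbour lookup with per-feature weights applied to already standardized features. The weights, points, and function names are hypothetical; how to choose the weights is a separate question that this sketch does not address.

```python
import math


def weighted_distance(a, b, weights):
    """Euclidean distance where each squared difference is multiplied by a weight."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def nearest(query, points, weights):
    """Index of the point closest to query under the weighted distance."""
    return min(
        range(len(points)),
        key=lambda i: weighted_distance(query, points[i], weights),
    )


# Two candidate neighbours with already standardized features (made-up values).
points = [(0.0, 1.0), (0.8, 0.0)]
query = (0.0, 0.0)

# Equal weights: every feature counts the same after scaling.
equal_choice = nearest(query, points, (1.0, 1.0))

# Down-weighting the second feature changes which neighbour is closest.
weighted_choice = nearest(query, points, (1.0, 0.1))

print(equal_choice, weighted_choice)
```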

kdhlkjhdlk

By shifting (xi - μ) and scaling ((xi - μ)/σ) a dataset, which is the z-score, one loses the information about the raw numeric value of each element of the dataset; in return, one gains the ability to obtain descriptive characteristics of the dataset for inference.
Is this true?
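
One way to probe this question is a small round-trip sketch. It assumes the mean and standard deviation used for the z-score are stored alongside the transformed values; the sample data is made up.

```python
from statistics import mean, pstdev

values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mu = mean(values)       # location of the data
sigma = pstdev(values)  # spread of the data (population std deviation)

# Forward transform: z-score of each element.
z = [(v - mu) / sigma for v in values]

# Inverse transform: only possible because mu and sigma were kept.
restored = [zi * sigma + mu for zi in z]

print(z)
print(restored)
```

Under that assumption the raw values can be restored, so they are lost only if μ and σ are discarded; the z-scores alone tell you how many standard deviations each element sits from the mean.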

sukursukur