Python - machine learning #2 - test-train split - md5 hash - coding for all

Splitting a Python pandas DataFrame into test and train sets for machine learning using an MD5 hash, so that the split stays consistent when the data set is updated at a later time.

Ref: Hands-On Machine Learning with Scikit-Learn and TensorFlow by Aurélien Géron
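
The idea in the description can be sketched roughly as follows. This is a minimal illustration, not the code from the video or the book: the function names `is_in_test_set` and `split_ids`, the 8-byte encoding of the id, and the example id ranges are all assumptions made here. It hashes each row's id with MD5 and puts the row in the test set when the last byte of the digest falls below `256 * test_ratio`.

```python
import hashlib

def is_in_test_set(identifier, test_ratio):
    """Return True if this id belongs in the test set (hypothetical helper)."""
    # Encode the integer id as 8 bytes so it can be hashed (an assumed encoding).
    id_bytes = int(identifier).to_bytes(8, "little", signed=True)
    # The last byte of the MD5 digest is an integer from 0 to 255.
    last_byte = hashlib.md5(id_bytes).digest()[-1]
    # Ids whose last byte is below the threshold go to the test set.
    return last_byte < 256 * test_ratio

def split_ids(ids, test_ratio):
    """Split an iterable of ids into (train_ids, test_ids)."""
    train_ids, test_ids = [], []
    for identifier in ids:
        if is_in_test_set(identifier, test_ratio):
            test_ids.append(identifier)
        else:
            train_ids.append(identifier)
    return train_ids, test_ids

# First version of the data set: ids 0..999 (made-up example data).
train_v1, test_v1 = split_ids(range(1000), test_ratio=0.2)
# Updated data set: the same rows plus 500 new ones.
train_v2, test_v2 = split_ids(range(1500), test_ratio=0.2)

# Every id that was in the test set before the update is still in it afterwards.
print(set(test_v1) <= set(test_v2))  # True
```

With a pandas DataFrame, one way to use this (assuming a hypothetical integer column named `id`) would be `mask = df["id"].apply(lambda i: is_in_test_set(i, 0.2))`, followed by `df.loc[~mask]` for the train set and `df.loc[mask]` for the test set.
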
Comments

Thanks for the detailed explanation of the O'Reilly book... I was scratching my head.

vishnusaitejanagabandi

Hi,
How is the probability 1/256?
For a given index value, say 34, would the hash be different every time I run the hash-digest function?

sakshamarora
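
Both points in the question above can be checked with a short experiment. This is only a sketch: the helper name `last_digest_byte`, the 8-byte id encoding, and the sample size are assumptions made for illustration. It shows that hashing the same id twice gives the same byte, and that over many ids each of the 256 possible byte values turns up about equally often, which is where a figure of roughly 1/256 per value comes from.

```python
import hashlib
from collections import Counter

def last_digest_byte(identifier):
    """Last byte (0..255) of the MD5 digest of an integer id (hypothetical helper)."""
    id_bytes = int(identifier).to_bytes(8, "little", signed=True)
    return hashlib.md5(id_bytes).digest()[-1]

# MD5 has no seed or randomness, so the same id always gives the same byte.
print(last_digest_byte(34) == last_digest_byte(34))  # True

# Count how often each byte value occurs over 256,000 ids.
counts = Counter(last_digest_byte(i) for i in range(256_000))

# Number of distinct byte values observed (there are only 256 possible).
print(len(counts))
# If the values are spread evenly, both numbers should be close to 1000.
print(min(counts.values()), max(counts.values()))
```
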

Hello brother,
With hashlib's md5, should we use len(df) instead of 256, or is 256 fine?

yatinbansal
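
On the 256 versus len(df) question, a small arithmetic sketch may help. It assumes the split compares the last digest byte against a threshold, as in the book's approach; the helper name `expected_test_fraction` and the row counts are made up for illustration. A byte can only take the values 0 to 255, so 256 is the number of possible byte values and has nothing to do with the number of rows.

```python
test_ratio = 0.2

def expected_test_fraction(threshold):
    """Fraction of the 256 possible byte values that fall below the threshold."""
    return sum(1 for b in range(256) if b < threshold) / 256

# Threshold based on 256: about test_ratio of the byte values qualify.
print(expected_test_fraction(256 * test_ratio))   # 0.203125

# Threshold based on a hypothetical len(df) of 1000: far too many rows qualify.
print(expected_test_fraction(1000 * test_ratio))  # 0.78125

# With a hypothetical len(df) of 2000, every byte value is below the threshold.
print(expected_test_fraction(2000 * test_ratio))  # 1.0
```

Under these assumptions, 256 keeps the test share near the requested ratio for any data set size, while a threshold based on len(df) would push most or all rows into the test set.
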