StatQuest: Random Forests Part 2: Missing data and clustering

NOTE: This StatQuest is the updated version of the original Random Forests Part 2 and includes two minor corrections.

Last time we talked about how to create, use and evaluate random forests. Now it's time to see how they can deal with missing data and how they can be used to cluster samples, even when the data comes from all kinds of crazy sources.

For a complete index of all the StatQuest videos, check out:

If you'd like to support StatQuest, please consider...

Buying The StatQuest Illustrated Guide to Machine Learning!!!

...or...

...a cool StatQuest t-shirt or sweatshirt:

...buying one or two of my songs (or go large and get a whole album!)

...or just donating to StatQuest!

Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on Twitter:

#statquest #randomforest
Comments

Initially I felt the ukulele sound was awkward; now, after 3 days of aggressive learning, that sound gives me so much relief.
Thank you Josh.

syle

Thank you so much for the amazing videos Josh, I don't know what I would have done without those videos in my Data Science journey!

qusaibasem

I love these videos because when you get to a concept I don't fully understand, you follow up with "check out the StatQuest for it". I started with gradient boosting but paused the video and have been detouring for an hour now, covering your pre-req videos, including a couple of pre-reqs to pre-reqs.

An hour in, sipping on a beer, and I can feel myself getting smarter. A huge advantage to the way you do your videos is that I don't have to pace myself to learn only a concept, or part of a concept, a day. I can binge and stay engaged. Great stuff.

williamrinauto

I just loved the video. The way you explained what a proximity matrix is and how we can calculate a distance matrix from it was the best part. None of the websites explained that part. Thanks for making this useful video. You totally nailed it!

shannatheragamuffin..

Thanks a lot for these gems! Have an interview coming up and needed a refresher!

RaviPrakash-dzfm

This idea is amazing!! Never thought of RF being used for clustering.. just amazing!!

rajarajeshwaripremkumar

These StatQuests are building my life! I really would like to advise our college professors to learn from here and then teach in college!! But I can't do that XD, so instead, I advised all the students to learn from here! All of them love it!
Thanks Josh!!

ahanadrall

I just came across your videos and love them! They explain stats in such an intuitive way. They provide a perfect overview that makes it so much easier to digest formulas and code later on. Triple Bam, thumbs up and a biiig thank you!

nicolegroene

I really appreciate all of your videos.. I am surviving this semester with your awesome, kind, amazing videos. :)

Hien

Thank you very much for this video! It was fun to watch and I learnt a lot from the step-by-step process of adding in missing data!

tymothylim

Hello Sir, you made a brave attempt at explaining this topic in a simple manner, but honestly speaking, this has gone over my head. I need to practice a lot before I enter this area.

mohitgu

Laughing and learning? That's how it's supposed to be. Cheers to you, man!

saadci

RFs are simply amazing. They can predict well, are a valuable variable selection tool, and now you are telling me they can produce a similarity matrix too!

murilopalomosebilla

Dear Josh, I just purchased your 'Illustrated Guide to Machine Learning' and I wanted to tell you 2 things:
1 - It really is amazing and the content is explained very well, visually - which is essential. Thank you
2 - I was, though, a tad bit disappointed to not find Random Forests, XGBoost, PCA, LDA, etc. in it. But I get it - it's only a $20 book. However, I wanted to ask if you would/could release another book containing these slightly advanced aspects which you did not cover in the Illustrated Guide? Please let me know. Looking forward to hearing from you.

Gautam

Super great video. So much info in a concise and effective manner. Just FYI, no big deal, at 1:55, is 167.5 the mean as opposed to median?

Raven-bixn

Your video hypes me up! I'll try all those tricks this spring break 👍

adamdeuxieme

Did not know that random forests can help in missing value imputation. Thank you 👍

ksiddarthadshetty

Dude! You are better than my professor who taught me RF three years ago!

szco

Thanks for sharing. You have a real talent for explaining.

davemartin