Writing our own K Nearest Neighbors in Code - Practical Machine Learning Tutorial with Python p.17

In the previous tutorial, we began structuring our K Nearest Neighbors example, and here we're going to finish it. The idea of K nearest neighbors is to just take a "vote" of the k closest known data featuresets. Whichever class gets the most votes among those neighbors is the class we assign to the unknown data.

Comments

"Like" on every video. You deserve it. The content is good and your videos are made with quality!

jeffthom

FYI, although not important on the dataset used here, if this function were for actual production use you could gain quite a bit of efficiency by only tracking the 3 smallest distances, rather than tracking all of them and then sorting. Tracking all of them uses an additional O(N) memory, and is O(N log N) time for the sort. However, if you were to simply track the 3 smallest distances (i.e. once you build up to 3, only replace if smaller than one of the three), this is O(1) additional space, and O(1) per loop, so O(N) time complexity.

gratephulshred
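
The suggestion in the comment above (keep only the k smallest distances instead of storing and sorting all of them) could be sketched roughly as follows. This is a minimal illustration, not the tutorial's code: the function name, the dictionary layout of the data, and the sample points are all assumptions made for the example. It uses a size-k heap of negated distances so the largest of the current k candidates is always at the top.

```python
import heapq
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+


def k_nearest_neighbors(data, predict, k=3):
    """Vote among the k nearest points while storing only k candidates.

    data is assumed to map a group label to a list of feature lists.
    """
    heap = []  # holds (-distance, group); heap[0] is the farthest candidate
    for group, features_list in data.items():
        for features in features_list:
            d = dist(features, predict)
            if len(heap) < k:
                heapq.heappush(heap, (-d, group))
            elif -d > heap[0][0]:
                # Closer than the farthest candidate kept so far: swap it in.
                heapq.heapreplace(heap, (-d, group))
    votes = [group for _, group in heap]
    return Counter(votes).most_common(1)[0][0]


# Made-up sample data for illustration only.
dataset = {'k': [[1, 2], [2, 3], [3, 1]], 'r': [[6, 5], [7, 7], [8, 6]]}
print(k_nearest_neighbors(dataset, [5, 7]))
```

For a fixed k this keeps the extra memory constant and does a bounded amount of work per point, which matches the O(N) time argument in the comment.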

When I heard about machine learning in high school, it sounded scary and difficult, yet here I am learning it after I graduated. It just feels crazy.

edilgin

I love how well you explain things. I'm a very slow learner, but this video and the previous part really helped me a lot with understanding and coding KNN in Python. Thanks :)

derpfaceonigiri

This was a really good video. I feel like I understand the K Nearest Neighbors algorithm after writing it out step by step. Thanks!

timharris

Another efficiency change that you could have made is to not take the square root in the Euclidean distance. It's a relatively expensive function and it's not required at all for comparing distances (since it is a strictly increasing function). In other words, if we let d1 be one distance and d2 be another and d1 > d2, then d1^2 > d2^2. Thus you don't have to take the square root.

ernesto
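
To illustrate the point in the comment above, here is a small sketch that ranks points by squared distance and by true Euclidean distance and checks that the two orderings agree. The helper name `squared_distance` and the sample points are assumptions made for this example, not part of the tutorial.

```python
from math import sqrt


def squared_distance(a, b):
    """Sum of squared coordinate differences (no square root taken)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Made-up sample data for illustration only.
predict = [5, 7]
points = [[1, 2], [2, 3], [3, 1], [6, 5], [7, 7], [8, 6]]

# Rank once without the square root and once with it.
by_squared = sorted(points, key=lambda p: squared_distance(p, predict))
by_true = sorted(points, key=lambda p: sqrt(squared_distance(p, predict)))

print(by_squared == by_true)
```

Because distances are never negative and the square root is strictly increasing on non-negative numbers, the nearest neighbors come out the same either way; only the reported distance values differ.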

Your teaching feels like Andrew Ng's, and the content is great quality.

nagasaikumark

Nice illustrative example. Just for kicks, NumPy actually supports broadcasting and has all the functionality you need to implement KNN. You can do it in one line when your labels are non-negative integers in y_train, your features are X_train, and your new data point is x_topredict, as

- x_topredict)**2,

shaunsawyer
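
The one-line expression in the comment above is cut off, so the following is not a reconstruction of it. It is only a sketch of how a broadcasting-based KNN prediction might look under the same setup. The names `X_train`, `y_train`, and `x_topredict` come from the comment; `k`, `sq_dists`, `nearest`, and the sample values are assumptions made for the example.

```python
import numpy as np

# Made-up sample data for illustration only.
X_train = np.array([[1, 2], [2, 3], [3, 1], [6, 5], [7, 7], [8, 6]])
y_train = np.array([0, 0, 0, 1, 1, 1])  # non-negative integer labels
x_topredict = np.array([5, 7])
k = 3

# Broadcasting subtracts x_topredict from every row of X_train at once.
sq_dists = np.sum((X_train - x_topredict) ** 2, axis=1)

# Indices of the k smallest squared distances.
nearest = np.argsort(sq_dists)[:k]

# bincount tallies the votes per label; argmax picks the most common one.
prediction = np.argmax(np.bincount(y_train[nearest]))
print(prediction)
```

The three steps can be nested into a single expression, at some cost to readability. `np.bincount` is the reason the labels need to be non-negative integers.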

Out of curiosity, will you be covering artificial neural networks at some point? Would love to see that in this format.

akma

Great video! Looking forward to the upcoming ones!

nickr.feller

Well explained... Very useful videos.

KuldeepSingh-jgxz

"Kinda cheating, and the true way to write it" :) I like that sentence a lot; it's so true to understanding it at the core and putting it across correctly. You are simply great :) Thank you for so much beautiful and clear content, Sentdex. I knew nothing about Python and ML, and now I'm confident that I understand it and can code it correctly, all thanks to you :)

varshajhapillai

Realized why you used k & r!
k = black for the plot (like the K in CMYK)
r = red

Tried changing the group name to 'g' and got green.

Adaministrator

Sir, thank you. It is nice work. The videos are very useful.

pycomvision

What if the 3rd and 4th values in the top sorted votes are the same value but belong to different groups? How would it choose between the two by default? And do you need to apply some kind of 'default sort' that it falls back on if this occurs?

LukeBeacon
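
One way to explore the tie question above is to run a small experiment. The sketch below assumes the distances are stored as `[distance, group]` lists and ordered with Python's built-in `sorted()`; the comment does not confirm that this is how the video does it, and the distance values are hypothetical. Python compares lists element by element, so when two distances are equal the comparison falls through to the group label.

```python
from collections import Counter

# Hypothetical distances: the 3rd and 4th closest points tie at 2.0.
distances = [[1.0, 'r'], [1.5, 'k'], [2.0, 'r'], [2.0, 'k'], [3.0, 'r']]
k = 3

# Lists compare lexicographically: distance first, then the group label.
nearest = sorted(distances)[:k]
votes = [group for _, group in nearest]

print(nearest)
print(Counter(votes).most_common(1)[0][0])
```

Under these assumptions the tie is broken alphabetically by label, so `[2.0, 'k']` takes the third slot ahead of `[2.0, 'r']`, and that changes the outcome of the vote. If that implicit behavior is not wanted, an explicit tie-breaking rule (or a different k) would have to be chosen deliberately.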

Hey, I tried running the data set against the sklearn classifier and noticed that when the test data only receives 'r' group values, accuracy drops to 0.0 and it wrongly predicts that [5, 7] belongs to the 'k' group. Is this something we should be aware of when using cross_validation with a small set of data, or is it expected behavior?

Any feedback you can give to shed some light on this would be appreciated. Your videos have been very helpful and are very easy to follow too. I appreciate the work you have done to put this series together for all of us.

Thanks.

MrAbIRaZ

Hello. Thanks for the help! I have a question: I'm doing image classification and my dataset will be histograms. For each class I will have, let's say, 8 histograms, and those histograms are stored in .txt files. I don't know how to code that. Do you have any advice?
Thank you.

cerinemokhtari

Hello, and thanks for your wonderful tutorial. My simple question is: what if I have a list rather than a dictionary as in your case? I am working on clustering polygon data and took their centroid points to cluster. Thanks in advance for your reply.

SaZimjon

Did he say y'all at 2:26? Also, love these videos. So much easier to follow than anything else I can find on the internet.

SteepMountainGames

Hey, I was following it until the middle, and then it just jumped ahead and I could no longer understand. As feedback, please look at your video from the perspective of a student who is new to machine learning.

murtazajabalpurwala