Creating Our K Nearest Neighbors Algorithm - Practical Machine Learning with Python p.16

Now that we understand the intuition behind how we calculate the distance/proximity between feature sets, we're ready to begin building our own version of K Nearest Neighbors in code from scratch.

Comments

You always manage to make me laugh while learning coding. Thank you :)

Dutchtraordinary_Living

"Ok now i'm gonna write one line loop" and all non-pythoners are like :SSSS 0.0

intothelight

You can use np.array(dataset['k'])[:, 0] to get all the x coordinates (column 0) of the 'k' class in the dataset, and the same with [:, 1] for the y values.

TlnITA
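
A minimal sketch of the slicing suggested above. The dataset values here are assumed toy data, shaped as a dict that maps a class label to a list of [x, y] points; they are not taken from the source.

```python
import numpy as np

# Assumed toy dataset: class label -> list of [x, y] points.
dataset = {'k': [[1, 2], [2, 3], [3, 1]], 'r': [[6, 5], [7, 7], [8, 6]]}

k_points = np.array(dataset['k'])  # shape (3, 2): one row per point
xs = k_points[:, 0]  # every row, column 0 -> x coordinates
ys = k_points[:, 1]  # every row, column 1 -> y coordinates

print(xs.tolist())  # [1, 2, 3]
print(ys.tolist())  # [2, 3, 1]
```

Note that the slice covers one class at a time, so plotting every class still needs a loop over the dict keys.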

Hi @sentdex,

Could you help me understand why we have (len(data) >= k)?
Here k is the number of neighbors and len(data) is the number of classes. k is used to check the distance to the k nearest neighbors independent of their classes, and after that we check the most common class among those k nearest neighbors. Why are you comparing k with the number of classes? For example, if you have 5 classes and 1000 points in the dataset, and I want to classify a new point using only the 3 nearest neighbors, this would fail; even the special case 1-NN would fail. Shouldn't k be compared with the total number of data points, 1000 in my example? Like here:

import warnings

def kNearestNeighbors(data, point, k=3):
    # data.values() is the Python 3 spelling; itervalues() exists only in Python 2
    if k > sum(len(v) for v in data.values()):
        warnings.warn('K is set to a value greater than the total points!')

Thank you in advance,
and your videos are amazing! You're the best

pedrosantos
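
One way to look at the two checks discussed above is as guards against different problems. Comparing k with the total number of points catches a k that cannot be satisfied at all. Comparing k with the number of classes is, as far as I can tell, meant to reduce tied votes. The sketch below includes both. The function name, warning texts, and dataset are assumptions made for illustration, not the code from the video.

```python
import math
import warnings
from collections import Counter


def k_nearest_neighbors(data, predict, k=3):
    """Classify `predict` by majority vote among its k nearest points.

    `data` maps a class label to a list of feature lists (assumed layout).
    """
    n_groups = len(data)                            # number of classes
    n_points = sum(len(v) for v in data.values())   # number of data points

    if k > n_points:
        warnings.warn('K is greater than the total number of points!')
    if n_groups >= k:
        warnings.warn('K is not greater than the number of voting groups; '
                      'tied votes are possible.')

    # Euclidean distance from every known point to the point being classified.
    distances = []
    for group, points in data.items():
        for features in points:
            distances.append((math.dist(features, predict), group))

    # Keep the labels of the k closest points and take the most common one.
    votes = [group for _, group in sorted(distances)[:k]]
    return Counter(votes).most_common(1)[0][0]


# Assumed toy dataset: class label -> list of [x, y] points.
dataset = {'k': [[1, 2], [2, 3], [3, 1]], 'r': [[6, 5], [7, 7], [8, 6]]}
print(k_nearest_neighbors(dataset, [5, 7], k=3))  # r
```

Both checks only warn, so the 3-NN case with 5 classes would still return a result; the warning is a hint that a different k may be less prone to ties.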

Thank you so much for all these videos!

mikeg

We could also use plt.scatter(*new_features) instead of plt.scatter(new_features[0], new_features[1]). It saves space and it works for any number of features.

sayyoryusupov
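
A small sketch of what the star does in the suggestion above, with one caveat. matplotlib's scatter takes its positional parameters in the order x, y, s, c, so a third unpacked value would be read as the marker size rather than a third coordinate. To keep the example runnable without a plotting backend, scatter_stub below is a hypothetical stand-in that only mirrors that parameter order; it is not part of matplotlib.

```python
def scatter_stub(x, y, s=None, c=None):
    """Hypothetical stand-in mirroring the order of the first four
    parameters of matplotlib's scatter: x, y, s (size), c (color)."""
    return {'x': x, 'y': y, 's': s, 'c': c}


new_features = [5, 7]  # assumed two-feature point

# The star unpacks the list into positional arguments: scatter_stub(5, 7)
two = scatter_stub(*new_features)
print(two)

# With three features, the third value lands in s, not in a z coordinate.
three = scatter_stub(*[5, 7, 3])
print(three)
```

So the shorthand is safe for two features; for more, the extra values would need to be handled explicitly.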

What do you use to interact with the graph? Thanks! Love your videos.

brianweymouth

Hi! Great video man, thanks :D
Question: how do you autocomplete the parentheses and brackets?

Thepando

To anyone watching this and getting an error with matplotlib: there is a workaround available if you search for it. However, this workaround is not necessary from the next tutorial onward. I spent a good 45 minutes on the problem and gained some knowledge about the matplotlib build and a current glitch. If you are not interested in this info, I would suggest skipping ahead. Cheers!

j.hanleysmith

stop scrolling through the comments and listen to the guy

vishuvashishtha

Even I am not convinced why it is len(data) >= k. I really don't get it; I feel it should be the other way round, k <= len(data). Please correct me if I am wrong and explain to me why in the video it is len(data) >= k. If the explanation is simple, with an analogy, that would be great for all the users who have the same question in the back of their minds.

niteshkumarn

Thanks for breaking down the formulas.

MrBigmit

Hey sentdex, I would also recommend an interview preparation series related to data science and ML. Keep coming up with these videos. Cheers!

ravitanwar

I'm really enjoying your videos bro... you made my day, that was gonna be a nice warning

t.h

Hi, and thanks for sharing this. How can we access this code?

omidasadi

I got this import error message: cannot import name 'counter'

kiranpradhan
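
A likely cause, assuming the import was written with a lowercase c: the class in the standard library's collections module is spelled Counter, and Python names are case-sensitive. A short sketch of the import and of how it can tally votes; the votes list is hypothetical.

```python
from collections import Counter  # capital C; `counter` does not exist in collections

# Hypothetical votes: class labels of the k nearest neighbors.
votes = ['r', 'r', 'k']

# most_common(1) returns a list holding one (label, count) tuple.
label, count = Counter(votes).most_common(1)[0]
print(label, count)  # r 2
```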

The one-liner can be rewritten as:
for i in dataset:
    for ii in dataset[i]:
        plt.scatter(x=ii[0], y=ii[1], color=i)

But when I tried the same code with sns.scatterplot, it returned the following error:
If using all scalar values, you must pass an index

I can't figure out how to resolve this issue. Has anyone come across it?
To be clear, it has nothing to do with the code being a one-liner or not. Is there currently no way to do the same plot using sns.scatterplot?

GoredGored

Dear Sir,
Thank you so much for sharing valuable knowledge.
Please help me solve the following error:
ValueError: 'color' kwarg must be an mpl color spec or sequence of color specs.
For a sequence of values to be color-mapped, use the 'c' kwarg instead.

shubhamborghare

What is the effect of using
style.use('fivethirtyeight')?

muhammaduzair

@sentdex Can someone please explain to me the significance of 'ii' in the for loop?
for i in dataset:
    for ii in dataset[i]:
        plt.scatter(ii[0], ii[1], s=100, color=i)

CaptainAdd
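
A sketch of what each loop variable holds, assuming the dataset is a dict that maps a class label to a list of [x, y] points. The outer variable i is a dict key (the class label, which also serves as the color code), and ii is one point belonging to that class. The plotted list is a stand-in for the plt.scatter calls so the example runs without matplotlib, and the comprehension is one possible one-line form, not necessarily the one typed in the video.

```python
# Assumed toy dataset: class label -> list of [x, y] points.
dataset = {'k': [[1, 2], [2, 3], [3, 1]], 'r': [[6, 5], [7, 7], [8, 6]]}

plotted = []  # records (x, y, color) in place of calling plt.scatter
for i in dataset:            # i is a key: 'k', then 'r'
    for ii in dataset[i]:    # ii is one [x, y] point of class i
        plotted.append((ii[0], ii[1], i))

# The same traversal written as a single comprehension.
one_liner = [(ii[0], ii[1], i) for i in dataset for ii in dataset[i]]

print(plotted[0])            # (1, 2, 'k')
print(plotted == one_liner)  # True
```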