Scikit Learn Linear SVC Example Machine Learning Tutorial with Python p. 11

In this sklearn with Python for machine learning tutorial, we cover how to do a basic linear SVC example with scikit-learn.

Bitcoin donations: 1GV7srgR4NJx4vrk7avCmmVQQrqmv87ty6
Comments

C is just 1/lambda with regard to the SVM cost function. It is the inverse of the regularization strength, so a higher C value regularizes less, which will reduce bias and increase variance.

Joe-cszk
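
A rough sketch of the point about C, assuming scikit-learn's `svm.SVC` with a linear kernel and a small made-up two-cluster dataset (the data and the `margin_width` helper are illustrative, not taken from the video). A small C means strong regularization and a wide margin; a large C approaches the hard-margin fit.

```python
import numpy as np
from sklearn import svm

# Toy two-cluster data, assumed purely for illustration.
X = np.array([[1, 2], [5, 8], [1.5, 1.8], [8, 8], [1, 0.6], [9, 11]])
y = np.array([0, 1, 0, 1, 0, 1])


def margin_width(C):
    """Fit a linear SVC with the given C and return the margin width 2 / ||w||."""
    clf = svm.SVC(kernel="linear", C=C).fit(X, y)
    w = clf.coef_[0]
    return 2.0 / np.linalg.norm(w)


soft = margin_width(0.001)   # small C: heavy regularization, wide margin
hard = margin_width(1000.0)  # large C: little regularization, narrow margin

print("margin with C=0.001:", soft)
print("margin with C=1000: ", hard)
```

On this data the small-C margin comes out wider than the large-C one, which is the "more bias, less variance" end of the trade-off.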

I love all your tutorials, so, can you upload a video of your dog?

tongwu

BE CAREFUL !! THERE IS A TIGER RIGHT BEHIND YOU !! 

Anyway, great stuff. Thank you for your time and dedication.

alexandrehubert

The capital X (I think) is used by convention to indicate an array or matrix. For a simple linear model it's y = mx + b, with a lowercase x for a single predictor vector. In the case of multiple predictors, stats guys use y = X*(\beta) + (\epsilon) for a data matrix X.

Sobewan
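
To make the convention concrete, here is a minimal NumPy sketch with made-up numbers: `X` is a 2D data matrix with one row per sample and one column per feature, `beta` holds one coefficient per feature, and the noise term is left out.

```python
import numpy as np

# Hypothetical numbers, chosen only to show the shapes involved.
X = np.array([[1.0, 2.0],
              [5.0, 8.0],
              [1.5, 1.8]])     # data matrix: 3 samples (rows) x 2 features (columns)
beta = np.array([0.5, -1.0])  # one coefficient per feature

y = X @ beta                  # y = X * beta, with the epsilon (noise) term omitted

print(X.shape)  # (3, 2)
print(y.shape)  # (3,)
```

scikit-learn follows the same shape convention: `fit(X, y)` takes a 2D `X` and a 1D `y`.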

Wow... This was very helpful. Like you, I am self-taught in Python and only now learning machine learning. I would like to leverage your experience in learning scikit-learn, especially for images. I have been looking for information on importing Histogram of Oriented Gradients (HOG) features into scikit-learn. As you say, the hardest part of machine learning is formatting the data to put into the svm.SVC. I have a web camera with which I have collected hundreds of images of sedans, vans, trucks, and SUVs that pass by my house. I have cropped these images and classified them as such. I think I need to use this command to get the HOG features into a format that can be used by scikit-learn:
>> fd, hog_image = hog(greyscale, orientations=8, pixels_per_cell=(16, 16),
                    cells_per_block=(1, 1), visualise=True)
where the X values are fd (x, y pixel values for each HOG image). I am not sure how to compile all of the images into a single np.array.
My labels (y values), I guess, would be [1, 0, 0, 0] for sedan and [0, 1, 0, 0] for van.
Would you agree? It would be great if you could do a video on this topic. What you have done here appears to be applicable in a simple way. Anyway, I would be grateful for your thoughts on this.

Moment_Captured
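
One possible way to assemble the arrays, sketched with NumPy only. The random vectors below are stand-ins for the per-image `fd` descriptors (hypothetical data), and the names `descriptors`, `class_names`, and `labels` are made up for the example. It assumes every image was resized to the same shape before calling `hog`, so that all descriptors have the same length. Note that `svm.SVC` expects `y` as a 1D array with one class label per sample, so integer labels such as 0, 1, 2, 3 are a better fit than one-hot rows like [1, 0, 0, 0].

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for HOG descriptors: one 1-D feature vector per image.
# In practice each entry would be the `fd` returned by hog(...).
n_features = 128
descriptors = [rng.random(n_features) for _ in range(6)]

# Hypothetical class names and per-image labels.
class_names = ["sedan", "van", "truck", "suv"]
labels = ["sedan", "van", "truck", "suv", "sedan", "van"]

# Stack the descriptors row by row: shape (n_images, n_features).
X = np.vstack(descriptors)

# Encode each label as an integer class index: shape (n_images,).
y = np.array([class_names.index(name) for name in labels])

print(X.shape)  # (6, 128)
print(y)        # [0 1 2 3 0 1]
```

With arrays in this form, `clf.fit(X, y)` on an `svm.SVC` should accept them directly.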

my favourite part is when your dog shook

clared

I get an error when trying to use clf.predict([0.58, 0.76]). The error is: Expected 2D array, got 1D array instead:
array=[ 0.58 0.76].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
Reshape doesn't seem to help.

ilarums
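
A sketch of the usual fix, assuming a classifier trained on two features with made-up toy data. `predict` wants a 2D array of shape (n_samples, n_features), so a single sample needs an extra pair of brackets or a `reshape(1, -1)`. One thing that often trips people up is that `reshape` returns a new array rather than changing the original in place, so its result has to be passed on or assigned.

```python
import numpy as np
from sklearn import svm

# Toy training data, assumed for illustration.
X = np.array([[1, 2], [5, 8], [1.5, 1.8], [8, 8], [1, 0.6], [9, 11]])
y = [0, 1, 0, 1, 0, 1]
clf = svm.SVC(kernel="linear", C=1.0).fit(X, y)

sample = np.array([0.58, 0.76])  # 1-D: shape (2,)

# Option 1: reshape to one row with two features and use the returned array.
pred_a = clf.predict(sample.reshape(1, -1))

# Option 2: wrap the sample in an extra list so it is already 2-D.
pred_b = clf.predict([[0.58, 0.76]])

print(pred_a, pred_b)
```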

Hey +sentdex, thanks for the awesome video.

I haven't watched your whole series yet (I'm on part 15), but this video was the one that brought me closest to understanding how to solve my particular problem. I'm not quite there yet, though. Would you mind giving me some hints?

If so, let me explain my situation:

I have an e-commerce catalog with product titles (e.g. 'iPhone 6', 'Smart TV DR-7874 with Wi-fi') and their respective categories (e.g. 'Smartphones' and 'TV'). I need to predict which category an uncategorized product fits best. This is how I imagine I would predict category "c" of a new product "p"...

The goal is doing something like:
predict('Nokia Lumia 630/635 Anti-glare Screen Protector') # > Smartphones

These are the steps I would take:

1. Select my candidates: my dataset would consist only of products that share at least one word with "p" (maybe using some tf-idf to filter out useless words).
2. Create a list of all words (bag of words) from the selected candidates.
3. Each word could be a feature: 0 would mean that the feature is missing from a particular product, and 1 that it is present. At this step I'm kind of lost, and I'm not sure it makes sense. (Maybe an inverted word-product index would help with these lookups?)
4. I'm guessing X would be a list of lists, where each inner list corresponds to a particular set of features.
5. Labeling the data, y would be the categories of the items (this comes from my database)

Sorry for the bad English and long post. I don't know whether some of the steps are going to be necessary, and after step 3 I'm just trying to imagine how to get this done, so I'm not sure the approach is correct.

lucaspelegrino
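
Steps 2 to 5 above map fairly directly onto scikit-learn's `CountVectorizer` with `binary=True` (one 0/1 feature per word) feeding a `LinearSVC`. The sketch below is one possible setup, not a tuned solution: apart from the two titles quoted in the comment, the product titles are invented for illustration, and step 1 (pre-filtering candidates) is skipped because the vectorizer builds the vocabulary from the whole training set.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Tiny hypothetical catalog: titles (X) and their known categories (y).
titles = [
    "iPhone 6",
    "Nokia Lumia 520 smartphone",
    "Samsung Galaxy screen protector",
    "Smart TV DR-7874 with Wi-fi",
    "LED TV 42 inch full HD",
    "Wall mount for flat screen TV",
]
categories = ["Smartphones", "Smartphones", "Smartphones", "TV", "TV", "TV"]

# binary=True records only whether a word is present (1) or absent (0).
model = make_pipeline(CountVectorizer(binary=True), LinearSVC())
model.fit(titles, categories)

pred = model.predict(["Nokia Lumia 630/635 Anti-glare Screen Protector"])
print(pred[0])
```

With only six training titles the prediction is not meaningful; a real catalog would supply many labeled titles per category, and swapping in `TfidfVectorizer` is a common variation for down-weighting uninformative words.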

Very cool and useful video series so far, I really enjoy it. I have a quick question: how do you do the multi-line commenting @12:47?

DeadWalker

Very nice video, Sentdex Indicator. Have you made videos like this for other supervised algorithms, such as decision trees or random forests? Here you have not talked about what kinds of real features x and y can be. And isn't it important to show the support vectors as well when doing classification?

uniqueraj

Hey!
How do we plot a divider when there are many features, in the range of hundreds or thousands?

hanithadevigundabattula
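
With hundreds of features the separating hyperplane cannot be drawn directly. One option, sketched here under the assumption of a binary linear `svm.SVC` and randomly generated stand-in data, is to look at `decision_function`, which gives a signed score per sample that is proportional to its distance from the hyperplane. The sign says which side of the divider a sample falls on, and a histogram of the scores can stand in for the 2D plot.

```python
import numpy as np
from sklearn import svm

rng = np.random.default_rng(0)

# Hypothetical high-dimensional data: two Gaussian blobs with 200 features each.
n_features = 200
X = np.vstack([
    rng.normal(-1.0, 1.0, size=(50, n_features)),
    rng.normal(+1.0, 1.0, size=(50, n_features)),
])
y = np.array([0] * 50 + [1] * 50)

clf = svm.SVC(kernel="linear").fit(X, y)

# One signed score per sample; positive means the class-1 side of the hyperplane.
scores = clf.decision_function(X)
side = (scores > 0).astype(int)

print(scores.shape)  # (100,)
```

Another common approach is to project the data down to two dimensions (for example with PCA) and plot that, keeping in mind that the projection only approximates the real geometry.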

Why am I getting this error?


a = -w[0] / w[1]
IndexError: index 1 is out of bounds for axis 0 with size 1

sudarshanrbhat
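
One likely cause, assuming `w` was assigned from `clf.coef_` without indexing: for a binary linear SVC, `coef_` is a 2D array of shape (1, n_features), so `w[1]` asks for a second row that does not exist. Taking the first row with `clf.coef_[0]` gives the 1D weight vector. The toy data below is made up for illustration. (The same error would also appear if the model had been trained on a single feature.)

```python
import numpy as np
from sklearn import svm

# Toy two-feature data, assumed for illustration.
X = np.array([[1, 2], [5, 8], [1.5, 1.8], [8, 8], [1, 0.6], [9, 11]])
y = [0, 1, 0, 1, 0, 1]
clf = svm.SVC(kernel="linear", C=1.0).fit(X, y)

print(clf.coef_.shape)  # (1, 2): one row of weights, two features

w = clf.coef_[0]               # 1-D weight vector, shape (2,)
a = -w[0] / w[1]               # slope of the separating line
b = -clf.intercept_[0] / w[1]  # intercept of the separating line

# Points on the line y = a * x + b, ready to pass to a plotting call.
xx = np.linspace(0, 12)
yy = a * xx + b
```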