Linear Regression Machine Learning (tutorial)

I'll perform linear regression from scratch in Python using a method called 'Gradient Descent' to determine the relationship between students' test scores and the number of hours they studied. This will be about 50 lines of code, and I'll do a deep dive into the math behind it.
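For readers who want to follow along, here is a minimal sketch of that kind of gradient-descent loop on made-up data (the variable names and toy dataset are my own, not the code from the video):

```python
import numpy as np

def step(m, b, x, y, lr):
    """One gradient-descent step on the mean squared error."""
    n = len(x)
    err = y - (m * x + b)                  # residuals
    grad_m = -(2 / n) * np.sum(x * err)    # dMSE/dm
    grad_b = -(2 / n) * np.sum(err)        # dMSE/db
    return m - lr * grad_m, b - lr * grad_b

# Toy data: score roughly 10 * hours + 20, plus noise
rng = np.random.default_rng(0)
hours = rng.uniform(0, 8, 50)
scores = 10 * hours + 20 + rng.normal(0, 2, 50)

m, b = 0.0, 0.0
for _ in range(5000):
    m, b = step(m, b, hours, scores, lr=0.01)
print(m, b)  # should settle near m ≈ 10, b ≈ 20
```

The whole method is just the `step` function repeated; everything else is data setup.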

Code for this video:

Please subscribe! And like. And comment. That's what keeps me going. And yes, this video is a part of my 'Intro to Deep Learning' series.

More learning resources:

Join us in the Wizards Slack Channel:

Please support me on Patreon:
Follow me:
Signup for my newsletter for exciting updates in the field of AI:
Comments

I think Siraj did a good job introducing the idea of the gradient descent method, and I am really grateful for all the content he has uploaded to his channel. Yet I have some thoughts to add to the current video.
1: Fitting a line is not a good example of actually using linear regression. (Although it is easy to demonstrate the basics with a line fit, I think the limitations should be mentioned as well.) The optimal values of m and b can actually be calculated with far less effort than an iterative search. Even the error of the fitted parameters can be calculated using simple algebra.
2: The gradient, by definition, points in the direction of increase; this is why you need minus signs when you are looking for the direction in which the error decreases.
3: As a physicist, I advise everybody to check whether the results (even of less significant calculations) actually make sense, instead of memorizing formulas in general. When the program was executed, Siraj concluded that it had found the ideal parameters. Actually, he had a typo in the code, so he found the worst parameters. If he had actually looked at the output, he could easily have concluded that something was wrong: the error was way too high, and the "optimal" value for m was negative. That would mean, in this example, that the more you study for an exam, the worse your grade gets, which clearly makes no sense.
4: There are indeed strict rules for computing partial derivatives. If you don't know them, you can use websites that calculate partial derivatives of functions; have a look at wolframalpha.com, for example. However, you can only calculate the partial derivatives if you know the actual error function (as a function of the searched parameters). Sometimes you don't even know that, so you have to use other methods (or estimate the derivatives).
5: Using visual examples is very beneficial in my opinion. I liked that Siraj presented the problem of finding the minimum using the ball-in-a-bowl analogy, in which the ball minimizes its potential energy. This example would also have made it clear why a high learning rate is bad: even if you know which way you want to move your parameters (the position of the ball in the bowl), taking too large a step in that direction can end up increasing the error (e.g. you start to climb up the opposite wall of the bowl).
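To illustrate point 1: for a straight-line fit, the least-squares optimum has a closed form obtained by setting both partial derivatives of the squared error to zero, so no iteration is needed. A sketch (the function name is my own):

```python
import numpy as np

def least_squares_line(x, y):
    """Closed-form slope and intercept minimizing squared error,
    derived by setting both partial derivatives to zero."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    m = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b = y.mean() - m * x.mean()
    return m, b

m, b = least_squares_line([1, 2, 3, 4], [3, 5, 7, 9])
print(m, b)  # the data lie exactly on y = 2x + 1, so m = 2.0, b = 1.0
```

On noisy data the same two lines return the best-fit parameters directly, in one pass.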

nevtelenlajhar

[8:24] After the epic rap, lecture begins.

WilsonMar

I came here after the first two weeks of Andrew Ng's machine learning course. It's so cool to see something you have been learning about for two weeks actually happen. Can't wait to implement it myself with my own data sets.

FsimulatorX

GUYS, HIT LIKE FOR SIRAJ!
Siraj, you are the best man teaching ML, AI, and related stuff practically that I have come across. Please do not stop at any point, no matter what. You are an inspiration for people trying to learn these things.
Hey Siraj, if you read this, please reply to this comment!

collegeguide

This was a really helpful video!! Knowing how to implement gradient descent from scratch is one of the most fundamental things for neural nets.

aishwaryadash

Super grateful for these videos. It's much better than reading page upon page!

SamSilvercoin

The analogy with the bowl is really good. It was explained to me with a hill and moving a step downwards, which is more accurate but harder to grasp.

logicaldistraction

Thanks for that first session, really enjoyed it. I implemented it myself and played around with both your implementation and mine. I made the following observation:
If you set the learning rate to 0.0005 or higher, you run into an overflow in the compute_error... function, which leads to wrong results and an inf error.
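That matches the usual stability picture: once the step size is too large for the curvature of the error surface, each update overshoots the minimum and the error grows instead of shrinking. A small sketch reproducing the effect on made-up data of a roughly similar scale (the dataset and thresholds are my own, not the video's):

```python
import numpy as np

# Made-up data on roughly the same scale as the video's dataset
rng = np.random.default_rng(1)
x = rng.uniform(20, 70, 100)
y = 1.3 * x + 10 + rng.normal(0, 5, 100)

def final_error(lr, steps=100):
    """Mean squared error after `steps` gradient-descent updates."""
    m = b = 0.0
    n = len(x)
    for _ in range(steps):
        err = y - (m * x + b)
        m += lr * (2 / n) * np.sum(x * err)   # step down the gradient
        b += lr * (2 / n) * np.sum(err)
    return np.mean((y - (m * x + b)) ** 2)

print(final_error(0.0001))  # small enough step: error settles
print(final_error(0.0005))  # step too large: every update overshoots, error explodes
```

The larger the input values, the steeper the error surface, so the safe learning rate shrinks as the data scale grows; that is why 0.0005 already diverges here.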

SirKober

I may be wrong but I think a minus sign is missing in line 45. This was a great lecture. Thank you :)


Dude, you are an awesome, mad, impulsive person. You are totally in depth with whatever you are doing in that moment. I love that.

sankhadeepsarkar

Amazing. I just finished watching Stanford's gradient descent video, but yours was more elaborate and cleared my doubts. :) I have a clear and better idea about gradient descent now. :D Thanks very much. :)

svsanketverma

Hi Siraj, I got to know about your channel through Udacity's Deep Learning Foundations and have been hooked ever since I arrived here. Your way of understanding and teaching is so very unique, and you show the excitement that we ourselves feel when learning all of this stuff. I particularly loved (being a math geek) your video on how math is everywhere in our world. I would have loved to be officially a part of this course but couldn't sign up for it. I will surely be 'auditing' the course (one of the great perks of studying with Udacity being that the content is available for free), and I am really excited to learn everything that you have in store for us :)

amandalmia

I made my own linear regression algorithm and found that just estimating the partial derivative as the slope between two very close points worked well. I'm not sure exactly what your equation does, but doesn't it use the same logic? After all, we don't always have the literal equation to take the derivative of.
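What this comment describes is the standard finite-difference approximation, and for smooth error functions it agrees closely with the analytic partial derivative. A sketch of the central-difference version (the helper name and example function are my own):

```python
def numeric_partial(f, params, i, h=1e-6):
    """Central-difference estimate of the partial derivative of f
    with respect to params[i]: the slope between two points
    straddling params[i]."""
    up = list(params)
    down = list(params)
    up[i] += h
    down[i] -= h
    return (f(up) - f(down)) / (2 * h)

# Example error surface: error(m, b) = (2m + b - 5)^2, evaluated at m = 1, b = 1
error = lambda p: (2 * p[0] + p[1] - 5) ** 2
print(numeric_partial(error, [1.0, 1.0], 0))  # analytic value: 4(2m + b - 5) = -8
print(numeric_partial(error, [1.0, 1.0], 1))  # analytic value: 2(2m + b - 5) = -4
```

The trade-off versus the analytic gradient is extra function evaluations (two per parameter per step) and some sensitivity to the choice of h.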

matthewdaly

Siraj, first of all, this is dope. I'm amazed at how you do it and at all the things you have gone through for it. Finally, I am a really great fan of your work, man; keep making such great content.

ranaify

You're a great teacher, man! I'm starting to do ML in C++ and love your videos for learning theory with an example. I can actually understand what is happening and apply it to languages other than Python, knowing what is happening behind the math. Thank you, sir.

ToaOfTech

Siraj, thanks for this series; it is great to revise material I haven't used for some time. I love your passion for this subject, you could power a small town off your enthusiasm. I never remember any lecturers getting so excited about topics like this (# magic, the greatest the greatest).

I understand you are limited to only a ~50 min live stream, so you have to error-trap on the fly in real time (no pressure). For that reason I wouldn't criticize your code errors; besides, the GitHub code is usually better written. Nonetheless, I like typing along with you, as there is no better way to learn than when an error pops up.

Thanks again for the resource links and all your efforts. :D

xxXXCarbonXXxx

As Fabian Becker said:
Please don't say you've found the optimal m/b. You could be hitting local minima or simply not have done enough iterations. Gradient descent is very vulnerable to fitness landscapes that are non-convex.
Is there any algorithm that can help me reach the global optimum?
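There is no general-purpose algorithm that guarantees the global optimum of an arbitrary non-convex function, but a common heuristic is to run gradient descent from many starting points (random restarts, or an evenly spaced grid) and keep the best result. (For plain linear regression with squared error the surface is convex, so this particular worry doesn't arise there.) A sketch on a deliberately bumpy one-dimensional objective (all names and the objective are my own):

```python
import math

def gradient_descent(df, x0, lr=0.01, steps=500):
    """Plain gradient descent on a 1-D function from a single start."""
    x = x0
    for _ in range(steps):
        x -= lr * df(x)
    return x

def multi_start(f, df, lo, hi, tries=21):
    """Run descent from several spread-out starts; keep the best minimum."""
    starts = [lo + (hi - lo) * i / (tries - 1) for i in range(tries)]
    return min((gradient_descent(df, s) for s in starts), key=f)

# Bumpy objective: global minimum at x = 0, with higher local minima nearby
f = lambda x: x * x + 10 * math.sin(x) ** 2
df = lambda x: 2 * x + 10 * math.sin(2 * x)   # derivative of f

stuck = gradient_descent(df, 2.5)   # a single start lands in a local minimum
best = multi_start(f, df, -10, 10)  # multiple starts recover the global one
print(stuck, best)
```

Other common options include simulated annealing and evolutionary methods; none of them guarantee the global optimum either, they just search more broadly.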

divyanshkanchansaxena

Thank you for this amazing tutorial. I never thought I could learn calculus in such an easy way.

DionBoles

Hi Siraj, this was amazing, thank you!

I have a doubt about partial derivatives. At 35:30, the formula has the 2/N outside the summation, right? (Since it's before the sigma notation.)

But at 37:42, you seem to have the 2/N inside both sums. So is there something I'm missing?
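If I'm reading the question right, the two slides agree: 2/N is a constant with respect to the summation index i, so it can be written once before the sigma or carried inside every term; both give the same value. A quick derivation of both partials from the mean squared error (my own notation, matching the video's m, b, N):

```latex
E(m, b) = \frac{1}{N} \sum_{i=1}^{N} \bigl( y_i - (m x_i + b) \bigr)^2

\frac{\partial E}{\partial m}
  = \frac{1}{N} \sum_{i=1}^{N} 2 \bigl( y_i - (m x_i + b) \bigr) (-x_i)
  = -\frac{2}{N} \sum_{i=1}^{N} x_i \bigl( y_i - (m x_i + b) \bigr)

\frac{\partial E}{\partial b}
  = \frac{1}{N} \sum_{i=1}^{N} 2 \bigl( y_i - (m x_i + b) \bigr) (-1)
  = -\frac{2}{N} \sum_{i=1}^{N} \bigl( y_i - (m x_i + b) \bigr)
```

The middle and right-hand forms of each line differ only in where the constant factor sits.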

azklinok

Hi Siraj, loving your videos so far. Very informative, straight to the point, and fun to watch.

I'd like to ask something, though, which I'm having a hard time putting into the right question, so I've tried to phrase it three ways; I hope you get what I'm asking.
1. What did we prove for finding the line of best fit for the data in the demo?
2. I mean how would you explain the result of the training process?
3. How do you explain the relationship of amount of hour study vs the test score?

That is something that is not clear to me.

dharealmusic