The SoftMax Derivative, Step-by-Step!!!

Here's a step-by-step guide that shows you how to take the derivatives of the SoftMax function when it is used as the final output layer in a neural network.

For a complete index of all the StatQuest videos, check out:

If you'd like to support StatQuest, please consider...

Buying my book, The StatQuest Illustrated Guide to Machine Learning:

...or...

...a cool StatQuest t-shirt or sweatshirt:

...buying one or two of my songs (or go large and get a whole album!)

...or just donating to StatQuest!

Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on Twitter:

0:00 Awesome song and introduction
0:57 SoftMax derivative with respect to the output of interest
3:58 SoftMax derivative with respect to other outputs
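
The two timestamped cases reduce to two simple formulas: for the output of interest, d p_i / d raw_i = p_i * (1 - p_i), and for each of the other outputs, d p_i / d raw_j = -p_i * p_j. Here's a minimal NumPy sketch (the raw input values are illustrative, not from the video) that builds the full matrix of these derivatives from those two rules and checks it against finite differences:

```python
import numpy as np

def softmax(raw):
    # Subtract the max for numerical stability; it cancels out in the ratio.
    e = np.exp(raw - np.max(raw))
    return e / e.sum()

def softmax_derivatives(raw):
    # D[i, j] = d p_i / d raw_j:
    #   p_i * (1 - p_i) on the diagonal  (with respect to the output of interest)
    #   -p_i * p_j      off the diagonal (with respect to the other outputs)
    p = softmax(raw)
    return np.diag(p) - np.outer(p, p)

# Illustrative raw output values for setosa, versicolor, and virginica.
raw = np.array([1.43, -0.40, 0.23])
D = softmax_derivatives(raw)

# Sanity check against central finite differences.
eps = 1e-6
D_num = np.zeros((3, 3))
for j in range(3):
    step = np.zeros(3)
    step[j] = eps
    D_num[:, j] = (softmax(raw + step) - softmax(raw - step)) / (2 * eps)
print(np.allclose(D, D_num, atol=1e-8))  # expect True
```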

#StatQuest #NeuralNetworks #SoftMax
Comments

I don't remember the last time I subscribed to a YouTube channel, but you got my subscription, and my gratitude is what I can give you in exchange for this magnificent, clear, SHORT, and understandable video. THANKS!!!

brianp

The quotient rule: low-D-high minus high-D-low, square the bottom and away we go! My teacher told me this 14 years ago, and I never forgot it! Also, thanks for posting! I love these videos!

charlesrambo

I literally spent 15 minutes trying to figure out this derivative while I had the (original) video on pause. As soon as I pressed resume, you pointed to this explanation! I now officially consider myself “hard-core” :)

JaviOrman

Josh, thanks for these great videos; they really help me and so many others who like machine learning! You make great videos teaching people the ideas, but I really hope there can be more videos on how to code up this knowledge. It would be wonderful if your videos combined coding and theory.

renjiehu

Absolutely awesome, I love going over ML with these kinds of videos.

romellfudi

Your videos helped me get a better understanding of ML in a simpler way than the literature. Thank you, and I'm waiting for your next videos.

omarthesolid

Thank you very much. I was struggling to understand the SoftMax derivative, and I finally managed to understand it.

davidzarandi

The step-by-step derivative explanation is good.

dearbass

I am taking your NN series online to have that protective bubble of easily understandable knowledge before my professor dives into the real "academic knowledge" lol. Thanks for the video!

xiaoyangwu

Excellent video, brother ❤ Really addicted to your videos and the way you explain every topic. Thanks, man! 🙌

sagarpatel

StatQuest should be declared a universal treasure!

sanskarshrivastava

Great video. I wonder if we will reach Gaussian process regression in the quest soon?

mahmoudshehata

Quotient rule: d/dx( U(x) / V(x) ) = ( U'(x) V(x) - U(x) V'(x) ) / V(x)^2. In our case U is U(x, y, z), so we use partial derivatives in three variables: setosa, versicolor, and virginica.
The first expression we differentiate is d( E(x) / (E(x) + Constant) ), where E is the exponential function and Constant = E(versicolor) + E(virginica). With clever identifications, Josh gets a simple final formula :)

pperez
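
To make the algebra in the comment above concrete, here's a minimal SymPy sketch (the lumped constant C stands in for E(versicolor) + E(virginica)) that applies the quotient rule to one SoftMax output and confirms the result collapses to p * (1 - p):

```python
import sympy as sp

x, C = sp.symbols('x C', positive=True)

# One SoftMax output, with the other exponential terms lumped into the constant C.
p = sp.exp(x) / (sp.exp(x) + C)

# Differentiating uses the quotient rule, (U'V - U V') / V^2; simplify the result.
dp = sp.simplify(sp.diff(p, x))
print(dp)  # C*exp(x)/(C + exp(x))**2

# The same result written as p * (1 - p); the difference simplifies to zero.
print(sp.simplify(dp - p * (1 - p)))  # 0
```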

I love that you used 'hard core' to describe the folks watching this video.

weipenghu

Of the many AI videos on YouTube, yours are definitely among the very best.

I'm even getting used to your constant singing. 🤣

Jupiter-Optimus-Maximus

Excellent series!!! I think for the regression use case you could try a different example. The drug dosage example looks like a classification problem, and there we used SSR as the loss function. The iris flowers example was perfect for the classification problem.

sachink

The video is great! But why is "probability of setosa" in quotes? Doesn't SoftMax convert the raw outputs into probabilities based on the relative values of the logits?

blagodaren

Remarkable sense of humor :D Laughing while studying

lasha-georgeds

Thank you for the derivative!
I want to ask a question about 6:30:
after we calculate 0.21, -0.07, and -0.15, what is the next step for this network?
I mean, at which point will the network use 0.21, -0.07, and -0.15?
Thank you for reading my question!

nonalcoho
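
Those three numbers are the derivatives that backpropagation plugs into the chain rule; the resulting gradients are what gradient descent uses to update the weights and biases. A minimal sketch, assuming a cross-entropy loss and a hypothetical predicted probability (neither is taken from the video):

```python
import numpy as np

# Illustrative numbers only: treat 0.21, -0.07, and -0.15 as the SoftMax
# derivatives from 6:30, i.e. d p_setosa / d raw_j for the three raw outputs.
dp_draw = np.array([0.21, -0.07, -0.15])

# Backpropagation uses them in the chain rule: multiply by the derivative of
# the loss with respect to that SoftMax output to get d loss / d raw_j.
p_setosa = 0.69             # hypothetical predicted probability
dloss_dp = -1.0 / p_setosa  # cross-entropy term: d(-log p)/dp
dloss_draw = dloss_dp * dp_draw

# Each raw output's gradient then flows back to the weights and biases that
# produced it, and gradient descent nudges them: w -= learning_rate * gradient.
print(dloss_draw)
```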

I'm not hardcore, the coursework is... Thank you for helping me out.

usergoogle