The matrix math behind transformer neural networks, one step at a time!!!

Transformers, the neural network architecture behind ChatGPT, do a lot of math. Fortunately, almost all of it can be expressed as matrix math, which GPUs are optimized to run quickly. Matrix math is also how we code neural networks, so learning how ChatGPT does it will help you code your own. Thus, in this video, we go through the math one step at a time and explain what each step does, so that you can use it on your own with confidence.
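As a rough illustration of the kind of matrix math the video walks through, here is a minimal NumPy sketch of scaled dot-product self-attention. The toy embeddings and randomly generated weights are made up for this example; they are not the values used in the video.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable SoftMax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    Q = X @ Wq                               # queries: one row per token
    K = X @ Wk                               # keys
    V = X @ Wv                               # values
    scores = Q @ K.T / np.sqrt(K.shape[1])   # scaled dot-product similarities
    return softmax(scores) @ V               # attention-weighted sum of values

# Toy example: 3 tokens, embedding size 2 (illustrative values only).
X = np.array([[1.0, 0.5],
              [0.2, -1.0],
              [0.3, 0.8]])
rng = np.random.default_rng(0)
Wq, Wk, Wv = (rng.normal(size=(2, 2)) for _ in range(3))

out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (3, 2): one context-aware vector per token
```

Every step is a matrix multiplication (plus one SoftMax), which is exactly why GPUs can run transformers so fast.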

NOTE: This StatQuest assumes that you are already familiar with:

If you'd like to support StatQuest, please consider...
...or...

...buying my book, a study guide, a t-shirt or hoodie, or a song from the StatQuest store...

...or just donating to StatQuest!
venmo: @JoshStarmer

Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on Twitter:

0:00 Awesome song and introduction
1:43 Word Embedding
3:37 Position Encoding
4:28 Self Attention
12:09 Residual Connections
13:08 Decoder Word Embedding and Position Encoding
15:33 Masked Self Attention
20:18 Encoder-Decoder Attention
21:31 Fully Connected Layer
22:16 SoftMax
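
To give a flavor of one of the steps above, here is a small sketch of the masking trick from the Masked Self Attention section: a causal mask sets the scores for later tokens to negative infinity before the SoftMax, so each token can only attend to itself and earlier tokens. The score values are illustrative, not taken from the video.

```python
import numpy as np

# Raw query-key scores for 3 tokens (illustrative values).
scores = np.array([[1.0, 2.0, 0.5],
                   [0.3, 1.5, 2.2],
                   [0.7, 0.1, 1.1]])

mask = np.triu(np.ones_like(scores), k=1)      # 1s above the diagonal
masked = np.where(mask == 1, -np.inf, scores)  # block attention to the future

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

attn = softmax(masked)
print(np.round(attn, 2))  # upper triangle is 0: no peeking ahead
```

After the SoftMax, the masked positions become exactly 0, so during training the decoder can't "cheat" by looking at words it hasn't generated yet.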

#StatQuest #Transformer #ChatGPT
Comments

Josh Starmer is the GOAT. Literally every morning I wake up with some StatQuest, and it really helps me get ready for my statistics classes for the day. Thank you Josh!

samglick

So happy I made it here. I've passed through all the complicated topics and am now just a few topics away from completion. It's all thanks to your dedication to teaching. Thank you!

navneettiwari

Very educational, and also innovative in the way it's done. I have never seen such teaching elsewhere. You are the BEST!

NJCLM

You weren't kidding, it's here! You're a man of your word and a man of the people.

jpfdjsldfji

As an electronics hobbyist/student from way back in the 70s I like to keep up as best I can with technology. I'm really glad I don't have to remember all the details in this series. There are so many layers upon layers that at times I do "just keep going to the end" of the videos. Nevertheless I still manage to learn key aspects and new terms from your excellent teaching abilities. There must be an incredible amount of work involved in creating these lessons.
I will purchase your book because you deserve some form of appreciation and it'll serve as a great reference resource. Much respect Josh and thanks, Kieron.

colekeircom

Josh! Thanks for this video. It has been much easier for me to follow the matrix representation of the computation than the arrows used previously. I really appreciate your explanation using matrices!

BaronSpartan

Josh Starmer is the GOAT, thank you, dear Josh.

jamesmina

DUDE JOSH, FINALLY! I have been waiting for this episode for a year or more. I’m so proud of you bro. You got there!

mraarone

This is really good. The simple example you used was very effective for demonstrating the inner workings of the transformer.

TheCJD

Amazing, thank you Josh. You deserve millions more subscribers

MakeDataUseful

Thanks for introducing the concepts behind transformers.

liuwingki

StatQuest is the best thing I ever found on the internet.

Aa-fkjg

Always been a huge fan of the channel, and at this point in my life this video really couldn't have come at a better time. Thanks for helping us viewers with some of the best content on the planet (I said what I said)!

roro

Your videos are a didactic stroke of genius! 👍

NewsLetter-sqeh

I will recommend this video to my friends who want to study transformers ❤❤

sachinmohanty

Please add this video to your Neural Network playlist. I recently started watching that playlist.

adityabhosale

Wow, 'Squatch! Long time no see my friend! Good to see you.

Your videos are so much fun that it doesn't feel like we're actually in class. Thank you Josh.

itsawonderfullife

Amazing video! Can't wait for the next one. By the way, I think there's a small typo at 5:15 where the first query weight in the matrix notation should be 2.22 instead of 0.22

Erkthbs

Thanks a lot, keep going please please

mortezamahdavi

Thanks for the great content! One minor thing: at 5:24, the first element of the Query weight matrix should be 2.22, not 0.22.

Keshi-lzef