Illustrated Guide to Transformers Neural Network: A step by step explanation

Transformers are all the rage nowadays, but how do they work? This video demystifies the novel neural network architecture with a step-by-step explanation and illustrations of how transformers work.

CORRECTIONS:
The sine and cosine functions are actually applied to the embedding dimensions and time steps!
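The correction above can be sketched in a few lines of NumPy (a minimal illustration, not code from the video): the sine and cosine functions vary across both the time step (position) and the embedding dimension, following the formulas from the original paper.

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encoding, per "Attention Is All You Need".

    Sine is applied to even embedding dimensions and cosine to odd ones;
    the angle depends on both the position and the embedding dimension.
    """
    positions = np.arange(seq_len)[:, None]   # shape (seq_len, 1)
    dims = np.arange(d_model)[None, :]        # shape (1, d_model)
    # Each pair of dimensions (2i, 2i+1) shares one frequency: 1 / 10000^(2i/d_model)
    angle_rates = 1.0 / np.power(10000.0, (2 * (dims // 2)) / d_model)
    angles = positions * angle_rates          # shape (seq_len, d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])     # even dimensions get sine
    pe[:, 1::2] = np.cos(angles[:, 1::2])     # odd dimensions get cosine
    return pe

pe = positional_encoding(seq_len=50, d_model=512)
print(pe.shape)  # (50, 512)
```

At position 0, the even dimensions are sin(0) = 0 and the odd dimensions are cos(0) = 1, which is a quick sanity check on the implementation.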

Hugging Face Write with Transformers
Comments

This is great, but I would've loved it if you had taken a sample sentence as an input and shown us how it transforms as it moves through the different parts of the transformer.
Perhaps an idea for the next video!

Leon-pnrb

I must say you've given the best explanation of transformers, which has saved me lots of time studying the original paper. Please produce more videos like this; I would recommend the BERT family and the GPT family as well 👏👍

MinhNguyen-rolm

Wow, this was great. I have watched a number of videos on transformer models, and they have all contributed to my understanding, but this puts everything together so neatly. Amazing, please keep making more videos like this.

architkhare

I have been struggling with this architecture for an eternity now and this is the first time I really understood what's going on in this graphic. Thank you so much for this nice and clear explanation!

abail

12:56 The encoder's hidden states act as key-value pairs, and in the decoder, the previous output is compressed into a query. The next output is produced by mapping this query against the set of keys and values.

sank_y

There is actually a small mistake at 12:56: the encoder's outputs are the VALUES and KEYS for the decoder's second attention layer.
So it is: value and key from the encoder are combined with the query from the decoder.

from the "Attention Is All You Need" paper: "In "encoder-decoder attention" layers, the queries come from the previous decoder layer,
and the memory keys and values come from the output of the encoder."

RandomLogic
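The correction quoted above can be made concrete with a minimal NumPy sketch of encoder-decoder ("cross") attention: queries come from the decoder, while keys and values come from the encoder outputs. The weight matrices and shapes here are illustrative assumptions, not code from the video.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(decoder_states, encoder_outputs, Wq, Wk, Wv):
    # Queries from the decoder; keys and values from the encoder outputs
    Q = decoder_states @ Wq
    K = encoder_outputs @ Wk
    V = encoder_outputs @ Wv
    # Scaled dot-product attention: each decoder position attends
    # over all encoder positions
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    return softmax(scores) @ V

rng = np.random.default_rng(0)
d = 8
dec = rng.normal(size=(3, d))   # 3 decoder positions (hypothetical)
enc = rng.normal(size=(5, d))   # 5 encoder positions (hypothetical)
out = cross_attention(dec, enc, *(rng.normal(size=(d, d)) for _ in range(3)))
print(out.shape)  # (3, 8)
```

Note that the output has one row per decoder position, but each row is a weighted mix of the encoder's value vectors.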

I used multiple sources to learn about the transformer architecture. Regarding the decoder part, you really helped me understand what the input was and how the different operations are performed! Thanks a lot :)

valentinfontanger

This video marks an end to my search for a one-stop explanation of Transformers. Thanks a lot for putting this up! :)

jenishah

Man, thanks for this video. Reading a paper as a newbie is super difficult, but explanations like the ones you've given for key, value, and query, as well as the reasoning for masking, are very, very helpful. I subscribed to your channel and am looking forward to new stuff.

mrowkenesser
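Since the comment above mentions the reasoning for masking: the decoder's "look-ahead" mask can be sketched as below (a hedged illustration, not the video's code). Adding this matrix to the attention scores before the softmax drives the weights for future positions to zero, so each position can only attend to itself and earlier positions.

```python
import numpy as np

def causal_mask(seq_len):
    # Strictly upper-triangular matrix of -inf: entry (i, j) is -inf when
    # j > i, so after the softmax, position i gets zero weight on any
    # future position j. The diagonal and lower triangle stay 0.
    return np.triu(np.full((seq_len, seq_len), -np.inf), k=1)

mask = causal_mask(4)
print(mask)
```

In practice the mask is added to the raw attention scores, `softmax(scores + mask)`, rather than applied to the weights directly.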

This tutorial is absolutely brilliant. I have to watch it again and read the illustrated guide; there is so much info!! Thank you!!!

Dexter

Brilliant explanation with visually intuitive animations! I rarely comment on or subscribe to anything, but this time I instantly did both after watching the video. And how coincidental that this was uploaded on my birthday. Hope to see more videos from you.

lone

Correction: The sine and cosine functions for the positional embedding are applied to the input embedding dimension, not the time steps! oof!

theaihacker

This is by far the best explanation I’ve ever seen on Transformer Networks. Very very well done

yishaibasserabie

Thanks for your explanation, very clear and well-constructed in every argument about transformers. I was so lucky to come across this video randomly on YouTube. Good job!

mariosessa

This is literally the best explanation of Transformers I have ever seen!

manikantansrinivasan

Nice work! Love the visuals for this abstract topic. Just found your channel. Keep em coming!!

CodeEmporium

That's a very good explanation imo! Thanks for taking the time to produce such a gem.

flwi

Half of it went through my head. Just beautiful. I'll watch it many more times.. That's how I know the content is gooood.

anshuljain

This illustrated explanation is just so well done.
I'm a novice at deep neural networks, and just by watching the video, I understood everything!
Completely recommended for understanding Transformers :)

Good work :D

elorine

My first read of Michael Phi's was "Stop Installing Tensorflow using pip for performance sake!" on the Towards Data Science blog (as I recall, you were "Michael Nguyen" at that time). My first impression was "oh, this guy is good at explaining things". Then I read several of his blogs, and now here I am. I never knew that you had a channel. You are one of the best educators I've ever known. Thanks so much.

-long-