Coding a Transformer from scratch on PyTorch, with full explanation, training and inference.

In this video I teach how to code a Transformer model from scratch using PyTorch. I highly recommend watching my previous video to understand the underlying concepts, but I will also review them in this video while coding. All of the code is mine, except for the attention visualization function that plots the chart, which I found online on the Harvard University website.

It also includes a Colab Notebook so you can train the model directly on Colab.

Chapters
00:00:00 - Introduction
00:01:20 - Input Embeddings
00:04:56 - Positional Encodings
00:13:30 - Layer Normalization
00:18:12 - Feed Forward
00:21:43 - Multi-Head Attention
00:42:41 - Residual Connection
00:44:50 - Encoder
00:51:52 - Decoder
00:59:20 - Linear Layer
01:01:25 - Transformer
01:17:00 - Task overview
01:18:42 - Tokenizer
01:31:35 - Dataset
01:55:25 - Training loop
02:20:05 - Validation loop
02:41:30 - Attention visualization
Comments

Personally, I find that seeing someone actually code something from scratch is the best way to get a basic understanding.

comedyman

Of course nobody reinvents the wheel, so I studied many resources about the Transformer to learn how to code it. All of the code is written by me from scratch, except for the code to visualize the attention, which I took from the Harvard NLP group's article about the Transformer.

I highly recommend that all of you do the same: watch my video and try to code your own version of the Transformer... that's the best way to learn it.
Another suggestion is to download my git repo and run it on your computer, debugging the training and inference line by line while trying to guess the tensor size at each step. This will make sure you understand all the operations. Plus, if some operation is not clear to you, you can watch the variables in real time to understand the shapes involved.
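
As a rough aid for the shape-guessing exercise, here is a minimal sketch (not taken from the video's repository) that lists the tensor shape you would expect after each step of multi-head self-attention. The function name `attention_shapes`, the step labels, and the defaults `d_model=512` and `h=8` are assumptions for illustration; the actual code in the repo may name or order things differently.

```python
def attention_shapes(batch, seq_len, d_model=512, h=8):
    """Return (step, expected shape) pairs for multi-head self-attention.

    Hypothetical helper: the step names are illustrative, not the repo's.
    """
    if d_model % h != 0:
        raise ValueError("d_model must be divisible by h")
    d_k = d_model // h  # size of each attention head
    return [
        ("input x", (batch, seq_len, d_model)),
        ("q, k, v after linear projection", (batch, seq_len, d_model)),
        ("split into heads (view)", (batch, seq_len, h, d_k)),
        ("move heads forward (transpose)", (batch, h, seq_len, d_k)),
        ("scores = q @ k^T / sqrt(d_k)", (batch, h, seq_len, seq_len)),
        ("softmax(scores) @ v", (batch, h, seq_len, d_k)),
        ("merge heads (transpose + view)", (batch, seq_len, d_model)),
        ("output projection", (batch, seq_len, d_model)),
    ]


# Print a cheat sheet to compare against the debugger's variable view.
for step, shape in attention_shapes(batch=2, seq_len=10):
    print(f"{step:35s} {shape}")
```

While stepping through the model in a debugger, compare each tensor's `.shape` with the corresponding row; a mismatch usually points to the operation worth re-reading.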

Have a wonderful day!

umarjamilai

I have browsed YouTube for the perfect set of videos on the Transformer, and your videos (the explanation of the Transformer architecture and this one) are by far the best! Take a bow, brother; you have contributed to viewers more than you can imagine. I really appreciate this!

ArslanmZahid

Greetings from China! I am a PhD student focused on AI. Your video really helped me a lot. Thank you so much, and I hope you enjoy your life in China.

yangrichard

Thank you, Umar, for your extraordinarily excellent work! The best Transformer tutorial I have ever seen!

aiden

One of the best tutorials for understanding and implementing the Transformer model. Thank you for making such a wonderful video.

faiyazahmad

This video is incredible, never understood it like this before. I will watch your next videos for sure, thank you so much!

maxmustermann

Thanks a lot for such a detailed video. Your videos on the Transformer are the best.

shresthsomya

Keep doing what you are doing. I really appreciate you taking so much time to spread such knowledge for free. I have been studying transformers for a long time, but never have I understood them so well. The theoretical explanation in the other video combined with this practical implementation is just splendid. I will be going through your other tutorials as well. I know how time-consuming it is to produce such high-level content, and all I can really say is that I am grateful for what you are doing and hope that you continue doing it. Wish you a great day!

abdullahahsan

The best video I have ever seen on the whole of YouTube on the Transformer model. Thank you so much, sir!

raviparihar

Thanks for making it so easy to understand. I definitely learned a lot and gained much more confidence from this!

shakewingo

Thank God, it's not one of those 'ML in 5 lines of Python code' or 'learn AI in 5 minutes' videos. Thank you. I cannot imagine how much time you must have spent on making this tutorial. Thank you so much. I have watched it three times already and wrote the code while watching the second time (with a lot of typos :D).

MuhammadArshad

Dear Umar, your video is full of knowledge; thanks for sharing.

abdulkarimasif

Hey there! I enjoyed watching that video, you did a wonderful job explaining everything, and I found it super easy to follow along. Overall, it was a really great experience!

lyte

Dear Umar - thank you so much for this amazing and very clear explanation. It has deeply helped me and many others in understanding the theoretical and practical implementation of transformers! Take a bow!

SaiManojPrakhya-mpoe

WOW WOW WOW. Though it was a bit tough for me, I was able to understand around 80% of the code. Beautiful. Thank you so much!

manishsharma

This is such great work. I don't really know how to thank you, but this is an amazing explanation of an advanced topic like the Transformer.

VishnuVardhan-sxbq

Thanks for your detailed tutorial. Learned a lot!

goldentime

Hi Umar. I am a first year student at MIT who wants to do AI startups. Your explanation and comments during coding were really helpful. After spending about 10 hours on the video, I walk away with great learnings and great inspiration. Thank you so much, you are an amazing teacher!

physicswithbilalasmatullah

A really great explanation for understanding the Transformer; many thanks to you.

dbnbwzz