Vision Transformer for Image Classification

Vision Transformer (ViT) set a new state of the art for image classification when it was introduced. ViT was posted on arXiv in October 2020 and published at ICLR in 2021. On the public benchmark datasets, ViT beats the best ResNet by a small margin, provided that ViT has been pretrained on a sufficiently large dataset. The bigger the pretraining dataset, the greater ViT's advantage over ResNet.

Reference:
- Dosovitskiy et al. An image is worth 16×16 words: transformers for image recognition at scale. In ICLR, 2021.
Comments

Great explanation with detailed notation. Most of the videos I found on YouTube were some kind of oral explanation, but this kind of symbolic notation is very helpful for grasping the real picture, especially if anyone wants to re-implement it or add a new idea to it. Thank you so much. Please continue helping us by making these kinds of videos.

UzzalPodder

Can't stress enough how easy to understand you made it.

mmpattnaik

These are some of the best, hands-on and simple explanations I've seen in a while on a new CS method. Straight to the point with no superfluous details, and at a pace that let me consider and visualize each step in my mind without having to constantly pause or rewind the video. Thanks a lot for your amazing work! :)

drakehinst

Clear, concise, and overall easy to understand for a newbie like me. Thanks!

adityapillai

Great explanation! Good for you! Don't stop giving ML guides!

ai_lite

The best video so far. The animation is easy to follow and the explanation is very straightforward.

drelvenkee

The best ViT explanation available. Understanding this is also key to understanding DINO and DINOv2.

thecheekychinaman

Man, you made my day! These lectures were golden. I hope you continue to make more of these

sheikhshafayat

Amazing! I am in a rush to implement a vision transformer for an assignment, and this saved me so much time!

valentinfontanger

Amazing video. It helped me really understand vision transformers. Thanks a lot.

aimeroundiaye

15 minutes of heaven 🌿. Thanks a lot, understood clearly!

thepresistence

Very good explanation, better than many other videos on YouTube, thank you!

vladik

This reminds me of Encarta encyclopedia clips when I was a kid lol! Good job mate!

swishgtv

Thank you for your Attention Models playlist. Well explained.

arash_mehrabi

This was a great video. Thanks for your time producing great content.

MonaJalal

You have explained ViT in simple words. Thanks

rajgothi

Thank you, your video is way underrated. Keep it up!

DerekChiach

Thank you so much for this amazing presentation. Your explanation is very clear, and I have learnt so much. I will definitely watch your Attention Models playlist.

sehaba

Good video, what a splendid presentation. Wang Shusen is the GOAT.

wengxiaoxiong

Nicely explained. Appreciate your efforts.

nehalkalita