An image is worth 16x16 words: ViT | Vision Transformer explained
Mom, it's the Transformers again! They have come to ruin my CNN building blocks! 🥺 An Image is Worth 16x16 Words: paper explained. Is this the extinction of CNNs? Long live the Transformer?
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
🔥 Optionally, buy us a coffee to boost our Coffee Bean production! ☕
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
Outline:
* 00:00 Pure Transformer for vision
* 01:17 How does it work?
* 03:58 The CNN Armageddon?
📄 Paper (not anonymous anymore): "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale" by Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby
-----------------------------------
🔗 Links:
#AICoffeeBreak #MsCoffeeBean #ComputerVision #ICLR2021 #MachineLearning #AI #research
Video contains emojis designed by OpenMoji – the open-source emoji and icon project. License: CC BY-SA 4.0