Introduction to Vision Transformers | Original ViT Paper Explained

In this video we go back to the influential original paper from Google that introduced Vision Transformers (ViT). Before Vision Transformers, CNNs dominated computer vision. After transformers were introduced in the "Attention Is All You Need" paper, there were various attempts to apply them to computer vision. We explain the challenge in doing so and how the ViT architecture deals with it.
We also review how Vision Transformers reduce inductive bias compared to convolutional neural networks.

-----------------------------------------------------------------------------------------------

👍 Please like & subscribe if you enjoy this content

-----------------------------------------------------------------------------------------------

Chapters:
0:00 Introduction
0:55 Using Transformers as-is?
2:13 How Does ViT Work?
3:30 Inductive Bias