Image Classification using Vision Transformer (ViT) in TensorFlow

In this video, we will build a flower image classifier using a Vision Transformer (ViT) implemented from scratch in the TensorFlow framework using the Keras API.

The Vision Transformer (ViT) is a transformer-based architecture used in the field of computer vision; it is directly inspired by the use of Transformers in NLP tasks.
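To make the idea concrete before watching, here is a minimal eager-mode sketch of the core ViT pipeline in Keras: split an image into patches, embed them, and run one transformer encoder block. All sizes and names here are illustrative assumptions, not the video's exact code (the full model also adds a [CLS] token and stacks several encoder blocks).

```python
import tensorflow as tf
from tensorflow.keras import layers

PATCH, DIM, HEADS, CLASSES = 8, 64, 4, 5
imgs = tf.random.uniform((2, 32, 32, 3))          # a dummy batch of 2 images

# 1) Split into non-overlapping 8x8 patches -> (2, 16, 192)
p = tf.image.extract_patches(imgs, [1, PATCH, PATCH, 1],
                             [1, PATCH, PATCH, 1], [1, 1, 1, 1], "VALID")
p = tf.reshape(p, (2, -1, PATCH * PATCH * 3))

# 2) Linear patch embedding plus a learned positional embedding
x = layers.Dense(DIM)(p)
x = x + layers.Embedding(16, DIM)(tf.range(16))

# 3) One transformer encoder block (a real ViT stacks several)
attn = layers.MultiHeadAttention(num_heads=HEADS, key_dim=DIM)(x, x)
x = layers.LayerNormalization()(x + attn)
x = layers.LayerNormalization()(x + layers.Dense(DIM, activation="gelu")(x))

# 4) Pool over patches and classify into the 5 flower classes
logits = layers.Dense(CLASSES)(tf.reduce_mean(x, axis=1))
print(logits.shape)  # (2, 5)
```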

Timeline:
0:00:00 - Introduction
0:00:31 - Dataset Explanation (Flower Images Dataset)
0:01:00 - Imports, Seeding, Dataset, and more
0:37:22 - Implementing the Vision Transformer (ViT)
1:04:30 - Training the Vision Transformer (ViT)
1:08:49 - Testing the Vision Transformer (ViT)
1:13:57 - Ending - SUBSCRIBE

Comments

Bro, please put all the videos in playlists so they are organised and the audience can binge-watch. Thanks for the amazing content, keep it going.

teetanrobotics

Very accurate and useful tutorial. Thank you very much

hamedshokripour

Thank you, sir. It was a very helpful video. :)

aditisengupta

Hi - this was very helpful, but can you please provide the code to run inference on a single image rather than evaluate a whole set? The approach I normally use with traditional classifiers does not work here. I don't know exactly what preprocessing to do before calling model.predict() on the image I wish to run inference on.

ahpacific
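For readers with the same question, a hedged sketch of single-image inference. It assumes the model was trained on images resized to 200x200 and scaled to [0, 1]; match whatever preprocessing your own training pipeline used. The file name, class names, and stand-in model below are placeholders, not the video's code.

```python
import numpy as np
import tensorflow as tf

def predict_one(model, path, size=(200, 200), class_names=None):
    # Load and preprocess exactly as during training (assumed: 200x200, [0, 1])
    raw = tf.io.read_file(path)
    img = tf.image.decode_image(raw, channels=3, expand_animations=False)
    img = tf.image.resize(img, size) / 255.0
    batch = tf.expand_dims(img, 0)                  # model expects (1, H, W, 3)
    probs = model.predict(batch, verbose=0)[0]
    idx = int(np.argmax(probs))
    label = class_names[idx] if class_names else idx
    return label, float(probs[idx])

# Demo with an untrained stand-in model and a blank placeholder image
model = tf.keras.Sequential([tf.keras.Input((200, 200, 3)),
                             tf.keras.layers.GlobalAveragePooling2D(),
                             tf.keras.layers.Dense(5, activation="softmax")])
tf.io.write_file("flower.png", tf.io.encode_png(tf.zeros((64, 64, 3), tf.uint8)))
names = ["daisy", "dandelion", "rose", "sunflower", "tulip"]
label, score = predict_one(model, "flower.png", class_names=names)
print(label, score)
```

The key point is that model.predict() expects a batch dimension, so a single image must be expanded to shape (1, H, W, 3) and preprocessed identically to the training data.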

Very good (ViT). Request: make a detailed video on GAN / conditional GAN and their implementation.

nehal

Can you please show how we can evaluate the model based on performance metrics such as precision, recall, F1 score, etc.?

shushankyadav
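A hedged sketch of one common way to do this with scikit-learn: collect the true labels and the model's predicted class ids, then print a per-class report. `y_true` and `probs` below are random stand-ins for your test labels and your model.predict() output.

```python
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

rng = np.random.default_rng(0)
y_true = rng.integers(0, 5, size=100)   # true class ids for the test set
probs = rng.random((100, 5))            # replace with model.predict(x_test)
y_pred = np.argmax(probs, axis=1)       # predicted class ids

# Per-class precision, recall, and F1, plus a confusion matrix
print(classification_report(y_true, y_pred, digits=3))
print(confusion_matrix(y_true, y_pred))
```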

How can I modify this code for multi-label image classification?

waqarahmed
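A hedged sketch of the usual changes for multi-label classification: the final Dense layer uses sigmoid instead of softmax, the loss becomes binary cross-entropy, labels become multi-hot vectors, and predictions are thresholded per label. The tiny stand-in head below replaces the video's classification head on top of the encoder features; sizes and data are illustrative.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

NUM_LABELS = 5
# Stand-in head; in the video this would sit on top of the ViT encoder output
model = tf.keras.Sequential([
    tf.keras.Input((64,)),                          # encoder features (assumed)
    layers.Dense(NUM_LABELS, activation="sigmoid")  # sigmoid, not softmax
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.BinaryAccuracy()])

# Labels are multi-hot, e.g. an image that is both "rose" and "tulip"
y = np.array([[0, 0, 1, 0, 1]], dtype="float32")
x = np.random.rand(1, 64).astype("float32")
model.fit(x, y, epochs=1, verbose=0)

# Threshold each label independently instead of taking an argmax
preds = (model.predict(x, verbose=0) > 0.5).astype(int)
print(preds.shape)  # (1, 5)
```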

Can you tell what the accuracy and other performance metrics are after the 500 epochs? Also, please show how to print the predicted classes.

kirtishyamsukha

Can you show us how to make predictions at run time on a specific image, displaying its output on screen as well, using the .h5 weights?

ashmalvayani

Thanks a lot for the detailed video on image classification using ViT. Is there any way to extract or visualize the attention maps as well in TensorFlow?

aryam

Thank you for such a detailed video.
Just one question: if the model is so large, where exactly will we use it? And how can it maintain inference time competitive with CNN-based models?
What is its practical use?
Thank you

rama_gpubhuyan

Can you explain the SAM model in TensorFlow?

sithibanu

I don't understand how the transformed input shape goes from (64, 1875) to (256, 1875). Please help me understand it.

ashishrana

Amazing series. I am struggling to replace the dataset with another one; could you explain how I could use a different dataset, please?

fayezalhussein

"x = MultiHeadAttention( num_heads=cf["num_heads"], key_dim=cf["hidden_dim"] )(x, x)" What does (x, x) mean here? Can anyone explain?

kenand
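For anyone else wondering: the Keras MultiHeadAttention call signature is (query, value, key=None), and when key is omitted it defaults to value. So mha(x, x) means the same sequence supplies the queries, keys, and values, i.e. self-attention over the patch embeddings. A minimal illustration with assumed sizes:

```python
import tensorflow as tf
from tensorflow.keras import layers

x = tf.random.uniform((2, 16, 64))   # (batch, num_patches, hidden_dim), assumed sizes
mha = layers.MultiHeadAttention(num_heads=4, key_dim=64)
out = mha(x, x)                      # query=x, value=x, and key defaults to value
print(out.shape)                     # (2, 16, 64): same shape as the input sequence
```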

Which GPU is being used here? And can't we do the same using Jupyter?

loveofmylifesoumyarashmi

Can I use this for document classification on files like PDFs, text files, and docx?

RahulDogra-sd

Hey! I am getting an error while running the test program... please do reply, I have to present the project tomorrow 😅

anitachoudhari

Can you share your Swin Transformer video as well, with visualization?

mariaachary

Can I use transformers in a regression problem?

ahmedchaoukichami