Vision Transformer for Image Classification Using Transfer Learning

Step-by-step implementation explained: Vision Transformer for image classification using transfer learning.

*******************************************************
*******************************************************

In 2020, the Google Brain team introduced the Vision Transformer (ViT), a Transformer-based model for image classification. Its performance is very competitive with conventional CNNs on several image classification benchmarks.

The Vision Transformer (ViT) is a transformer applied to computer vision: it works on the same principle as the transformers used in natural language processing, treating an image as a sequence of fixed-size patches in the way an NLP transformer treats a sentence as a sequence of tokens.
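
For readers following along, here is a minimal sketch of the transfer-learning setup, assuming torchvision's ViT-B/16 with ImageNet-1k weights; the class count and paths are placeholders, not necessarily the exact code from the video:

```python
# Minimal transfer-learning sketch (assumption: torchvision's ViT-B/16,
# pretrained on ImageNet-1k; the class count below is a placeholder).
import torch
import torchvision
from torch import nn

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load pretrained weights together with their matching preprocessing transforms.
weights = torchvision.models.ViT_B_16_Weights.IMAGENET1K_V1
model = torchvision.models.vit_b_16(weights=weights).to(device)
transforms = weights.transforms()

# Freeze the backbone so only the new classification head is trained.
for param in model.parameters():
    param.requires_grad = False

# Replace the classifier head for a hypothetical 3-class problem.
num_classes = 3
model.heads = nn.Linear(in_features=768, out_features=num_classes).to(device)
```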

#transformers #computervision
Comments

I've been searching for this tutorial for a long time, and I can't express how thankful I am, Aarohi! Your YouTube channel is an absolute gem, and it truly deserves far more subscribers. The way you effortlessly share your expertise is not only enlightening but also engaging. Keep up the exceptional work!

dr.noushathshaffi

Thanks Aarohi, it is brilliant. A great help for learning ViT.

shounakdas

Very informative tutorial, thank you. I have the following questions and doubts:
1) During training, how do I save only the best model after each epoch (e.g., based on the lowest validation loss) and load that best model after training for future use?
2) How do I generate the confusion matrix along with the F1 score, precision, and recall?
3) How do I identify which test samples are correctly predicted and which are not?
4) After the first 4-5 epochs, the gap between training and test loss (and between training and test accuracy) keeps increasing, so the model needs further fine-tuning; please suggest how to do that.

debjitdas
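
A sketch addressing questions 1 and 2 from the comment above, assuming the tutorial's `model`, `loss_fn`, `test_dataloader`, `class_names`, and `device` already exist; it keeps only the checkpoint with the lowest validation loss and computes the metrics with scikit-learn:

```python
# Sketch for (1) keeping only the best checkpoint and (2) computing metrics.
# Assumptions: model, loss_fn, test_dataloader, class_names, and device
# are the objects created earlier in the tutorial.
import torch
from sklearn.metrics import classification_report, confusion_matrix

num_epochs = 10            # placeholder
best_val_loss = float("inf")

for epoch in range(num_epochs):
    # model.train(); ...run one training epoch here as in the tutorial...

    # Evaluate on the test/validation set.
    model.eval()
    val_loss, preds, labels = 0.0, [], []
    with torch.inference_mode():
        for X, y in test_dataloader:
            X, y = X.to(device), y.to(device)
            logits = model(X)
            val_loss += loss_fn(logits, y).item()
            preds.extend(logits.argmax(dim=1).cpu().tolist())
            labels.extend(y.cpu().tolist())
    val_loss /= len(test_dataloader)

    # (1) Save only when the validation loss improves.
    if val_loss < best_val_loss:
        best_val_loss = val_loss
        torch.save(model.state_dict(), "best_vit.pth")

# Reload the best checkpoint for future use.
model.load_state_dict(torch.load("best_vit.pth"))

# (2) Confusion matrix, precision, recall, and F1 on the collected predictions.
print(confusion_matrix(labels, preds))
print(classification_report(labels, preds, target_names=class_names))
```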

Hello Ma’am
Your AI and Data Science content is consistently impressive! Thanks for making complex concepts so accessible. Keep up the great work! 🚀 #ArtificialIntelligence #DataScience #ImpressiveContent 👏👍

soravsingla

I am getting the error "ModuleNotFoundError: No module named 'going_modular'" even though the going_modular folder and the notebook are in the same folder. I am working in Colab. Please help, Ma'am.

JKaks-grzm
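
One way to debug this in Colab; the paths below are placeholders, and the repository named in the comments is only an assumption about where the tutorial's going_modular scripts originate:

```python
# Diagnosing "ModuleNotFoundError: No module named 'going_modular'" in Colab.
import os
import sys

print(os.getcwd())       # Colab usually starts in /content
print(os.listdir("."))   # confirm a going_modular/ folder is actually here

# If the folder lives somewhere else (e.g. on mounted Drive), put its parent
# directory on the import path. The path below is only an example.
sys.path.append("/content/drive/MyDrive/vit_tutorial")

# If the folder was never downloaded at all, one option (assumption: the
# helper scripts come from the mrdbourke/pytorch-deep-learning repo) is:
# !git clone https://github.com/mrdbourke/pytorch-deep-learning
# !mv pytorch-deep-learning/going_modular .

# Once the path is correct, re-run the tutorial's import
# (e.g. `from going_modular import data_setup, engine`) and it should resolve.
```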

Thanks so much, I was waiting for this video from you.

danielasefa

Thank you! Your video is very informative!

НиколайНовичков-еэ

Please make a landmark-detection video using the Vision Transformer. I urgently need it for a project I have to finish: the task is to detect 13 landmarks using a Vision Transformer, and I can't find any resources that teach landmark detection with a Vision Transformer. This channel is my only hope.

sanjoetv
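
Landmark detection is not covered in the video, but one rough sketch of the idea is to replace the classification head with a regression head that outputs 13 (x, y) pairs and train it with a coordinate-regression loss; everything below is an assumption rather than the channel's code:

```python
# Sketch: ViT backbone with a 13-landmark regression head (13 * 2 = 26 outputs).
# Assumption: landmark targets are normalized to [0, 1] relative to image size.
import torch
import torchvision
from torch import nn

weights = torchvision.models.ViT_B_16_Weights.IMAGENET1K_V1
model = torchvision.models.vit_b_16(weights=weights)

num_landmarks = 13
model.heads = nn.Sequential(
    nn.Linear(768, 256),
    nn.ReLU(),
    nn.Linear(256, num_landmarks * 2),  # one (x, y) pair per landmark
)

loss_fn = nn.SmoothL1Loss()  # a common choice for coordinate regression

# In the training loop:
# preds = model(images).view(-1, num_landmarks, 2)
# loss = loss_fn(preds, target_landmarks)
```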

Good day. Thank you for this wonderful demo. I have a few questions:

1. Are there any other existing Vision Transformer models that you know of?

2. How do I go about training a model on images paired with nutritional values stored in a certain column range of a separate Excel file, and then output the predicted values for a single image? Each image's filename is matched against its values in the Excel file.

Many thanks in advance for the assistance. :)

ambikajadoonanan
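
On question 1: torchvision alone ships several ViT variants (vit_b_16, vit_b_32, vit_l_16, vit_l_32, vit_h_14), and libraries such as timm provide many more. For question 2, a rough sketch in which the column names, filenames, and paths are hypothetical placeholders:

```python
# Sketch: pairing images with nutritional values read from an Excel file.
# Column names, paths, and the value range are hypothetical placeholders.
import os
import pandas as pd
import torch
from torch.utils.data import Dataset
from PIL import Image

class NutritionDataset(Dataset):
    def __init__(self, excel_path, image_dir, transform):
        df = pd.read_excel(excel_path)           # one row per image
        self.names = df["image_name"].tolist()   # matches the image filenames
        # e.g. the columns from "calories" to "fat" hold the values to regress
        self.targets = df.loc[:, "calories":"fat"].values.astype("float32")
        self.image_dir = image_dir
        self.transform = transform

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        img = Image.open(os.path.join(self.image_dir, self.names[idx])).convert("RGB")
        return self.transform(img), torch.tensor(self.targets[idx])

# The model head then outputs one value per nutritional column and is trained
# with nn.MSELoss(); at inference, model(transform(img).unsqueeze(0)) returns
# the predicted values for a single image.
```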

How do I run predictions on a very large dataset? Let's say you have 30,000 images; using a for loop will be computationally expensive, so what's the best way to run inference with a pretrained model on large datasets?

aakashyadav
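
The usual answer is batched inference with a DataLoader under torch.inference_mode(); a minimal sketch, assuming the `model`, `transforms`, and `device` from the tutorial and a placeholder folder path:

```python
# Sketch: batched inference over a large image folder instead of a per-image loop.
# Assumptions: model, transforms, and device are the objects from the tutorial.
import torch
from torch.utils.data import DataLoader
from torchvision import datasets

dataset = datasets.ImageFolder("path/to/30k_images", transform=transforms)
loader = DataLoader(dataset, batch_size=64, num_workers=2, shuffle=False)

model.eval()
all_preds = []
with torch.inference_mode():          # no gradients -> far less memory and compute
    for X, _ in loader:
        logits = model(X.to(device))
        all_preds.append(logits.argmax(dim=1).cpu())
all_preds = torch.cat(all_preds)      # one predicted class index per image
```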

Madam, I have one doubt. Here we take a pretrained model and train it again on our dataset. So my doubts are: where do we get the pretrained model from, and on which dataset was it pretrained? Also, after retraining the model on our dataset, all the weights will change, right?

anishmgeorge
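
For context: if the video uses torchvision's pretrained ViT (as it appears to), those weights come from ImageNet-1k pretraining, and whether all the weights change afterwards depends on what you freeze. A small sketch of the distinction, with a placeholder class count:

```python
# Sketch: where the pretrained weights come from and what actually changes.
import torchvision
from torch import nn

weights = torchvision.models.ViT_B_16_Weights.IMAGENET1K_V1  # ImageNet-1k pretraining
model = torchvision.models.vit_b_16(weights=weights)

# Freeze the backbone: these weights stay exactly as pretrained.
for p in model.parameters():
    p.requires_grad = False

# Only this new head is updated while training on your dataset; if you skip the
# freezing loop above, every weight in the model gets fine-tuned instead.
model.heads = nn.Linear(768, 5)   # 5 = hypothetical number of classes
```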

Could you show how to calculate the confusion matrix and other metrics, please?

FERNANDOVALLE-iggl

Awesome upload. How do I save the model or weights so that I can load them and perform inference later?

sathishkumars
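
A minimal sketch of the standard PyTorch save/load pattern; the filename is a placeholder, and `model` and `num_classes` are assumed to be the trained model and class count from the tutorial:

```python
# Sketch: saving the trained weights and reloading them later for inference.
import torch
import torchvision
from torch import nn

# After training: persist only the state_dict (recommended over pickling the model).
torch.save(model.state_dict(), "vit_classifier.pth")

# Later, or in another script: rebuild the same architecture, then load the weights.
loaded = torchvision.models.vit_b_16()
loaded.heads = nn.Linear(768, num_classes)          # must match the trained head
loaded.load_state_dict(torch.load("vit_classifier.pth", map_location="cpu"))
loaded.eval()                                       # switch to inference mode
```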

I am getting the error "ModuleNotFoundError: No module named 'going_modular'" when trying to run it on Google Colab. How do I fix it in Google Colab? Please reply.

swatimishra

I combined your code with my training code and added a learning-rate scheduler and GPU memory garbage collection. The results and training speed became so much better, without having to worry about running out of GPU memory.

hulkbaiyo
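
For anyone who wants to reproduce that setup, a rough sketch; the StepLR schedule, the learning rate, and where the cleanup calls sit are assumptions, not the commenter's exact code:

```python
# Sketch: learning-rate scheduler plus GPU-memory housekeeping between epochs.
# Assumptions: model and num_epochs exist as in the tutorial.
import gc
import torch

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.1)

for epoch in range(num_epochs):
    # ...training and validation for one epoch...
    scheduler.step()            # decay the learning rate on a fixed schedule
    gc.collect()                # release Python-side references
    torch.cuda.empty_cache()    # return cached GPU memory to the allocator
```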

Thank you so much, ma'am, for this amazing video.

Sunil-ezhx

Hi again. When I print the summary of the Vision Transformer, the input shapes for each layer start with 32. I understand that the very first input [32, 3, 224, 224] means we originally have an image of size 224x224 with 3 colour channels. What does the 32 mean? Is that the batch size, and if so, do I have to change that value if I change my batch size for training?

MaryBrockyn
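
Yes, the leading 32 is just the batch dimension passed to the summary call and is independent of the model itself; a small sketch, assuming torchinfo is the summary tool used in the video:

```python
# Sketch: the 32 in [32, 3, 224, 224] is only the batch size given to summary().
from torchinfo import summary

summary(model, input_size=(32, 3, 224, 224))   # a batch of 32 images, each 3x224x224
# The model accepts any batch size; changing batch_size in the DataLoader does
# not require changing the model, only (optionally) this summary call.
```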

Code with Aarohi is the best YouTube channel for Artificial Intelligence.
#BestChannel #YouTubeChannel #ArtificialIntelligence #CodeWithAarohi #DataScience #Engineering #MachineLearning #DataAnalysis #BestLearning #LearnDataScience #DataScienceCourse #ArtificialIntelligenceCourse

soravsingla

Thank you, very good explanation. Which pretrained model are you using here? Is it the same as a CNN pretrained model, or are you using only the weights of the pretrained model? Which pretrained model is this?

nandiniloku

Ma'am, how do I save and then load the model? After saving and loading the model, I am not able to get the same predictions. Are there any resources I can refer to in order to learn about this?

Vibhu-tsdh
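
Two common reasons for predictions changing after a reload are leaving the model in training mode and preprocessing the inference images differently from training. A short sketch of the checks; the filename and the `batch` tensor are placeholders, and `model` and `device` are assumed to exist as in the tutorial:

```python
# Sketch: getting identical predictions after saving and reloading the model.
import torch

torch.save(model.state_dict(), "vit_weights.pth")       # after training

model.load_state_dict(torch.load("vit_weights.pth"))    # after rebuilding the model
model.eval()                                            # disable training-mode behaviour
with torch.inference_mode():
    preds = model(batch.to(device)).argmax(dim=1)       # use the same transforms as training
```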