PyTorch Image Captioning Tutorial


❤️ Support the channel ❤️

Paid Courses I recommend for learning (affiliate links, no extra cost for you):

✨ Free Resources that are great:

💻 My Deep Learning Setup and Recording Setup:

GitHub Repository:

✅ One-Time Donations:

▶️ You Can Connect with me on:

OUTLINE:
0:00 - Introduction
0:12 - Explanation of Image Captioning
5:15 - Overview of the code
6:07 - Implementation of CNN and RNN
20:03 - Setting up the training
30:36 - Fixing errors
32:18 - Small evaluation and ending
Comments

How is it that you are so good at explaining?
Keep up the good work, champ.

ashkankhademian

You are such a great engineer!
I found this video so useful!
Thanks!

백이음

Thank you very much for your videos. Please continue your work; many people need your videos.

nunenuh

Since you feed the feature vector at timestep 0 during training, at inference time we also only feed the feature vector at timestep 0, so we do not have to provide the start token in the test phase.

HARIS-qn
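
If the image feature is fed as the very first decoder input, it effectively plays the role of the start token at inference time as well. A minimal sketch of such a greedy decoding loop, assuming a hypothetical encoder/decoder pair where the decoder exposes `embed`, `lstm` and `linear` layers and the vocabulary has an `itos` index-to-word mapping (all names are illustrative, not the tutorial's exact code):

```python
import torch

def caption_image_greedy(encoder, decoder, image, vocabulary, max_length=50):
    """Greedy decoding: the image feature is the only input at step 0;
    afterwards each predicted word is embedded and fed back in."""
    result = []
    with torch.no_grad():
        x = encoder(image).unsqueeze(0)            # (1, 1, embed_size): feature acts as "step 0"
        states = None                              # LSTM hidden/cell state carried across steps
        for _ in range(max_length):
            hiddens, states = decoder.lstm(x, states)
            logits = decoder.linear(hiddens.squeeze(0))   # (1, vocab_size)
            predicted = logits.argmax(dim=1)              # most probable next token
            word = vocabulary.itos[predicted.item()]
            result.append(word)
            if word == "<EOS>":                    # stop once the end token is produced
                break
            x = decoder.embed(predicted).unsqueeze(0)     # feed the prediction back as next input
    return result
```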

That was a very Aladdin tutorial, thank you!

oskarjung

Awesome complete tutorial, thank you.

Bobobhehe

3:37 - feeding predicted words back as input; the connection differs between inference and training.

vincentchong

Looking forward to new videos. Awesome!

garikhakobyan

OK, now I understand it. Excellent PyTorch tutorial.

junhuajlake

What are the benefits of using an LSTM instead of a Transformer in this specific image-to-text task?

zehrayavuz

Awesome tutorial, I followed it to the end. I have a question: where do we split the training and test sets, and how, given that there are both image data and caption data? Can you help me with that?

NutSorting
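
One way to approach the question above: because each image typically comes with several captions, a common choice is to split by image id so that all captions of an image land on the same side and there is no leakage between train and test. The sketch below assumes a dataset that yields one (image, caption) pair per index and that you can supply the image id for every caption row; all names are illustrative, not the tutorial's exact code.

```python
import random
from collections import defaultdict
from torch.utils.data import Subset

def split_by_image(dataset, image_ids_per_row, test_fraction=0.2, seed=42):
    """Split a captioning dataset so that all captions of an image
    end up in the same subset (train or test)."""
    rows_by_image = defaultdict(list)
    for row_idx, img_id in enumerate(image_ids_per_row):
        rows_by_image[img_id].append(row_idx)

    image_ids = list(rows_by_image)
    random.Random(seed).shuffle(image_ids)          # reproducible shuffle of image ids
    n_test = int(test_fraction * len(image_ids))
    test_images, train_images = image_ids[:n_test], image_ids[n_test:]

    train_rows = [r for img in train_images for r in rows_by_image[img]]
    test_rows = [r for img in test_images for r in rows_by_image[img]]
    return Subset(dataset, train_rows), Subset(dataset, test_rows)

# usage (the attribute holding the per-row image ids depends on your dataset class):
# train_set, test_set = split_by_image(dataset, dataset.imgs)
```

The resulting `Subset` objects can then be wrapped in `DataLoader`s with the same collate function as the full dataset.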

Hi Aladdin, thanks for the awesome tutorials.
Could you please elaborate on this statement at 27:51:
outputs = model(imgs, captions[:-1])
Why are we ignoring the last row? The last row would mostly contain padding tokens and very few EOS indices. Could you please explain how ignoring the last row works in this context?
Thanks

rahulseetharaman
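
Regarding `captions[:-1]`: in the setup described in the video, the image feature is prepended as the first decoder input, which already shifts the inputs one step relative to the targets, so the input sequence has to be one step shorter than the target sequence. That is exactly what dropping the last row (mostly <PAD>/<EOS>, never useful as an input) achieves. A small runnable sketch of how the time axes line up; dimensions and layer names are illustrative, not the tutorial's exact code:

```python
import torch
import torch.nn as nn

# Toy dimensions, only to show how the time axes line up.
seq_len, batch, embed_size, hidden_size, vocab_size = 5, 2, 8, 16, 20

captions = torch.randint(0, vocab_size, (seq_len, batch))    # rows: <SOS>, w1, w2, w3, <EOS>/<PAD>
embed = nn.Embedding(vocab_size, embed_size)
lstm = nn.LSTM(embed_size, hidden_size)                      # sequence-first layout
linear = nn.Linear(hidden_size, vocab_size)

img_feat = torch.randn(1, batch, embed_size)                 # image feature used as time step 0
inputs = torch.cat([img_feat, embed(captions[:-1])], dim=0)  # (seq_len, batch, embed_size)

outputs = linear(lstm(inputs)[0])                            # (seq_len, batch, vocab_size)
print(outputs.shape[0], captions.shape[0])                   # 5 and 5: same time dimension

# Prepending the image feature shifts the inputs one step to the right, so
# outputs[t] is a prediction for captions[t]; dropping the last row of
# `captions` simply keeps both sequences the same length.
```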

Amazing tutorial!
Can we do it using a Transformer instead of an LSTM?

rohinim
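
On the Transformer question: in principle yes; the CNN features serve as the encoder memory and a causally masked decoder generates the caption autoregressively. A rough, illustrative sketch (not the tutorial's model) using `nn.TransformerDecoder`, with all hyperparameters and names chosen only for demonstration:

```python
import torch
import torch.nn as nn

class TransformerCaptioner(nn.Module):
    """Illustrative captioner: image features as memory, masked decoder over tokens."""
    def __init__(self, feature_dim, vocab_size, d_model=256, nhead=8, num_layers=3):
        super().__init__()
        self.feature_proj = nn.Linear(feature_dim, d_model)   # project CNN features to d_model
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers)
        self.fc = nn.Linear(d_model, vocab_size)

    def forward(self, features, captions):
        # features: (batch, num_regions, feature_dim), e.g. a flattened conv grid
        # captions: (batch, seq_len) token indices (teacher forcing)
        memory = self.feature_proj(features)
        tgt = self.embed(captions)
        seq_len = captions.size(1)
        # additive causal mask: -inf above the diagonal blocks attention to future tokens
        causal_mask = torch.triu(
            torch.full((seq_len, seq_len), float("-inf"), device=captions.device),
            diagonal=1)
        out = self.decoder(tgt, memory, tgt_mask=causal_mask)
        return self.fc(out)                                    # (batch, seq_len, vocab_size)

# quick shape check with dummy data
model = TransformerCaptioner(feature_dim=2048, vocab_size=1000)
feats = torch.randn(4, 49, 2048)          # e.g. a 7x7 conv feature map flattened to 49 regions
caps = torch.randint(0, 1000, (4, 12))
print(model(feats, caps).shape)           # torch.Size([4, 12, 1000])
```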

Thanks a lot! One important question:
In the training loop, the loss is calculated from the scores and the captions, which are the target.
There is no shifting of the target captions to the right. Without doing so, how does the model still learn to predict the next word? Is there an internal PyTorch method that does this implicitly? I tried to look into it, and I don't understand how the loss can be calculated this way such that the model learns to predict the next word.

MatanFainzilber
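
On the shifting question: the right-shift is implicit, because the image feature occupies the first input position (see the alignment sketch above), so the unshifted `captions` tensor can be used directly as the target. A hedged sketch of how the loss is then typically computed; `pad_idx` and the tensor shapes are placeholders, not the tutorial's exact values:

```python
import torch
import torch.nn as nn

vocab_size, pad_idx = 20, 0
criterion = nn.CrossEntropyLoss(ignore_index=pad_idx)   # padding positions contribute nothing

# outputs: (seq_len, batch, vocab_size) from the decoder; captions: (seq_len, batch)
outputs = torch.randn(5, 2, vocab_size)
captions = torch.randint(0, vocab_size, (5, 2))

# CrossEntropyLoss expects (N, C) logits and (N,) class indices, so both tensors
# are flattened over the time and batch dimensions; position t of the outputs is
# scored against position t of the captions, which is the "next word" thanks to
# the one-step offset introduced by the prepended image feature.
loss = criterion(outputs.reshape(-1, vocab_size), captions.reshape(-1))
```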

Hi, I want to know how you practiced PyTorch in your learning journey and became so comfortable writing it. So far I can only write some simple structures, not code like this; I need your help with this.

amaulearyan

Great tutorial! But how do you save the model?

verakorzhova
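
On saving: a common PyTorch pattern is to store the model's (and optionally the optimizer's) `state_dict` in a checkpoint file. The file name, dictionary keys, and the tiny stand-in model below are just examples so the snippet runs on its own; in practice you would use the captioning model and optimizer from the training loop.

```python
import torch
import torch.nn as nn

# stand-ins for the real model/optimizer from the training loop
model = nn.Linear(10, 10)
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)
step = 0

# saving: store learnable parameters plus whatever is needed to resume training
checkpoint = {
    "model_state": model.state_dict(),
    "optimizer_state": optimizer.state_dict(),
    "step": step,
}
torch.save(checkpoint, "my_checkpoint.pth.tar")

# loading: rebuild the model/optimizer first, then restore their states
checkpoint = torch.load("my_checkpoint.pth.tar", map_location="cpu")
model.load_state_dict(checkpoint["model_state"])
optimizer.load_state_dict(checkpoint["optimizer_state"])
step = checkpoint["step"]
```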

Can we have a demo on visual question generation as well?

soumyajahagirdar

Hi Aladdin, thanks so much for this awesome series of videos. Could you please explain how to use BERT instead of the RNN in this model? Thanks in advance.

aboalifan

Please make a video on attention in audio processing, e.g. speech emotion recognition.

krishnachauhan

Very good work. Please make some videos on medical imaging. Thanks.

muhammadzubairbaloch