Image Caption Generator using Flickr Dataset | Deep Learning | Python

⭐️ Content Description ⭐️
In this video, I have explained how to develop an image caption generator using the Flickr dataset in Python. The project uses the Keras and TensorFlow frameworks for the implementation, and combines image features with text features to build the model. This gives a better understanding of how we can leverage model architectures from different domains for a specific application.
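The model built in the video follows the common "merge" CNN-LSTM architecture: a dense projection of the VGG16 image features is added to an LSTM encoding of the partial caption, and a softmax layer predicts the next word. A rough sketch, assuming the usual tutorial layer sizes (256 units, 4096-dim VGG16 fc2 features); the exact values used in the video may differ:

```python
from tensorflow.keras.layers import (Input, Dense, Dropout, Embedding,
                                     LSTM, add)
from tensorflow.keras.models import Model

def build_model(vocab_size, max_length):
    # image feature branch (VGG16 fc2 output, 4096-dim)
    inputs1 = Input(shape=(4096,))
    fe1 = Dropout(0.4)(inputs1)
    fe2 = Dense(256, activation='relu')(fe1)
    # text sequence branch (partial caption as word indices)
    inputs2 = Input(shape=(max_length,))
    se1 = Embedding(vocab_size, 256, mask_zero=True)(inputs2)
    se2 = Dropout(0.4)(se1)
    se3 = LSTM(256)(se2)
    # merge both branches and predict the next word over the vocabulary
    decoder1 = add([fe2, se3])
    decoder2 = Dense(256, activation='relu')(decoder1)
    outputs = Dense(vocab_size, activation='softmax')(decoder2)
    model = Model(inputs=[inputs1, inputs2], outputs=outputs)
    model.compile(loss='categorical_crossentropy', optimizer='adam')
    return model
```

Here `vocab_size` and `max_length` come from tokenizing the caption data; the model is trained on (image feature, partial sequence) → next-word pairs.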

Make a small donation to support the channel 🙏🙏🙏:-

🕒 Timeline
00:00 Introduction to Image Caption Generator
01:00 Import Modules
07:26 Extract Image Features using VGG16
18:38 Load Captions Data
26:43 Preprocess the Caption Data
38:15 Train Test Split
40:28 Create Data Generator Function
53:34 Model Creation - CNN LSTM
01:06:10 Generate Captions for Images
01:19:46 Visualize the Results of Image Caption
01:27:36 Improving the Results
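As a reference for the preprocessing step (26:43): cleaning a caption boils down to lowercasing, keeping alphabetic words only, and wrapping the result in start/end tokens. A minimal sketch; the `startseq`/`endseq` markers are the usual convention in this family of tutorials:

```python
import re

def clean_caption(caption):
    """Normalize one raw caption: lowercase, keep alphabetic words only,
    drop stray single characters, and add start/end-of-sequence markers."""
    caption = caption.lower()
    caption = re.sub(r'[^a-z ]', '', caption)        # strip digits/punctuation
    words = [w for w in caption.split() if len(w) > 1]
    return 'startseq ' + ' '.join(words) + ' endseq'

print(clean_caption("A child in a pink dress, 2 dogs!"))
# prints: startseq child in pink dress dogs endseq
```

The cleaned captions are then fed to a tokenizer to build the vocabulary, and `max_length` is taken as the longest cleaned caption.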

#imagecaptioning #deeplearning #hackersrealm #imagecaptiongenerator #flickr #imagecaption #machinelearning #datascience #model #project #artificialintelligence #beginner #analysis #python #tutorial #aswin #ai #dataanalytics #data #bigdata #programming #datascientist #technology #coding #datavisualization #computerscience #pythonprogramming #analytics #tech #dataanalysis #iot #programmer #statistics #developer #ml #business #innovation #coder #dataanalyst
Comments

Hey Hackers,
I have updated the code to test the model with a new image/image URL, along with the flickr32k dataset, for better results. You can find the latest code in the description.
For users getting the following error:
`output_signature` must contain objects that are subclass of `tf.TypeSpec`

Please update the code snippets in data_generator and model creation as I updated them in my website/description link. It works with the latest version of TensorFlow without issues.

Happy Learning!!!

HackersRealm
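For reference, the `output_signature` fix mentioned in the pinned comment amounts to describing each element the generator yields with `tf.TensorSpec` objects when wrapping it in `tf.data.Dataset.from_generator`. A hedged sketch with illustrative shapes (4096-dim VGG16 features, `max_length`-long input sequences, one-hot next-word targets); the real generator and dimensions come from the project code:

```python
import numpy as np
import tensorflow as tf

max_length, vocab_size = 35, 8500   # illustrative values

def data_generator():
    # stand-in for the real generator, which yields ((image_feat, seq), target)
    while True:
        X1 = np.zeros((32, 4096), dtype=np.float32)
        X2 = np.zeros((32, max_length), dtype=np.float32)
        y = np.zeros((32, vocab_size), dtype=np.float32)
        yield (X1, X2), y

# describe each yielded element with tf.TensorSpec objects
dataset = tf.data.Dataset.from_generator(
    data_generator,
    output_signature=(
        (tf.TensorSpec(shape=(None, 4096), dtype=tf.float32),
         tf.TensorSpec(shape=(None, max_length), dtype=tf.float32)),
        tf.TensorSpec(shape=(None, vocab_size), dtype=tf.float32),
    ),
)
```

`model.fit(dataset, epochs=..., steps_per_epoch=...)` can then consume the dataset directly in place of the raw generator.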

I appreciated your project details. It took me almost 3 weeks to reproduce similar results.

JannatulFerdous-ewko

By the way, I forgot to thank you for all this excellent explanation. You, sir, are a truly great person. I am very grateful to you.

MsBothyna

Thank you so much for such detailed step by step explanation..😊🙏

its-itish

Thank you sir, the best-explained image captioning video so far.

mohamedsahli

Beautiful explanation! Thanks for this!

plabmadeeasy

Thanks for the wonderful video, code and explanation

sreelakshminarayanan.m

Why didn't you use the Keras ImageDataGenerator to extract the features from the model? It creates a whole image preprocessing pipeline so you don't have to do it manually. Btw, great tutorial!

rishabhvyas

Without connecting Colab to Kaggle, how can we do this when the base directory is our PC's C drive?

isura_AK

Kaggle's "Accelerator" tab now even provides a TPU. Out of the 4 options shown in the dropdown, which should we choose?

prodevmahi

It's a wonderful project and I could easily get the output by following your instructions. But after completing everything, when I try predicting the output for a new image, the output is not relevant. How can I correct this? It would be very helpful if you could help us with this. Thank you

harshith

This is the best video, and so perfectly explained. Sir, can you please make a video on video captioning using the MSVD dataset? Thank you 👍🏼

tanviladdha

I'm getting the error `'int' object is not iterable` in `model.fit(generator, epochs=1, steps_per_epoch=steps, verbose=1)`.

nivedansharma

Hello sir. I am doing this project but using EfficientNetV2B0 and GRU, and my BLEU-1 score is not getting above 0.22. What needs to be changed? Is it possible to get a BLEU-1 score above 0.5? Also, how can we load this model so that retraining is not required, and how do we implement it in a GUI?

mayur
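On the retraining part of the question above: a Keras model can be saved once after training and reloaded later for inference only (e.g. behind a GUI), so `model.fit` never has to run again. A minimal sketch with a tiny stand-in model; the file name is illustrative:

```python
import numpy as np
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model, load_model

# tiny stand-in model; in the project this would be the trained caption model
inp = Input(shape=(4,))
out = Dense(2, activation='softmax')(inp)
model = Model(inp, out)
model.compile(loss='categorical_crossentropy', optimizer='adam')

model.save('caption_model.keras')            # persist architecture + weights
restored = load_model('caption_model.keras') # later: load, no retraining

# the restored model makes identical predictions
x = np.ones((1, 4), dtype=np.float32)
assert np.allclose(model.predict(x, verbose=0),
                   restored.predict(x, verbose=0))
```

The tokenizer and `max_length` must be persisted alongside the model (e.g. with `pickle`) so that inference can rebuild the exact same input sequences.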

What is the use of the batch size and the dense layer? ✨✨

saithota

Thanks for the wonderful implementation 😊. I ran it successfully, but can you tell me how the context.txt file is created? I saw that our input image should be in a particular format, and we get correct results only for the 8000 images. Is it possible for other images? Also, I think it's not extracting text from the image; it's extracting it from the context.txt file. If I am wrong, please correct me. Thank you 😊

pradnyeshdoshi

How can I make it so that I provide any random Google image and it produces a caption for it?

Carbon
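On captioning arbitrary images: after extracting VGG16 features for the new image, caption generation is a greedy decoding loop around the model, one predicted word at a time. A self-contained sketch with a stubbed `predict_fn` standing in for the trained model (the real one would run `model.predict` on the image features plus the padded partial sequence):

```python
import numpy as np

def greedy_decode(predict_fn, idx_to_word, start_idx, max_length):
    """Repeatedly feed the partial sequence to the model and append the
    highest-probability next word until 'endseq' or max_length is hit."""
    seq, words = [start_idx], []
    for _ in range(max_length):
        probs = predict_fn(seq)            # distribution over the vocabulary
        idx = int(np.argmax(probs))
        word = idx_to_word.get(idx)
        if word is None or word == 'endseq':
            break
        words.append(word)
        seq.append(idx)
    return ' '.join(words)

# stub vocabulary and a deterministic fake model for demonstration
vocab = ['<pad>', 'startseq', 'endseq', 'a', 'dog', 'runs']
idx_to_word = dict(enumerate(vocab))
script = [3, 4, 5, 2]                      # a -> dog -> runs -> endseq

def fake_predict(seq):
    probs = np.zeros(len(vocab))
    probs[script[len(seq) - 1]] = 1.0
    return probs

print(greedy_decode(fake_predict, idx_to_word, start_idx=1, max_length=10))
# prints: a dog runs
```

Note that captions for out-of-distribution images will only be as good as the training data; a model trained on Flickr8k has a small vocabulary and will not describe unfamiliar scenes well.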

Loading the VGG16 model file gives an error 😢😢. How can I resolve it?

solutiontolifetarotreading

Thanks for this video, it's explained well! Can the model predict on monuments and historical structures? I mean, can the model predict on totally unseen data? And can you please make a video on how to put entity awareness on top of it?

sudeshnakundu

In the extract-features-from-image step, I followed your steps and got 'Error displaying widget: model not found'. How do I solve it? I've been looking for a solution for a long time, but there isn't one.

CYS-pl