How to Make Your Images Talk: The AI that Captions Any Image


Image captioning is the process of taking an image and generating a caption that accurately describes the scene. This is a difficult task for neural networks because it requires both computer vision and natural language understanding.

In this video, I discuss my complete approach to this problem. For visual understanding, we will use Inception V3. For natural language understanding, we will first use an RNN, but it fails to generalize well to unseen data, so we will switch to a Transformer. And as you will see, the Transformer will nail it!
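Caption decoders of this kind typically produce the caption one word at a time, feeding each chosen word back in as input. The sketch below shows that loop with greedy decoding (always taking the highest-scoring next word). Everything in it is an assumption for illustration: `greedy_caption`, `toy_next_word_scores`, the `<start>`/`<end>` tokens, and the score table are hypothetical stand-ins, not code from the video's notebooks.

```python
START, END = "<start>", "<end>"


def toy_next_word_scores(image_features, words_so_far):
    """Hypothetical stand-in for a trained decoder (RNN or Transformer).

    A real decoder would use the image features and every previous word;
    this toy version only looks at the last word.
    """
    table = {
        "<start>": {"a": 0.9, "dog": 0.1},
        "a": {"dog": 0.8, "<end>": 0.2},
        "dog": {"running": 0.6, "<end>": 0.4},
        "running": {"<end>": 1.0},
    }
    return table[words_so_far[-1]]


def greedy_caption(image_features, next_word_scores, max_len=10):
    """Build a caption word by word, always picking the top-scoring word."""
    words = [START]
    for _ in range(max_len):
        scores = next_word_scores(image_features, words)
        best = max(scores, key=scores.get)
        if best == END:
            break
        words.append(best)
    # Drop the start token before returning the caption text.
    return " ".join(words[1:])


# The feature values are placeholders; the toy decoder ignores them.
caption = greedy_caption([0.1, 0.2], toy_next_word_scores)
short_caption = greedy_caption([0.1, 0.2], toy_next_word_scores, max_len=2)
```

Greedy decoding is the simplest choice; alternatives such as beam search keep several candidate captions alive at once, at extra cost.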

Source Code:

🔗 Social Media 🔗

Timestamps:
00:00 Introduction
00:16 Quick overview of Image Captioning
01:08 The Model Architecture (RNN)
01:56 Getting the Image feature vectors using Inception V3
04:39 What is the Attention Mechanism doing?
05:10 Choosing the Dataset
05:56 Data Preprocessing
06:54 Training!!!
07:13 Checking the results
09:24 Overdramatic Transformer Introduction
10:25 Why I used the COCO Dataset
11:12 Side-by-side result of RNN and Transformer
11:59 Deploying model to HuggingFace so anyone can use it!

#artificialintelligence #ai #deeplearning #machinelearning #transformer #transformers

Thank You,
Pritish Mishra


Comments

I can't believe your view count, because your explanation is next level, dude. I thought it must have crossed at least 1 lakh views, but I hope it will soon.

yashwantrana

It's not a tutorial, it's a movie. I really enjoyed it 💙

محمدالفقى-يب

Amazing video! You made it interesting and practical. The memes and effects were lit.

gabip

Awesome Video bro !! You explained Image captioning in a simple and fun way.

hugehammer

I've just started to realize the potential of AI, and I already feel behind with all these new tools. I would love to see a future video about BlueWillow, which is completely free.

GolpokothokRaktim

Bro, I'm unable to open the "Image Captioning with RNN" code. The link is not working. Can you please check?

EM-nrhj

Awesome video 🔥 and nice animation as always (or not it was more dramatic 😂😂😂) Way to go 👍🏻👍🏻👍🏻

dhiraj

It’s a good tutorial. But I have a question regarding the attention mechanism. At 4:50, how does it know to focus on the dog when it gets the word "dog" as input? If it knows by detecting the object, then how does it know to focus on somewhere else when it receives Please make it clear.

Waliul_The_Wall-E
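On the attention question above, a small sketch may help. In a soft attention decoder, no object detector is involved: the decoder's current state (which summarizes the words seen so far) is scored against every image region, a softmax turns the scores into weights, and the weighted sum of regions becomes the context for predicting the next word. The scoring is learned end to end during training. The code below uses plain dot-product scoring to keep it short; whether the video's model scores regions this way is an assumption, and `attend`, `regions`, and the `hidden_*` vectors are hypothetical toy values.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(hidden, regions):
    """Score each image region against the decoder state, then blend regions."""
    # Dot product between the decoder state and each region feature vector.
    scores = [sum(h * r for h, r in zip(hidden, region)) for region in regions]
    weights = softmax(scores)
    dim = len(regions[0])
    # Context vector: weighted average of the region features.
    context = [
        sum(w * region[i] for w, region in zip(weights, regions))
        for i in range(dim)
    ]
    return weights, context


# Three toy image regions, each described by a 2-number feature vector.
regions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

# Two different decoder states lead to focus on different regions.
hidden_a = [2.0, 0.0]
hidden_b = [0.0, 2.0]
weights_a, context_a = attend(hidden_a, regions)
weights_b, context_b = attend(hidden_b, regions)
```

The same image gets different weights as the decoder state changes, which is how the focus can move from one region to another as the caption grows.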

Just amazing! Loved this video. Keep more coming!

RudranilBhattacharjee

Learned so many new things. Thanks for making the video.

sapnilpatel

When I run the code on streamlit it shows two errors:
1. ValueError: axes don't match array.
2. ValueError: The name "conv2d" is used 2 times in the model. All layer names should be unique.
How can I solve the problem?

rasilmaharjan

How do I use the saved model weights (model.h5) in another file to run inference on new images?

venkatavivek

Hey, I looked into your Kaggle notebook for the Transformer model with the COCO dataset. You mentioned that you trained the model on only 14k COCO images. I'm a beginner in ML, so can you tell me what I should change in your code to increase the training set size beyond 14k?

hellotherethere

Hey, great lecture! I just need some help: the link to the Google Colab for image captioning with RNN isn't working. It would be a great help if you could provide a new link. Thank you!!

dishadubey

How can I save the model and run it in Android Studio?

joycemalubay

Hi Pritish, amazing tutorials. Thank you. While running the Transformer Colab notebook, I'm getting an error at:
----> 4 pred_caption = generate_caption(img_path). TypeError: `x` and `y` must have the same dtype, got tf.uint8 != tf.float32.

Can you please help!

aady

Nice video. How long does it take you to train the transformer model?

drafatkarim

The link isn't working for "Image Captioning with RNN". @PritishMishra, can you please share the code?

akashbhavsar

The Image Captioning with RNN source code is not opening, dude. Please upload it 😊.

GANGADHARTHOTAKURA

Amazing video, where did you learn all of this? OMG, this just saved me so much time. Lifesaver.

BoloFofoPT
