Image Annotation with LLava & Ollama

preview_player
Показать описание

🕵️ Interested in building LLM Agents? Fill out the form below

👨‍💻Github:

⏱️Time Stamps:
00:00 Intro
00:10 Image Captioning
00:00 Basic Idea of the Image Captioning ap
01:32 Image Captioning Diagram
01:41 Step 1: Get the file list from a folder
01:54 Step 2: Loading the files
02:24 Step 3: Send the file to LLaVA 1.6 via Ollama
03:56 Step 4: Saving the results back tothe DataFrame
04:24 Step 5: Save the DataFrame to CSV
04:59 Code Time
Рекомендации по теме
Комментарии
Автор

Yes please on tutorial building even more functionality of this example! 😀

IdPreferNot
Автор

These are the 4 questions i ask llava and then I put the results manually in the comment section of the exif metadata:

describe this image in great detail

write the 10 most relevant questions for this image

answer the 10 above questions in the correct order

write the 20 most relevant tags for instagram

I will try to automate this workflow to keyword my photo collection, thanks for this tutorial!

MassimilianoGrecoPh
Автор

You can save this text meta back to your image files to EXIF, so it will be always going hand-to-hand without the need of extra files lying around

alx
Автор

This is great ! With the idea to put the result to the exif metadata, this would be awesome 😎

pnddesign
Автор

This is right into the awesomeness space! Thanks for sharing this project! (yesterday I was working on a similar solution using ComfyUi + Python exporting but this is way cleaner)

LaHoraMaker
Автор

Did you add custom rag? Capture snapshots from webcam. I’m trying to learn for days getting stuck on rag. 🎉

antonpictures
Автор

Thanks for sharing this is very useful and its a good source that i keep coming back to

mrpasak
Автор

This is such a cool example. I was looking for this for a long time. Cheers for that !

chorton
Автор

Very interesting and insightful. Thank you very much, Sam.

guanjwcn
Автор

Great video, really what I was looking for, some useful real world cases on how to use LLM models locally, (instead of paying a company to do this for us, of course more secure and private). What I would love to see, is how to integrate this example to create a Tweet for us about the image, store it in the CSV file, and then be able to post the image with that tweet at intervals directly, maybe using Twitter's API? Not very tech savvy myself, but very interested in putting LLMs to some real world use and automation. Thanks for making these videos.

carlosterrazas
Автор

Awesome video. appreciate you demystifying the process and tying in queuing, dataframe, and rag concepts some powerful stuff. Will be interesting to do an apple to apple comparison with GPT Vision and Gemini Vision functionality.

donb
Автор

Excellent !!! Was just playing around with moondream. Perfect timing ;)

miriamramstudio
Автор

Good job, and thank you for again sharing your knowledge showing us how to do useful stuff. I'd also be interested in seeing how you create a professional web user interface for this and other projects going forward. What are some good ways of doing this which are easy to make look good and modern, and which run on all major browsers?

brianhauk
Автор

Love it!!! Great content and super Ollama in action!

IvanFioravanti
Автор

Maybe i would be useful to save the generated description in the image file itself, for example in exif/iptc description field if image file supports it.

CezarPopescu
Автор

that is what i exactly looking forward thanks a lot

lazut
Автор

Please do some more examples of identifying difficult screen shots.

Have you also thought about how boxing could improve this process?

christopherd.winnan
Автор

why pass the file to ollama as bytes and not an image file ? is it faster that way? Also do you know any hacks for ollama to return precisely a specific number of words (or a range) every time ?

squiddymute
Автор

llava:34b-v1.6 running very slowly and not using GPU whereas llava:13b-v1.6 working fine.
my system specs
Ram: 32 GB
Gup: nvidia3060 12GB

edits_for_fun
Автор

can you do a tutorial bout AI agent, for image and video

fintech
join shbcf.ru