LLaVA: Now You can Chat with Your Images | GPT-4 is Supposed to have This

Explore LLaVA, the cutting-edge AI research project that combines a vision encoder and Vicuna transformer-based language model for powerful multimodal capabilities. Learn how LLaVA excels in image captioning, visual question answering, and image retrieval, while offering incredible conversational AI abilities.

Uncover LLaVA's potential applications across healthcare, education, and entertainment as it masters complex visual information and natural language understanding. Don't miss this glimpse into the future of AI! 💡🌐🏥🎓🎬

Comments
Author

Incredibly impressive! Wild how much progress these are making each day. I hope AI will be able to tell us if an image is doctored using this technology so it essentially solves its own problem.

wifiguy
Author

Can you please do a demo of a local installation? I think it would be nice to have this on our local computer.

phoenixphuong
Author

The HTML it wrote was probably consumed by your browser. You probably need to inspect the page's HTML for the outputted section.

AnmAtAnm
Author

Do you use any local LLM models? If so, which ones? I want to use OpenAssistant as it is open-source and constantly being upgraded, but the quality as of now is just too low to be useful. I want to move away from OpenAI since the free version always hits limits and I need to refresh the page to use it. So ideally, I'd like to use a local model that's closest to GPT-4, but every few days there is a newer or different model, so I am hesitant to spend the time installing a local model just to find out that a better one is coming out. What are your opinions?

SkyEther
Author

Amazing, thanks for sharing this info!

kbqvist
Author

Can it do OCR and speak with the response of the OCR?

MadhavanSureshRobos
Author

Please show how to link YOLO to this model. This is possible, right? Computer vision + LLM? Is there any model out there for that yet?

marilynlucas
Author

Not working. The error says:
"NETWORK ERROR DUE TO HIGH TRAFFIC. PLEASE REGENERATE OR REFRESH THIS PAGE."

sad

AaliDGr
Author

Just write "continue" and the HTML code will appear.

chehirdhaouadi
Author

can it comment on the video or images with humor and sarcasm...?

max
Author

Everyday question: where is your course, bro? 😃 I can manage all the tech related to an online course for free, as a gift.

MahmoudAmmar