👑 LLaVA - The NEW Open Access MultiModal KING!!!

🔗 LLaVA Links

❤️ If you want to support the channel ❤️
Support here:

🧭 Follow me on 🧭
Comments

Re: the time mark from 2:13 to 2:24
Officer: "Do you know why I pulled you over today, sir?"
Me: "Yeah, I gap out like that all the time! Just give it a second, you'll remember!"

leafdriving

We are actually beyond the "Not Hotdog" phase. I love it.

mikeyjohnson

The distraction did not interrupt your flow. Impressive!🤩

KevinKreger

I really love your videos, man 💚💚. They keep me updated with AI tech (especially the open-source stuff).

quotesspace

I want you to do a CAPTCHA test live, so everyone can make sure you are not an advanced multimodal model with speech. We got some server latency at 2:13.

im-notai

The video was informative, sir. Could you do a video on the evolution from Transformers to LLMs to LMMs?

It would be really helpful for us to learn.

mohankrishnan

I value these open-source models more than closed-source alternatives, even if the closed-source versions get better results.

TheXenonite

I just tried it and gave it a picture of vomit (yes, I have that lying around on my computer), just to see if it would recognize it as such.

This is what it said the picture was:

The image features a piece of food, possibly a meatloaf, sitting on a white surface. The food appears to be a mixture of meat and vegetables, with several carrots scattered around it. The carrots are in various sizes and positions, some closer to the main piece of food and others further away. The overall scene gives the impression of a close-up view of a delicious and healthy meal.

LOL!!!

stresstherapist

You are always on top of things, while I tread water. Barely.😂

JscottMays

Thank you for the video and the detailed explanation; I learned a lot from it and am a subscriber. I'm looking forward to the release on how to use the model in Colab, since for me that is apparently the only way to work with it.

I'd also like to clarify something: in business applications I will of course have not one picture but a series of them, since we are talking about automating work with data. How can I set things up so that the model analyzes not one but 100 pictures, at once or one by one? Do I understand correctly that you need to write the appropriate Colab code, i.e. be a programmer, to do this?

I also tested the model in the web app, as you showed in the video. It really does understand the details of what is depicted quite well, but it is less skilled than GPT-4 in the specialized scientific topics I need. I also asked it to construct an answer as a phrase in a given format, inserting the phrases I require in the right places and respecting a limit on the length of the caption. As a result, the insertions end up in arbitrary places instead of the required ones, and the length of the answer does not match the required number of characters. Again, GPT-4 (I use the free version, Bing AI) does this without errors in 90% of situations. What do I need to do to make LLaVA follow instructions better? Do I need to train it somehow?

One more question, about speed: compared with Bing, generating an answer took several times longer. What ways are there to speed things up, given that I don't have the hardware to run this model on my own machine?

I would be grateful for brief answers to my questions, and I wish you success in growing the channel.

atdigit
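
On the 100-pictures question above: yes, in Colab this comes down to a short Python loop. Below is a minimal sketch of one way to do it, assuming the community llava-hf checkpoints on the Hugging Face Hub and a recent transformers release; the model id, image folder, and prompt are illustrative, not the only options.

    import torch
    from pathlib import Path
    from PIL import Image
    from transformers import AutoProcessor, LlavaForConditionalGeneration

    model_id = "llava-hf/llava-1.5-7b-hf"  # assumed community checkpoint
    processor = AutoProcessor.from_pretrained(model_id)
    model = LlavaForConditionalGeneration.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto"
    )

    # LLaVA 1.5 uses this chat format; <image> marks where the picture goes.
    prompt = "USER: <image>\nDescribe this image in one sentence.\nASSISTANT:"

    # Loop over a folder of pictures one by one and collect the answers.
    results = {}
    for path in sorted(Path("images").glob("*.jpg")):  # hypothetical folder
        image = Image.open(path).convert("RGB")
        inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)
        output = model.generate(**inputs, max_new_tokens=128)
        results[path.name] = processor.decode(output[0], skip_special_tokens=True)

    for name, answer in results.items():
        print(name, "->", answer)

Note that the decoded string echoes the prompt before the answer, so in practice you would strip everything up to "ASSISTANT:".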

Tested on 200+ images: not perfect, but very, very good. A nice step in the right direction.

blender_wiki

I'm wondering how local LLaVA and Llama 2 have access to the public internet, when ChatGPT, if you ask it, will tell you it does not have access to the public internet?

besooab

How do you learn about these models right when they come out?

robin

But what is the purpose of the model? To tell us what's in an image we can already see?

SAVONASOTTERRANEASEGRETA

Wtf, visual instruction tuning (actually VQA): my 7th-semester mini-project is somewhat replicating their work!

dumbol

How can I upload multiple images at once to the LLaVA model?

nixes
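
On the multiple-images question: the llava-hf processor will accept a list of images if the prompt contains one <image> placeholder per picture, as in the hedged sketch below (file names and prompt are illustrative). Since LLaVA 1.5 was trained on single-image conversations, though, multi-image answers can be unreliable; looping one image at a time, as in the earlier sketch, is the safer default.

    import torch
    from PIL import Image
    from transformers import AutoProcessor, LlavaForConditionalGeneration

    model_id = "llava-hf/llava-1.5-7b-hf"  # assumed community checkpoint
    processor = AutoProcessor.from_pretrained(model_id)
    model = LlavaForConditionalGeneration.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto"
    )

    # One <image> placeholder per picture, in order.
    images = [Image.open("a.jpg").convert("RGB"), Image.open("b.jpg").convert("RGB")]
    prompt = "USER: <image>\n<image>\nWhat is different between these two pictures?\nASSISTANT:"

    inputs = processor(text=prompt, images=images, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=128)
    print(processor.decode(output[0], skip_special_tokens=True))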

Hi, please share a video on how to load the multimodal model (LLaVA) locally.

narenkumar
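
On loading LLaVA locally: the 7B model in float16 wants roughly 14-15 GB of VRAM, so on smaller GPUs (including the free Colab T4) a common approach is 4-bit quantization via bitsandbytes. A minimal sketch, assuming the llava-hf checkpoint and that bitsandbytes is installed:

    import torch
    from transformers import AutoProcessor, BitsAndBytesConfig, LlavaForConditionalGeneration

    model_id = "llava-hf/llava-1.5-7b-hf"  # assumed community checkpoint

    # Quantize the weights to 4 bits at load time, shrinking them to
    # roughly a quarter of their float16 footprint.
    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.float16,
    )

    processor = AutoProcessor.from_pretrained(model_id)
    model = LlavaForConditionalGeneration.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",  # place layers on whatever GPU is available
    )

Once loaded, the same processor and generate calls from the earlier sketches apply unchanged.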