NExT-GPT: Any-to-Any Multimodal LLM

In this video we explain NExT-GPT, a multimodal large language model (MM-LLM) introduced in the research paper titled "NExT-GPT: Any-to-Any Multimodal LLM".

We carefully review the NExT-GPT framework and its different components to understand how it uses an LLM as its core agent to both process inputs and generate outputs across multiple modalities (a minimal code sketch of this layout follows below).
We then review a multimodal conversation example to get a better intuition for what can be done with such a framework.
Next, we dive into how NExT-GPT was trained, walking through a few diagrams from the paper.
Finally, we review interesting results from the paper.
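For viewers who prefer code, here is a minimal sketch of the three-tier layout the paper describes: frozen modality encoders feed small trainable input projections into the LLM, and trainable output projections map the LLM's modality-signal outputs to frozen diffusion decoders. The class names, dimensions, and stub modules below are illustrative assumptions, not the authors' implementation.

# Minimal sketch of NExT-GPT's "any-to-any" layout, based on the paper's
# high-level description. Names, sizes, and the toy LLM/encoder stand-ins
# are assumptions for illustration only.
import torch
import torch.nn as nn

LLM_DIM = 512  # hypothetical hidden size of the core LLM

class InputProjector(nn.Module):
    """Maps a frozen modality encoder's embedding into the LLM token space.
    On the encoding side, only these small projection layers are trained."""
    def __init__(self, enc_dim: int):
        super().__init__()
        self.proj = nn.Linear(enc_dim, LLM_DIM)

    def forward(self, enc_emb: torch.Tensor) -> torch.Tensor:
        return self.proj(enc_emb)

class OutputProjector(nn.Module):
    """Maps the LLM's hidden states for modality-signal tokens into the
    conditioning space of a pretrained diffusion decoder (image, video,
    or audio generator)."""
    def __init__(self, cond_dim: int):
        super().__init__()
        self.proj = nn.Linear(LLM_DIM, cond_dim)

    def forward(self, llm_hidden: torch.Tensor) -> torch.Tensor:
        return self.proj(llm_hidden)

# Frozen components are stand-ins here: a random tensor plays the role of
# an ImageBind-style encoder output, and an identity module plays the LLM.
image_embedding = torch.randn(1, 1024)          # pretend encoder output
to_llm = InputProjector(enc_dim=1024)
llm_core = nn.Identity()                        # stand-in for the frozen LLM
to_image_decoder = OutputProjector(cond_dim=768)

tokens = to_llm(image_embedding)                # image -> LLM token space
hidden = llm_core(tokens)                       # LLM reasons over the tokens
cond = to_image_decoder(hidden)                 # hidden state -> decoder condition
print(cond.shape)                               # torch.Size([1, 768])

The design choice this sketch highlights is that only the lightweight projection layers are trained, while the encoders, the LLM, and the diffusion decoders all stay frozen, which is what keeps NExT-GPT's training cost low.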

👍 Please like & subscribe if you enjoy this content
-----------------------------------------------------------------------------------------------------

Chapters:
0:00 Introduction & Motivation
1:03 NExT-GPT Framework
4:36 Conversation Example
5:32 Training NExT-GPT
8:40 Results
Comments:

KDlSW: just use voice generating AI for the audio plz

じある: please fix your audio... it's really bad

TheRealHassan: The drawer's hand in your video is VERY distracting.