MemGPT 🧠 Giving AI Unlimited Prompt Size (Big Step Towards AGI?)

In this video, we look at MemGPT, a new way to give AI unlimited memory/context windows, breaking the limitation of highly restrictive context sizes. We first review the research paper, then I show you how to install MemGPT, and then we have special guests!

Enjoy :)

Join My Newsletter for Regular AI Updates 👇🏼

Need AI Consulting? ✅

Rent a GPU (MassedCompute) 🚀
USE CODE "MatthewBerman" for 50% discount

My Links 🔗

Media/Sponsorship Inquiries 📈

Links:

Chapters:
0:00 - MemGPT Research Paper
25:10 - MemGPT Installation Tutorial
30:21 - Special Guests!
Comments

So who’s building something with AutoGen + MemGPT?

matthew_berman

As an older developer, we used "virtual memory" because in 1989 computers only had 640 KB, and in DOS there was no OS memory management. We would swap CAD/CAM geometry objects in and out of memory as they were needed.
Please keep us informed as this project moves forward, especially when it can use open-source LLMs.

davidbaity

AGI would be impossible without a memory system, so I agree this is another step towards it. It's really cool.

amj

Your channel has distinctly carved its niche in the AI YouTube arena. Among the myriad of AI YouTubers I'm subscribed to, your channel, particularly over the last six months, has excelled in quality, presentation, and professionalism. Your videos have become my go-to source, superseding others that now seem laden with filler content.

Your knack for diving straight into the core topic, elucidating not only the 'what' but the 'why,' is refreshing. The structured walk-throughs, practical guidance, and anticipatory glimpses into the future keep me engaged throughout. Your closing phrase, "And...I'll see you in the next one," has amusingly become a segment I look forward to; it encapsulates the essence of your engaging delivery.

Being a part of your channel feels like being immersed in a thriving community. The clear, concise factual delivery, balanced with simplicity, makes the content accessible for newcomers while remaining enriching. Despite the crowded space of AI discussions on YouTube, your channel effortlessly ranks within my top 10.

Thank you for the enriching content and the community you've fostered.

middleman-theory

Separating the conversation from an internal dialogue the system can have will prove very helpful: you can ask where the system has learned something to prevent hallucinations, have a space to run logical reasoning until confirmation, and not spout, "The ball has to be 10¢ and the bat $1.10… Wait, no."

bertilhatt

🎯 Key Takeaways for quick navigation:

00:00 🧠 AI currently lacks memory beyond training data and is limited by its context window.
00:29 📈 Progress has been made to increase context window size, but it is still limited (e.g., GPT-4 has 32,000 tokens).
00:58 📚 Introducing MemGPT: A solution to expand AI's memory. The video reviews this research and the open-sourced code.
01:11 ✍️ Paper titled "MemGPT: Towards LLMs as Operating Systems" has several authors from UC Berkeley.
01:51 🗣️ Limited context window issues arise especially in long-term chat and large document analysis.
02:20 💽 MemGPT mimics computer OS memory management, with an "appearance" of large memory resources.
03:27 📊 Increasing context window in Transformers is not optimal due to computational and memory costs.
04:08 🔄 MemGPT autonomously manages its memory through function calls, enhancing its abilities.
04:52 🖥️ Diagram explanation: Inputs go through parsers, get processed in virtual contexts (main and external), and get outputted after further processing.
06:14 🖱️ MemGPT allows AI to self-manage context, treating longer context as virtual memory and its own context as physical memory.
06:40 📟 Main context (like RAM) has a size limit while external context (similar to a hard drive) is virtually unlimited.
07:08 📏 Various models have different token limits, impacting how many messages can be processed.
07:48 ⚠️ Actual usable context is often less than advertised due to system messages and other requirements.
09:00 🔄 Recursive summarization is another way to manage limited context, previously discussed in another video.
09:15 🧠 MemGPT stores its "memories" in a vector database, but it eventually compresses them through a process called "reflecting on memories" to manage space.
09:56 🔄 Recursive summarization can address overflowing context but is lossy, leading to gaps in the system's memory, much like video compression degradation.
10:38 📝 MemGPT splits context into: system instructions, conversational context (recent events), and working context (agent's working memory).
12:02 🎂 MemGPT can store key information from conversations in its working context, as shown by a birthday conversation example.
12:43 💽 External context acts as out-of-context storage (like a hard drive), separate from the main context but can be accessed through function calls.
13:25 🔍 There are two types of external contexts: recall storage (history of events) and archival storage (general data store for overflow).
14:09 🧩 MemGPT manages its memory using self-directed memory edits and retrievals, executed via function calls and based on detailed memory hierarchy instructions.
15:32 🔄 MemGPT can correct its memory when false information is detected, updating its stored context.
16:14 🤖 The effectiveness of MemGPT as a conversational agent is evaluated based on its consistency (alignment with prior statements) and engagement (personalizing responses).
17:10 🎵 Through a function call, MemGPT can delve into its past memory to recall previous conversations, like discussing a music artist.
17:52 🕰️ Deep Memory Retrieval (DMR) enables the agent to answer questions that refer back to very specific details from past conversations.
18:05 📊 The accuracy of MemGPT's responses is better than GPT-3.5 or GPT-4 alone.
18:19 🍪 Personalized conversation openers (like referencing a user's cookie preference) increase user engagement.
19:01 ☕ Examples illustrate how MemGPT uses context and recall differently to engage with users.
20:12 📜 Many documents exceed the token limits of current models, creating challenges in document analysis.
21:06 🧠 Large language models exhibit a bias in recalling information towards the beginning or end of their context, mirroring human memory patterns.
22:44 📈 Charts indicate that MemGPT maintains consistent accuracy regardless of the number of documents or nested information, unlike GPT-3.5 and GPT-4.
23:12 ⚖️ A trade-off with MemGPT is that some token budget is used for system instructions.
23:41 🤖 Discussion about LLMs as agents and their emergent behaviors in multi-agent environments.
24:21 💻 Tutorial on how to activate and use MemGPT, starting with code setup.
27:35 📁 MemGPT's document retrieval feature allows users to chat with their documents; using wildcards can fetch multiple text files.
28:15 💵 Embedding files comes with a computational cost; the example shown embeds 3 documents for 12 cents.
28:44 🔄 MemGPT's persona is customizable, allowing users to tailor how the model interacts with information, like referencing archival memory.
29:38 🔍 MemGPT can retrieve specific data from documents, such as annual revenues of companies.
30:06 🌐 Introduction to MemGPT emphasized its rapid evolution and potential for open-source models in the future.
30:33 🎙️ Interview with MemGPT authors Charles and Vivian discussing inspiration and plans for the project.
30:46 🧠 MemGPT addresses the memory limitations of current language models by actively saving crucial data into a permanent memory store.

ytpah
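The memory hierarchy described in the takeaways above (main context as RAM; recall and archival storage as disk; self-directed function calls; lossy recursive summarization on overflow) can be sketched in a few lines of Python. This is an illustrative toy, not MemGPT's actual API: the class and method names here are invented stand-ins, and the "summarization" is a crude truncation standing in for an LLM call.

```python
# Toy sketch of a MemGPT-style memory hierarchy (names are illustrative,
# not the real MemGPT function set). Main context is bounded like RAM;
# evicted messages are lossily summarized and also kept verbatim in
# recall storage, which is unbounded like a disk and reachable only via
# explicit function calls.

class MemoryHierarchy:
    def __init__(self, main_budget=8):
        self.main_budget = main_budget   # max messages kept in-context
        self.system = "You are an assistant with a managed memory."
        self.working = {}                # working context: key facts (e.g. birthdays)
        self.main = []                   # conversational context (recent events)
        self.recall = []                 # full event history (out of context)
        self.archival = []               # general overflow data store
        self.summary = ""                # running recursive summary (lossy)

    def add_message(self, msg):
        self.main.append(msg)
        self.recall.append(msg)          # every event is also logged verbatim
        while len(self.main) > self.main_budget:
            evicted = self.main.pop(0)
            # stand-in for an LLM summarization call: lossy compression
            self.summary = (self.summary + " | " + evicted)[-200:]

    # self-directed memory edits the model would issue as function calls
    def working_context_set(self, key, value):
        self.working[key] = value

    def archival_insert(self, text):
        self.archival.append(text)

    def archival_search(self, query):
        return [t for t in self.archival if query.lower() in t.lower()]

    def prompt(self):
        # what actually fits inside the model's context window
        return [self.system, self.summary, str(self.working)] + self.main
```

For example, with `main_budget=3`, adding five messages leaves only the last three in the main context, while all five remain retrievable from recall storage and the evicted ones survive (lossily) in the summary.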

Yes please do another tutorial with MemGPT! This is huge!

redbaron

Man! I, for one, am fully ready to welcome our AGI overlords!

stickmanland

Love this video and thanks so much for the in-depth post! Accurately explains the theoretical science along with the practical implementation 🙏🏾

chrismadison

I think the information is valuable and is explained up to the point where you can't understand more without a deep dive into AI. Good job!

RonnyMW

Matthew, thank you so much for keeping us up to date. You rock. Can't wait to play with this.

djzuela

Some of the new Mistral-based local LLMs have 32k context and hence beat GPT-4 at certain tasks. It's amazing!

remsee

Thanks a million Matthew! Your videos are so clear and easy to follow 🙂 Looking forward to your follow-up videos on MemGPT and open-source models.

micklavin

Please let us know and do this again when they have open source models!

tomt

My favourite open source model is currently _Falcon 180B_ with the web search feature. I was impressed by M$'s _Bing Chat_ in Edge, but I mainly use Falcon instead now, since it seems just as good for grabbing information from the web, at least from my perspective. Although I don't fancy paying to run Falcon on a server, just to test it with MemGPT, despite my eagerness to try it out. It could be interesting if there was a _Falcon 180B_ API, similar to OpenAI's API, only much cheaper.

JTutorials

This was such a good episode. The fact that LLMs, like humans, best remember the first and last things in their context... wow. I want this. Great episode!

friendofai

Once we have a robust way of handling memory, like MemGPT, we can simply fine-tune the LLMs to utilize the system. Then we no longer need to use context-window space for the system prompt that operates the memory; the LLM will just "naturally" do it.

sveindanielsolvenus

I recently found this channel and really enjoy the videos. Great job Matt.

tdb

That video was awesome, very informative. I love how you ACTUALLY read the paper during the video.

UnicoAerta

Wow, that's amazing! Thanks for sharing (you and the researchers as well)!! 🙏🙏🙏

peterwan小P