The END of RAG? Episodic memory for infinite context length

HUMAN-LIKE EPISODIC MEMORY FOR INFINITE CONTEXT LLMS

Support my learning journey by clicking the Join button above, becoming a Patreon member, or sending a one-time Venmo!

Discuss this stuff with other Tunadorks on Discord

All my other links

Timestamps:
0:00 Video intro
1:10 Paper intro
3:24 Methodology
12:40 Results
16:15 Discussion / Limitations / Future Work
17:30 Outro
Comments

My course on Human Memory was taught by Michael Kahana, one of the names in the citations that kept popping up. Very interesting to see our in-class temporal contiguity effect demonstration playing out in an AI neuroscience context, wow! Small world in academia :)

OpenSourceAnarchist

In theory it works, but not practically. Systems like these need to be coupled with thinking tokens, so that the most semantically relevant segments are retrieved based on attention (more specifically, on model reasoning, like humans do) rather than on relative segment similarity… BUT there are a lot of ideas I took from this paper, like NLL for novel observations and event boundary detection. FYI, this is what I used to actually make Quiet-STaR useful, explicitly but autonomously allowing the model to generate useful thoughts, not to mention I use it as the basis for a new style of meta self-supervision I created for the offline token re-weighting phase. So, all in all, pretty amazing ideas in this paper; the value of some of the underlying principles is vastly understated. Great vid bro. No paper is safe lol, I see you meant that ha. Keep them coming bro.
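
For the curious, a minimal sketch of the surprise-based segmentation the paper describes, as I understand it: treat each token's negative log-likelihood under the model as "surprise" and open a new episode whenever it spikes above the mean plus a scaled standard deviation over a trailing window. Function and parameter names here are illustrative, and the paper's graph-theoretic boundary refinement step is omitted.

```python
import math

def detect_event_boundaries(nlls, window=64, gamma=1.0):
    """Mark positions where surprise (per-token NLL from a causal LM)
    exceeds mean + gamma * std over the preceding `window` tokens."""
    boundaries = []
    for t in range(window, len(nlls)):
        hist = nlls[t - window:t]
        mean = sum(hist) / window
        std = math.sqrt(sum((x - mean) ** 2 for x in hist) / window)
        if nlls[t] > mean + gamma * std:
            boundaries.append(t)
    return boundaries

# Toy usage: mostly predictable tokens with one surprising jump.
nlls = [0.9, 1.1, 1.0, 0.8, 1.0, 5.3, 1.0, 0.9]
print(detect_event_boundaries(nlls, window=4, gamma=1.5))  # [5]
```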

alexanderbrown-dgsy

I personally think the mechanism behind human episodic memory is far more complicated than this. When humans return to a specific situation, they can instantly recall things that happened decades ago. Does the human brain really store KV caches for decades? I don't believe it.

corgirun

This might solve the 'frame problem' that early, more procedural approaches to AI found difficult. Context is all about working out what is important, and an expanding context window would effectively be a solution to the basic problem of working out what IS relevant information in a given situation.

robertphipps

I tried something that I think is similar to this (without the math part). My idea was to convert conversations into tokens for storage, and when a new prompt was entered it would look up past events and pull things that matched closely, which would, in theory, be memories of related topics based on the token vectors. It didn't work because I don't know enough about the intricacies of tokenization and the math (it basically wasn't as plug-and-play as I was hoping), so I did the next best thing and stored past conversations as text logs, which I would then search with each prompt to find similar topics. In the end I actually used the LLM to do that search first, then pulled the first few good matches and incorporated them into the prompting.

Even with the much less effective method, it does seem to remember things. I think it only worked because I used an uncensored model that had no input limit. I was hoping to find a different approach to try, but as you went through the paper, a lot of it felt familiar in its general approach. I do think the token-vector direction would work a lot better and faster, since it's a much better way to compare concepts than textual search.
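
A minimal sketch of that vector-comparison idea, with a crude bag-of-words vector standing in for a real tokenizer or embedding model; all the names here are illustrative, not from the paper or the comment above.

```python
import math
from collections import Counter

memory_log = []  # past conversation snippets, stored as plain text

def embed(text):
    """Crude stand-in for a real embedding: bag-of-words counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def recall(prompt, k=3):
    """Return the k stored snippets most similar to the prompt."""
    q = embed(prompt)
    return sorted(memory_log, key=lambda m: cosine(q, embed(m)),
                  reverse=True)[:k]

memory_log += ["we discussed tokenization tradeoffs",
               "recipe for sourdough bread",
               "vector similarity beats plain text search"]
print(recall("compare token vector similarity for search", k=1))
```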

rhaedas

This sounds pretty simple to implement (at least as far as this type of paper goes). It would be really useful when writing narrative text simulations. E.g. ... (history of the simulation for all characters up to 10:00) ... "What happens between 10:00 and 10:05 from the perspective of <character 1>?" ... "What happens between 10:00 and 10:05 from the perspective of <character 2>?" ... "Eliminate contradictions" ...
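
A rough sketch of that loop; `llm` is a placeholder for whatever completion call you use, and the prompt wording is purely illustrative.

```python
def llm(prompt: str) -> str:
    """Placeholder for an actual LLM completion call."""
    raise NotImplementedError

def simulate_interval(history, characters, start, end):
    """Draft one time slice from each character's perspective,
    then ask the model to reconcile the drafts."""
    drafts = [llm(f"{history}\nWhat happens between {start} and {end} "
                  f"from the perspective of {ch}?")
              for ch in characters]
    merged = llm("Eliminate contradictions between these accounts:\n"
                 + "\n---\n".join(drafts))
    return history + "\n" + merged

# e.g. history = simulate_interval(history, ["Alice", "Bob"],
#                                  "10:00", "10:05")
```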

betube

This sounds more like efficient RAG-like memories, or a RAG-like successor. I suppose it is a kind of episodic memory, but hmm... it's not using this type of memory to be actively in the now, per se. I suppose I'm actually looking for a kind of working memory.
I feel like continuity and coherence of intent and of tasks/problem-solving should be maintained: not just retrieval of past events from the previous inference, but the current inference should also carry the "why" from previous inferences, or some kind of direct "knowledge update" that informs it.

It's probably going to be either some kind of autoencoder-like, memory-unit-informed inference (like LARIMAR++), but trained for coherence and continuity over time and for knowledge updates (and for storing, retrieving, and properly using them for tasks),
or some kind of stateful, possibly recurrent, complicated system...

True memory, especially episodic memory, is going to be awesome for agents if/when it happens. An inference that doesn't start over every time... one can dream...
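
One crude way to carry that "why" across inferences without any new architecture is an explicit state object that each call both reads and rewrites. This is only a sketch of the idea under that assumption; nothing like it appears in the paper, and it trusts the model to emit valid JSON.

```python
import json

state = {"goal": "", "why": "", "open_tasks": []}

def step(user_msg, llm):
    """Inject the current intent state into the prompt, then parse
    an updated state back out of the model's reply."""
    prompt = (f"Current state: {json.dumps(state)}\n"
              f"User: {user_msg}\n"
              "Reply, then output the updated state as JSON "
              "on the final line.")
    reply = llm(prompt)
    *answer, new_state = reply.splitlines()  # last line = JSON state
    state.update(json.loads(new_state))
    return "\n".join(answer)
```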

Sirmrmeowmeow

Long-term storage of EM-LLM memory segments can probably be managed in a graph structure, similar to vector storage within Neo4j graph databases.
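
A sketch of what that could look like with the official neo4j Python driver: one node per memory segment with an embedding property, chained by NEXT edges to preserve temporal order. The schema and property names are my own invention, not anything EM-LLM prescribes.

```python
from neo4j import GraphDatabase  # pip install neo4j

driver = GraphDatabase.driver("bolt://localhost:7687",
                              auth=("neo4j", "password"))

def store_segment(seg_id, text, embedding, prev_id=None):
    """Store one episodic segment as a node and link it to its
    predecessor so contiguous episodes stay connected."""
    with driver.session() as session:
        session.run("MERGE (s:Segment {id: $id}) "
                    "SET s.text = $text, s.embedding = $emb",
                    id=seg_id, text=text, emb=embedding)
        if prev_id is not None:
            session.run("MATCH (a:Segment {id: $prev}), "
                        "(b:Segment {id: $id}) "
                        "MERGE (a)-[:NEXT]->(b)",
                        prev=prev_id, id=seg_id)
```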

A related development is the release of Falcon Mamba 7B. Apparently, increasing the amount of context included in the prompt does not increase the requirement for RAM.

johnkintree

It's odd that they didn't try accumulating the tokens in an episode, and chose a single one instead.

deltamico

I don't understand any of this, but does it mean LLMs will be able to do more things without needing training or specially created vectors to help them understand what we are trying to do? Because I could wait for that.

badashphilosophy

So it can recommend paragraph, section and chapter breaks? And from that build an index? Finally, an AI boredom graph.

kimcosmos

This really needed lots of... visualization... I tried to visualize it in my mind and failed, so I failed to understand it.

RickySupriyadi

They should change the title from “infinite” context to “unbounded” context, as “infinite context” implies something physically impossible.

jmarkinman

So, technically, the next GPT could have ADHD? And if so, did we just solve the mathematical form of ADHD?

cinnamonroll