The END of RAG? Episodic memory for infinite context length

HUMAN-LIKE EPISODIC MEMORY FOR INFINITE CONTEXT LLMS

Support my learning journey by clicking the Join button above, becoming a Patreon member, or sending a one-time Venmo!

Discuss this stuff with other Tunadorks on Discord

All my other links

Timestamps:
0:00 Video intro
1:10 Paper intro
3:24 Methodology
12:40 Results
16:15 Discussion / Limitations / Future Work
17:30 Outro
Comments

My course on Human Memory was taught by Michael Kahana, one of the names in the citations that kept popping up. Very interesting to see our in-class temporal contiguity effect demonstration playing out in an AI neuroscience context, wow! Small world in academia :)

OpenSourceAnarchist

In theory it works, but not practically. Systems like these need to be coupled with thinking tokens, so that the most semantically relevant segments are retrieved based on attention (more specifically, on model reasoning, like humans do) rather than on relative segment similarity… BUT there are a lot of ideas I took from this paper, like NLL for novel observations and event boundary detection. FYI, this is what I used to actually make Quiet-STaR useful, explicitly but autonomously allowing the model to generate useful thoughts, not to mention I use it as the basis for a new style of meta self-supervision I created for the offline token re-weighting phase. So, all in all, pretty amazing ideas in this paper; the value of some of the underlying principles is vastly understated. Great vid bro. No paper is safe lol, I see you meant that ha. Keep them coming bro.
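
For the curious, a minimal sketch of the surprise-based segmentation the paper describes, as I understand it: treat each token's negative log-likelihood under the model as "surprise" and open a new episode whenever it spikes above the mean plus a scaled standard deviation over a trailing window. Function and parameter names here are illustrative, and the paper's graph-theoretic boundary refinement step is omitted.

```python
import math

def detect_event_boundaries(nlls, window=64, gamma=1.0):
    """Mark positions where surprise (per-token NLL from a causal LM)
    exceeds mean + gamma * std over the preceding `window` tokens."""
    boundaries = []
    for t in range(window, len(nlls)):
        hist = nlls[t - window:t]
        mean = sum(hist) / window
        std = math.sqrt(sum((x - mean) ** 2 for x in hist) / window)
        if nlls[t] > mean + gamma * std:
            boundaries.append(t)
    return boundaries

# Toy usage: mostly predictable tokens with one surprising jump.
nlls = [0.9, 1.1, 1.0, 0.8, 1.0, 5.3, 1.0, 0.9]
print(detect_event_boundaries(nlls, window=4, gamma=1.5))  # [5]
```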

alexanderbrown-dgsy

I personally think the mechanism behind human episodic memory is far more complicated than this. When humans return to a specific situation, they can instantly recall things that happened decades ago. Does the human brain really store KV caches for decades? I don't believe it.

corgirun

This might solve the 'frame problem' that early, more procedural approaches to AI found difficult. Context is all about working out what is important, and an expanding context window would effectively be a solution to the basic problem of working out what IS relevant information in a given situation.

robertphipps

I tried something that I think is similar to this (without the math part). My idea was to convert conversations into tokens for storage, and when a new prompt was entered it would look up past events and pull things that matched closely, which would, in theory, be memories of related topics based on the token vectors. It didn't work because I don't know enough about the intricacies of tokenization and the math (it basically wasn't as plug-and-play as I was hoping), so I did the next best thing and stored past conversations as text logs, which I would then search with each prompt to find similar topics. In the end I actually used the LLM to do that search first, then pulled the first few good matches and incorporated them into the prompting.

Even with the much less effective method, it does seem to remember things. I think it only worked because I used an uncensored model that had no input limit. I was hoping to find a different approach to try, but as you went through the paper, a lot of it felt familiar in its general approach. I do think the token-vector direction would work a lot better and faster, since it's a much better way to compare concepts than textual search.
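
A minimal sketch of that vector-comparison idea, with a crude bag-of-words vector standing in for a real tokenizer or embedding model; all the names here are illustrative, not from the paper or the comment above.

```python
import math
from collections import Counter

memory_log = []  # past conversation snippets, stored as plain text

def embed(text):
    """Crude stand-in for a real embedding: bag-of-words counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def recall(prompt, k=3):
    """Return the k stored snippets most similar to the prompt."""
    q = embed(prompt)
    return sorted(memory_log, key=lambda m: cosine(q, embed(m)),
                  reverse=True)[:k]

memory_log += ["we discussed tokenization tradeoffs",
               "recipe for sourdough bread",
               "vector similarity beats plain text search"]
print(recall("compare token vector similarity for search", k=1))
```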

rhaedas

This sounds pretty simple to implement (at least as far as this type of paper goes). It would be really useful when writing narrative text simulations. E.g. ... (history of the simulation for all characters up to 10:00) ... "What happens between 10:00 and 10:05 from the perspective of <character 1>?" ... "What happens between 10:00 and 10:05 from the perspective of <character 2>?" ... "Eliminate contradictions" ...
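
A rough sketch of that loop; `llm` is a placeholder for whatever completion call you use, and the prompt wording is purely illustrative.

```python
def llm(prompt: str) -> str:
    """Placeholder for an actual LLM completion call."""
    raise NotImplementedError

def simulate_interval(history, characters, start, end):
    """Draft one time slice from each character's perspective,
    then ask the model to reconcile the drafts."""
    drafts = [llm(f"{history}\nWhat happens between {start} and {end} "
                  f"from the perspective of {ch}?")
              for ch in characters]
    merged = llm("Eliminate contradictions between these accounts:\n"
                 + "\n---\n".join(drafts))
    return history + "\n" + merged

# e.g. history = simulate_interval(history, ["Alice", "Bob"],
#                                  "10:00", "10:05")
```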

betube

This sounds more like efficient RAG-like memories, or a RAG-like successor. I suppose it is a kind of episodic memory, but hmm... it's not using this type of memory to be actively in the now, per se. I suppose I'm actually looking for a kind of working memory.
I feel like continuity and coherence of intent and of tasks/problem-solving should be maintained: not just retrieval of past events from the previous inference, but the current inference should also carry the "why" from previous inferences, or some kind of direct "knowledge update" that informs it.

It's probably going to be either some kind of autoencoder-like, memory-unit-informed inference (like LARIMAR++), but trained for coherence and continuity over time and for knowledge updates (and for storing, retrieving, and properly using them for tasks),
or some kind of stateful, possibly recurrent, complicated system...

True memory, especially episodic memory, is going to be awesome for agents if/when it happens. An inference that doesn't start over every time... one can dream...
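
One crude way to carry that "why" across inferences without any new architecture is an explicit state object that each call both reads and rewrites. This is only a sketch of the idea under that assumption; nothing like it appears in the paper, and it trusts the model to emit valid JSON.

```python
import json

state = {"goal": "", "why": "", "open_tasks": []}

def step(user_msg, llm):
    """Inject the current intent state into the prompt, then parse
    an updated state back out of the model's reply."""
    prompt = (f"Current state: {json.dumps(state)}\n"
              f"User: {user_msg}\n"
              "Reply, then output the updated state as JSON "
              "on the final line.")
    reply = llm(prompt)
    *answer, new_state = reply.splitlines()  # last line = JSON state
    state.update(json.loads(new_state))
    return "\n".join(answer)
```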

Sirmrmeowmeow

Long-term storage of EM-LLM memory segments can probably be managed in a graph structure, similar to vector storage within Neo4j graph databases.
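
A sketch of what that could look like with the official neo4j Python driver: one node per memory segment with an embedding property, chained by NEXT edges to preserve temporal order. The schema and property names are my own invention, not anything EM-LLM prescribes.

```python
from neo4j import GraphDatabase  # pip install neo4j

driver = GraphDatabase.driver("bolt://localhost:7687",
                              auth=("neo4j", "password"))

def store_segment(seg_id, text, embedding, prev_id=None):
    """Store one episodic segment as a node and link it to its
    predecessor so contiguous episodes stay connected."""
    with driver.session() as session:
        session.run("MERGE (s:Segment {id: $id}) "
                    "SET s.text = $text, s.embedding = $emb",
                    id=seg_id, text=text, emb=embedding)
        if prev_id is not None:
            session.run("MATCH (a:Segment {id: $prev}), "
                        "(b:Segment {id: $id}) "
                        "MERGE (a)-[:NEXT]->(b)",
                        prev=prev_id, id=seg_id)
```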

A related development is the release of Falcon Mamba 7B. Apparently, increasing the amount of context included in the prompt does not increase the requirement for RAM.

johnkintree

It's odd that they didn't try accumulating the tokens in an episode, and chose a single one instead.

deltamico

I don't understand any of this, but does it mean LLMs will be able to do more things without needing training or specially created vectors to help them understand what we are trying to do? Because I could wait for that.

badashphilosophy

So it can recommend paragraph, section and chapter breaks? And from that build an index? Finally, an AI boredom graph.

kimcosmos

This really needed lots of... visualization... I tried to visualize it in my mind and failed, so I failed to understand it.

RickySupriyadi

They should change the title from “infinite” context to “unbounded” context, as “infinite context” implies something physically impossible.

jmarkinman

So, technically, the next GPT could have ADHD? And if so, did we just solve the mathematical form of ADHD?

cinnamonroll