Chatbot Memory for Chat-GPT, Davinci + other LLMs - LangChain #4

Conversational memory is how a chatbot can respond to multiple queries in a chat-like manner. It enables a coherent conversation, and without it, every query would be treated as an entirely independent input without considering past interactions.

The memory allows a Large Language Model (LLM) to remember previous interactions with the user. By default, LLMs are *stateless* — meaning each incoming query is processed independently of other interactions. The only thing that exists for a stateless agent is the current input, nothing else.

There are many applications where remembering previous interactions is very important, such as chatbots. Conversational memory allows us to do that.

There are several ways that we can implement conversational memory. In the context of LangChain, they are all built on top of the `ConversationChain`.
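As a minimal illustration of that idea, here is a sketch of a `ConversationChain` with the default buffer memory (import paths reflect early-2023 LangChain and may have moved in later versions):

```python
from langchain import OpenAI
from langchain.chains import ConversationChain
from langchain.chains.conversation.memory import ConversationBufferMemory

# ConversationChain wires an LLM to a memory object; the memory re-injects
# past turns into each new prompt, so the otherwise stateless LLM can
# "remember" the conversation.
llm = OpenAI(temperature=0)
conversation = ConversationChain(llm=llm, memory=ConversationBufferMemory())

print(conversation.predict(input="Hi, my name is Ada."))
print(conversation.predict(input="What is my name?"))  # answered from memory
```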

🌲 Pinecone article:

📌 LangChain Handbook Code:

🙋🏽‍♂️ Francisco:

🎙️ AI Dev Studio:

🎉 Subscribe for Article and Video Updates!

👾 Discord:

00:00 Conversational memory for chatbots
00:28 Why we need conversational memory for chatbots
01:45 Implementation of conversational memory
04:05 LangChain's Conversation Chain
12:00 Conversation Summary Memory in LangChain
19:06 Conversation Buffer Window Memory in LangChain
21:35 Conversation Summary Buffer Memory in LangChain
24:33 Other LangChain Memory Types
25:25 Final thoughts on conversational memory

#artificialintelligence #nlp #openai #deeplearning #langchain
Comments

Thanks James for elaborating on LangChain Memory. For the viewers, here are some 🎯 Key Takeaways for quick navigation:

00:00 🧠 Conversational memory is essential for chatbots and AI agents to respond coherently to queries in a conversation.
01:23 📚 Different memory types, like conversational buffer memory and conversational summary memory, help manage and recall previous interactions in chatbots.
05:42 🔄 Conversational buffer memory stores all past interactions in a chat, while conversational summary memory summarizes these interactions, reducing token usage.
14:13 🪟 Conversational buffer window memory limits the number of recent interactions saved, offering a balance between token usage and remembering recent interactions.
23:05 📊 Conversational summary buffer memory combines summarization and saving recent interactions, providing flexibility in managing conversation history.
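For quick reference alongside these takeaways, a hedged sketch of how each memory type is instantiated (import path as of the video's early-2023 LangChain; the `k` and `max_token_limit` values are arbitrary):

```python
from langchain import OpenAI
from langchain.chains.conversation.memory import (
    ConversationBufferMemory,
    ConversationSummaryMemory,
    ConversationBufferWindowMemory,
    ConversationSummaryBufferMemory,
)

llm = OpenAI(temperature=0)

buffer = ConversationBufferMemory()           # keeps every turn verbatim
summary = ConversationSummaryMemory(llm=llm)  # rolling LLM-written summary
window = ConversationBufferWindowMemory(k=5)  # only the last k turns
summary_buffer = ConversationSummaryBufferMemory(  # summary of old turns + recent turns verbatim
    llm=llm, max_token_limit=650
)
```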

We are also doing lots of workshops in this space, looking forward to talking more.

decodingdatascience

Super! For me, it is one of the best tutorials on this subject. Much appreciated, James.

MrFiveDirections

Things really seem to get interesting with the knowledge graph! Saving things that really matter, like relational context, along with a combination of the other methods, starts to sound very powerful. Add in some embedding/vector DB and wow. The other commenter's idea about a system for bots evolving sentiment, or even personality, over time is worth thinking about as well.

daharius

Thank you. I was way behind on LangChain and had no time to read the documentation. This video saved me a lot of time. Subscribed.

GergelyGyurics

Another masterpiece of a tutorial. You’re an absolute gem James!

kevon

Oh wow, you just destroyed my project lol. I gave ChatGPT long-term memory, autonomous memory store and recall, speech recognition, audio output, and self-reflection. Thought I was the only one working on stuff like this. Well, I'm basically trying to build a sentient AI; I need vision though. Hopefully GPT-4 is multimodal, because I'm struggling to give my project vision recognition.

NextGenart

Cool! This video addressed the question that I had posed in your earlier (1st) video about the token size limitations due to adding conversational history. The charts provide a good intuition of the workings of the memory types. Two takeaways: 1. When to use which memory type. 2. How to do performance tuning for a chatbot app, given the overheads posed by token tracking, memory appending, and so on.

cloudshoring

If I understand the graphs correctly, what is represented is the tokens used per interaction: in the case of Buffer Memory (the quasi-linear one), the 25th interaction is about 4K tokens. But the cost (in tokens) of the whole conversation up to the 25th interaction is the sum of the cost of all the interactions up to the 25th. So basically the cost of the conversation, in each case, is the area under the curves you showed, not the highest point reached. For the summarized conversations, the flat tendency towards the end means the cost just keeps adding almost the same number of tokens per new interaction, not that the cost of the conversation has reached a ceiling.
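A quick numeric sketch of that point, with made-up per-turn token counts chosen only to mimic the shapes of the charts:

```python
# Total conversation cost is the SUM (area under the curve) of per-interaction
# token counts, not the final per-turn value. All numbers here are illustrative.
buffer_tokens_per_turn = [160 * i for i in range(1, 26)]                   # grows roughly linearly
summary_tokens_per_turn = [min(400 + 40 * i, 1200) for i in range(1, 26)]  # flattens out

print(max(buffer_tokens_per_turn))   # ~4000 tokens at the 25th interaction
print(sum(buffer_tokens_per_turn))   # total cost = area under the curve, far larger
print(sum(summary_tokens_per_turn))  # summarization keeps the running total lower
```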

adumont

Great explanation of memory in LangChain; when you show the charts, it's much clearer for me.

davidmoran

Skimming through the docs, LangChain seems like a complicated abstraction around what's essentially auto copy and paste.

jason_v

Check out David Shapiro’s latest approach with salient summarization when you get a chance. Essentially: The summarizer can more efficiently pick and choose which context to preserve if it is properly primed with specific objectives/goals for the information.
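One hedged way to approximate salient summarization in LangChain: `ConversationSummaryMemory` accepts a custom `prompt`, so the summarizer can be primed with objectives. The objective wording below is invented for illustration:

```python
from langchain import OpenAI, PromptTemplate
from langchain.chains import ConversationChain
from langchain.chains.conversation.memory import ConversationSummaryMemory

# Summary prompt primed with explicit objectives so the summarizer knows
# which context to preserve; the objectives listed are illustrative only.
salient_prompt = PromptTemplate(
    input_variables=["summary", "new_lines"],
    template=(
        "Progressively summarize the conversation, prioritizing the user's "
        "stated goals, decisions made, and open questions.\n\n"
        "Current summary:\n{summary}\n\n"
        "New lines of conversation:\n{new_lines}\n\n"
        "New summary:"
    ),
)

llm = OpenAI(temperature=0)
memory = ConversationSummaryMemory(llm=llm, prompt=salient_prompt)
chain = ConversationChain(llm=llm, memory=memory)
```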

THCV

Thanks for your content! Looking forward to watching the knowledge graph video :)

DavidGarcia-gdvq

Great video! I love the graphs for token usage. I kept meaning to graph the trends myself, but I was too lazy!

I was talking to Harrison Chase as he was implementing the latest changes to memory, and it's had me thinking about other unique ways to approach it. I've been using different customized summarizers, and I can bring up any subset of the message history as I like, but I'm thinking also to include some way to flag messages as important or unimportant, dynamically feeding the history. I also haven't really explored my options in terms of local storage and retrieval of old chat history.

One note that I might make for the video too... I noticed you're using LangChain's usual OpenAI class and just adjusting your model to 3.5-turbo. My understanding is that we have been advised to use the new ChatOpenAI class for now when interacting with 3.5-turbo, since that's where they'll be focusing development and they can address changes there without breaking other stuff, necessary since the new model endpoint takes a message list as a parameter instead of a simple string.
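For reference, the switch described above looks roughly like this; a sketch only, with import paths as they were in early-2023 LangChain:

```python
from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationChain
from langchain.chains.conversation.memory import ConversationBufferMemory

# ChatOpenAI targets the chat endpoint, which takes a list of messages
# rather than a single prompt string.
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
conversation = ConversationChain(llm=llm, memory=ConversationBufferMemory())
print(conversation.predict(input="Hello!"))
```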

m.branson

James - are you still planning to work on the KG video? Seems like a powerful method that solves for scale and token limits.

gutgutia

Thanks for this content James, awesome!

matheusrdgsf

In the scenario of conversational bots, how do you limit the token consumption of the entire conversation?

For example, once consumption reaches 1,000 tokens, the bot should respond that the tokens for this conversation have been used up.
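One possible way to enforce such a budget, sketched with LangChain's `get_openai_callback` (which tallies tokens for calls made inside its block); the `ask` helper and the 1,000-token budget are illustrative assumptions:

```python
from langchain import OpenAI
from langchain.chains import ConversationChain
from langchain.chains.conversation.memory import ConversationBufferMemory
from langchain.callbacks import get_openai_callback

conversation = ConversationChain(
    llm=OpenAI(temperature=0), memory=ConversationBufferMemory()
)

total_tokens, budget = 0, 1_000

def ask(query: str) -> str:
    global total_tokens
    if total_tokens >= budget:
        return "The tokens for this conversation have been used up."
    with get_openai_callback() as cb:  # counts tokens for calls in this block
        reply = conversation.predict(input=query)
    total_tokens += cb.total_tokens    # prompt + completion tokens this turn
    return reply
```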

FCrobot

Thank you! Awesome work!! Appreciate it!

Davipar

Great content, thanks for that.

I'm working on a tweet summarization use case, but I don't want to break the overall corpus into pieces, build a summary for each one, and combine those summaries into a larger one. I want something more clever.

Suppose I have 10 tweets: 6 are related (same topic) and the last 4 are different from each other. I think I can build a better summary than LangChain's chunked summarization by summarizing only the 6 related tweets and appending the 4 raw tweets. This helps avoid losing context for the future.
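A rough sketch of that idea: embed the tweets, collect the ones that sit close together in embedding space, and summarize only that group. The 0.8 similarity threshold and the grouping-around-the-first-tweet heuristic are simplifying assumptions:

```python
import numpy as np
from langchain.embeddings import OpenAIEmbeddings

tweets = ["..."] * 10  # the 10 tweets; placeholders here

# Embed and L2-normalize so dot products are cosine similarities.
vecs = np.array(OpenAIEmbeddings().embed_documents(tweets))
vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)
sims = vecs @ vecs.T

# Naive grouping: a tweet joins the "related" group if it is similar enough
# (threshold is an assumption) to the first tweet.
threshold = 0.8
related = [i for i in range(len(tweets)) if sims[i, 0] > threshold]
outliers = [i for i in range(len(tweets)) if i not in related]
# Summarize only the `related` tweets with a summary chain, then append the
# raw `outliers` to the final output.
```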

isaacyimgaingkuissu

Hi James, great video. This is probably a stupid comment, but here goes… Could you not just ask the LLM to capture some key variables that summarise the completion for the prompt, and then feed that (rather than the full conversation) as 'memory' for subsequent prompts? I'm imagining a 'ghost' question being added to each prompt, like 'Also capture key variables to summarise the response for future recall', and then this being used as the assistant message (per GPT-3.5 Turbo) rather than all of the previous conversation?
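A hedged sketch of that 'ghost question' pattern, using the raw (pre-1.0) OpenAI chat API; the GHOST wording, the KEY VARIABLES: marker, and the chat helper are all invented for illustration:

```python
import openai  # pre-1.0 openai SDK interface assumed

GHOST = ("Also, after your answer, output a line starting with 'KEY VARIABLES:' "
         "listing the key facts needed to recall this exchange later.")
key_vars = ""  # carried forward instead of the full conversation

def chat(user_msg: str) -> str:
    global key_vars
    messages = [{"role": "system", "content": "You are a helpful assistant."}]
    if key_vars:
        # Feed back only the extracted variables, not the whole history.
        messages.append({"role": "assistant", "content": key_vars})
    messages.append({"role": "user", "content": f"{user_msg}\n\n{GHOST}"})
    resp = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
    text = resp["choices"][0]["message"]["content"]
    answer, sep, rest = text.partition("KEY VARIABLES:")
    if sep:
        key_vars = sep + rest.strip()  # keep the latest variables for next turn
    return answer.strip()
```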

bwilliams

Hi Sam, how do we keep the conversation context of multiple users on different devices separate?
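A common pattern for this (an assumption, not something shown in the video): keep one memory object per user or session and build the chain with that user's memory:

```python
from langchain import OpenAI
from langchain.chains import ConversationChain
from langchain.chains.conversation.memory import ConversationBufferMemory

llm = OpenAI(temperature=0)
memories: dict[str, ConversationBufferMemory] = {}  # one memory per user/session

def respond(user_id: str, query: str) -> str:
    # Each user gets their own memory object, so contexts never mix.
    memory = memories.setdefault(user_id, ConversationBufferMemory())
    return ConversationChain(llm=llm, memory=memory).predict(input=query)
```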

sysadmin