Chatbots with RAG: LangChain Full Walkthrough

In this video, we work through building a chatbot using Retrieval Augmented Generation (RAG) from start to finish. We use OpenAI's gpt-3.5-turbo Large Language Model (LLM) as the "engine", implemented via LangChain's ChatOpenAI class, OpenAI's text-embedding-ada-002 model for embeddings, and the Pinecone vector database as our knowledge base.
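
A minimal sketch of the chatbot core described above, assuming the 2023-era LangChain API (langchain 0.0.x); the exact notebook code may differ:

    # Minimal chat loop with LangChain's ChatOpenAI wrapper around gpt-3.5-turbo.
    from langchain.chat_models import ChatOpenAI
    from langchain.schema import SystemMessage, HumanMessage

    chat = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0.0)

    messages = [
        SystemMessage(content="You are a helpful assistant."),
        HumanMessage(content="What is retrieval augmented generation?"),
    ]
    response = chat(messages)  # returns an AIMessage
    messages.append(response)  # keep the reply in the running chat history
    print(response.content)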

📌 Code:

🌲 Subscribe for Latest Articles and Videos:

👋🏼 AI Consulting:

👾 Discord:

00:00 Chatbots with RAG
00:59 RAG Pipeline
02:35 Hallucinations in LLMs
04:08 LangChain ChatOpenAI Chatbot
09:11 Reducing LLM Hallucinations
13:37 Adding Context to Prompts
17:47 Building the Vector Database
25:14 Adding RAG to Chatbot
28:52 Testing the RAG Chatbot
32:56 Important Notes when using RAG

#artificialintelligence #nlp #ai #langchain #openai #vectordb
Comments

Excellent tutorial, with a clear before-and-after comparison of RAG. The Python code walkthrough is useful.

nadranaj

Nice, glad you went over this. I've been stuck on part of my RAG, but I'm sure you're going to clear that up! 🎉 Thanks

Cdaprod

Very informative video. I got an understanding of RAG, and of LangChain as well, and successfully created my very first RAG-based chatbot. Learned a lot 👍

ceqwfwk

Amazing, James!
Really helpful material.

talharauf

🎯 Key Takeaways for quick navigation:

00:00 🤖 This video demonstrates how to build a chatbot using retrieval augmented generation (RAG) with OpenAI's GPT-3.5 model to answer questions about recent events or internal documentation.
02:58 💡 Language models like GPT-3.5 can sometimes provide inaccurate or hallucinated information because they rely solely on their training data and don't have access to external knowledge.
08:16 📚 Chatbots can use external knowledge, or "source knowledge", to improve responses. This can be done by inserting context or documents into the prompt.
16:09 🔍 Retrieval Augmented Generation (RAG) combines parametric knowledge from the model's training with source knowledge from external documents, allowing the model to access and update information.
19:33 🧠 Pinecone is used as a vector database to store embeddings of external documents for RAG, allowing for efficient retrieval and updating of knowledge.
23:44 🚀 Embedding documents for retrieval in RAG is typically done in batches, to avoid overloading the model and network with too many embeddings at once.
24:25 📊 Preparing external documents for retrieval with Pinecone should respect batch size limitations to avoid problems (see the first sketch after this list).
26:00 📄 Initializing a vector database and embedding text chunks for retrieval with Pinecone.
27:07 🤖 Using source knowledge from external documents to augment prompts and improve chatbot responses (see the second sketch after this list).
28:43 🧠 Augmenting queries with source knowledge for chatbot responses, demonstrating improved answer quality.
31:14 🛡️ Retrieval Augmented Generation (RAG) can significantly enhance the retrieval performance of chatbots by utilizing external knowledge.
32:23 ⏳ Implementing RAG with a simple approach but noting that it may not be suitable for all types of queries.
33:47 📈 Benefits of using RAG include improved retrieval performance, citation of information sources, and faster response times.
34:43 💰 Considerations when using RAG include token usage, cost, and potential performance degradation when feeding too much information into the chatbot.
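
A hedged sketch of the batched embed-and-upsert step from the takeaways above, assuming the 2023-era pinecone-client v2 API and LangChain's OpenAIEmbeddings; the index name "rag-demo" and the placeholder docs list are illustrative, not from the video:

    # Embed and upsert document chunks in batches so no single API call is too large.
    import pinecone
    from langchain.embeddings.openai import OpenAIEmbeddings

    pinecone.init(api_key="YOUR_API_KEY", environment="YOUR_ENV")
    index = pinecone.Index("rag-demo")  # hypothetical index name

    embed = OpenAIEmbeddings(model="text-embedding-ada-002")

    # Placeholder chunks; in practice these come from your chunked documents.
    docs = [{"text": "example chunk 1"}, {"text": "example chunk 2"}]

    batch_size = 100  # keep embed/upsert calls comfortably under API limits
    for i in range(0, len(docs), batch_size):
        batch = docs[i:i + batch_size]
        ids = [str(i + j) for j in range(len(batch))]
        texts = [d["text"] for d in batch]
        embeds = embed.embed_documents(texts)  # one embedding call per batch
        metadata = [{"text": t} for t in texts]
        index.upsert(vectors=zip(ids, embeds, metadata))  # (id, vector, metadata) triples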
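And a sketch of the query-augmentation step, reusing the index, embed, chat, and messages objects from the sketches above; the prompt template wording is illustrative:

    # Wrap the Pinecone index in LangChain's vectorstore and splice retrieved
    # chunks into the prompt as "source knowledge".
    from langchain.vectorstores import Pinecone
    from langchain.schema import HumanMessage

    vectorstore = Pinecone(index, embed.embed_query, "text")  # "text" = metadata field holding the chunk

    def augment_prompt(query: str, k: int = 3) -> str:
        results = vectorstore.similarity_search(query, k=k)  # k most similar chunks
        source_knowledge = "\n".join(doc.page_content for doc in results)
        return (
            "Using the contexts below, answer the query.\n\n"
            f"Contexts:\n{source_knowledge}\n\n"
            f"Query: {query}"
        )

    messages.append(HumanMessage(content=augment_prompt("What is so special about Llama 2?")))
    print(chat(messages).content)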

Made with HARPA AI

niatro

Great video! Congrats 👏 You're an excellent speaker, keep it up!

francescomiliani

Thanks a lot, buddy. Keep up the hard work; you help me so much!

iddoshemtovv

Thank you for your videos. They are all awesome. Could you please also talk about how we can extract more granular metadata from our documents (like page numbers, for example), and how we can force an agent to add that metadata to the final answer it provides?

raminzandvakili

Nice video. Any tips for how to handle the "Hi, how are you?" case, where you don't want it to do the retrieval part? I'm thinking there must be a way to limit retrieval results based on a similarity score threshold or something. Thanks!

kevinb
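
On the thresholding idea in the comment above, a hedged sketch, assuming the Pinecone-backed LangChain vectorstore from the earlier sketches; similarity_search_with_score returns (Document, score) pairs, and the maybe_augment helper and 0.75 cutoff are hypothetical and would need tuning:

    # Gate retrieval on similarity score: skip augmentation for small talk.
    def maybe_augment(query: str, k: int = 3, threshold: float = 0.75) -> str:
        results = vectorstore.similarity_search_with_score(query, k=k)
        # For Pinecone's cosine metric, higher scores mean more similar;
        # some other stores return distances where lower is better.
        relevant = [doc for doc, score in results if score >= threshold]
        if not relevant:  # e.g. "Hi, how are you?" matches nothing well
            return query  # pass the query through unaugmented
        context = "\n".join(doc.page_content for doc in relevant)
        return (
            "Using the contexts below, answer the query.\n\n"
            f"Contexts:\n{context}\n\n"
            f"Query: {query}"
        )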

Hi, your video was great. I have a question: what are the advantages of using Pinecone over Faiss to store your vectors?

hellkaiser

Hey, wonderful explanation.
Could you please help me understand one thing:
at 30:35, Jupyter cell no. 36, the response from the augmented prompt (about what's special about Llama 2) isn't appended to the messages. How come, in the next run with the "safety measures" prompt, the model already has some information about Llama 2 when we haven't actually given it the chat history?

Taranggpt

Is there any technique that can be used to verify each sentence of the response against the given context? Other than feeding it to the LLM, since that could itself hallucinate and consumes OpenAI tokens.

aravindudupa

I think the initial answers from the AI about string theory were correct. I'm not an expert either, but it seems like they were. Also, string theory is widely covered in the media and popular culture, so it would make sense for the AI to be well informed about it :)

elenakusevska

Would this work with other LLMs from Hugging Face, and with other vector stores like Chroma?
If so, what LLM would you recommend?

mahmoudqahawish

Thanks for the video :) Can you update your vector database with a few lines (if you want to add data to your knowledge base) automatically, by running a Python script or something like that?

da-bbup

Question: once a vector store is loaded, how can we export a dataset from the store to be used for fine-tuning?

xspydazx

Would it be possible to use some form of RAG pipeline to help improve an LLM's ability to classify things in a niche domain? E.g., let's say we have 50 unusual domains (that weren't in the LLM's training data, or at least had little exposure). We create a vector store with 50 labels and a few thousand examples of how each label is expressed. Would that work?

therealsamho

Would SystemMessage and AIMessage work for Llama 2 too, since its syntax requires [INST] and [/INST]?

yuktier
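
On the [INST] question above: LangChain's message classes are model-agnostic containers, so for a raw Llama 2 chat model you would render them into Meta's [INST] template yourself. A hedged sketch (the to_llama2_prompt helper is hypothetical, and many inference wrappers already do this conversion for you):

    # Render a LangChain-style message list into Llama 2's chat prompt format.
    from langchain.schema import SystemMessage, HumanMessage, AIMessage

    def to_llama2_prompt(messages) -> str:
        system = next((m.content for m in messages if isinstance(m, SystemMessage)), "")
        prompt = f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n"
        first = True
        for m in messages:
            if isinstance(m, HumanMessage):
                # The first user turn shares its [INST] with the system block.
                prompt += f"{m.content} [/INST]" if first else f"<s>[INST] {m.content} [/INST]"
                first = False
            elif isinstance(m, AIMessage):
                prompt += f" {m.content} </s>"
        return prompt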

Great walkthrough as always. My question would be: is it better to use an LLM chain, like a conversational chain, or is a chatbot better for RAG? Also, what about RAG agents, although they run on a huge amount of tokens?

andriusem

Very helpful video. Is this doable with Llama 2 instead of using LangChain's ChatOpenAI() class? Basically, I am trying to do the same with Llama instead of OpenAI's GPT model.

dipayansarkar