Chatbots with RAG: LangChain Full Walkthrough

In this video, we work through building a chatbot using Retrieval Augmented Generation (RAG) from start to finish. We use OpenAI's gpt-3.5-turbo Large Language Model (LLM) as the "engine", implemented via LangChain's ChatOpenAI class, OpenAI's text-embedding-ada-002 model for embeddings, and the Pinecone vector database as our knowledge base.
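
A minimal sketch of the chatbot core described above, assuming the 2023-era LangChain API (langchain 0.0.x); the exact notebook code may differ:

    # Minimal chat loop with LangChain's ChatOpenAI wrapper around gpt-3.5-turbo.
    from langchain.chat_models import ChatOpenAI
    from langchain.schema import SystemMessage, HumanMessage

    chat = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0.0)

    messages = [
        SystemMessage(content="You are a helpful assistant."),
        HumanMessage(content="What is retrieval augmented generation?"),
    ]
    response = chat(messages)  # returns an AIMessage
    messages.append(response)  # keep the reply in the running chat history
    print(response.content)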

📌 Code:

🌲 Subscribe for Latest Articles and Videos:

👋🏼 AI Consulting:

👾 Discord:

00:00 Chatbots with RAG
00:59 RAG Pipeline
02:35 Hallucinations in LLMs
04:08 LangChain ChatOpenAI Chatbot
09:11 Reducing LLM Hallucinations
13:37 Adding Context to Prompts
17:47 Building the Vector Database
25:14 Adding RAG to Chatbot
28:52 Testing the RAG Chatbot
32:56 Important Notes when using RAG

#artificialintelligence #nlp #ai #langchain #openai #vectordb
Comments

Excellent tutorial, with a clear before-and-after comparison of RAG. The Python code walkthrough is useful.

nadranaj

Nice, glad you went over this. I've been stuck on part of my RAG, but I'm sure you're going to clear that up! 🎉 Thanks

Cdaprod

Very informative video. I got an understanding of RAG, and of LangChain as well, and successfully created my very first RAG-based chatbot. Learned a lot 👍

ceqwfwk

Amazing, James!
Really helpful material.

talharauf

🎯 Key Takeaways for quick navigation:

00:00 🤖 This video demonstrates how to build a chatbot using retrieval augmented generation (RAG) with OpenAI's GPT-3.5 model to answer questions about recent events or internal documentation.
02:58 💡 Language models like GPT-3.5 can sometimes provide inaccurate or hallucinated information because they rely solely on their training data and don't have access to external knowledge.
08:16 📚 Chatbots can use external knowledge, or "source knowledge", to improve responses. This can be done by inserting context or documents into the prompt.
16:09 🔍 Retrieval Augmented Generation (RAG) combines parametric knowledge from the model's training with source knowledge from external documents, allowing the model to access and update information.
19:33 🧠 Pinecone is used as a vector database to store embeddings of external documents for RAG, allowing for efficient retrieval and updating of knowledge.
23:44 🚀 Embedding documents for retrieval in RAG is typically done in batches, to avoid overloading the model and network with too many embeddings at once.
24:25 📊 Preparing external documents for retrieval with Pinecone should respect batch size limitations to avoid problems (see the first sketch after this list).
26:00 📄 Initializing a vector database and embedding text chunks for retrieval with Pinecone.
27:07 🤖 Using source knowledge from external documents to augment prompts and improve chatbot responses (see the second sketch after this list).
28:43 🧠 Augmenting queries with source knowledge for chatbot responses, demonstrating improved answer quality.
31:14 🛡️ Retrieval Augmented Generation (RAG) can significantly enhance the retrieval performance of chatbots by utilizing external knowledge.
32:23 ⏳ Implementing RAG with a simple approach but noting that it may not be suitable for all types of queries.
33:47 📈 Benefits of using RAG include improved retrieval performance, citation of information sources, and faster response times.
34:43 💰 Considerations when using RAG include token usage, cost, and potential performance degradation when feeding too much information into the chatbot.
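
A hedged sketch of the batched embed-and-upsert step from the takeaways above, assuming the 2023-era pinecone-client v2 API and LangChain's OpenAIEmbeddings; the index name "rag-demo" and the placeholder docs list are illustrative, not from the video:

    # Embed and upsert document chunks in batches so no single API call is too large.
    import pinecone
    from langchain.embeddings.openai import OpenAIEmbeddings

    pinecone.init(api_key="YOUR_API_KEY", environment="YOUR_ENV")
    index = pinecone.Index("rag-demo")  # hypothetical index name

    embed = OpenAIEmbeddings(model="text-embedding-ada-002")

    # Placeholder chunks; in practice these come from your chunked documents.
    docs = [{"text": "example chunk 1"}, {"text": "example chunk 2"}]

    batch_size = 100  # keep embed/upsert calls comfortably under API limits
    for i in range(0, len(docs), batch_size):
        batch = docs[i:i + batch_size]
        ids = [str(i + j) for j in range(len(batch))]
        texts = [d["text"] for d in batch]
        embeds = embed.embed_documents(texts)  # one embedding call per batch
        metadata = [{"text": t} for t in texts]
        index.upsert(vectors=zip(ids, embeds, metadata))  # (id, vector, metadata) triples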
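And a sketch of the query-augmentation step, reusing the index, embed, chat, and messages objects from the sketches above; the prompt template wording is illustrative:

    # Wrap the Pinecone index in LangChain's vectorstore and splice retrieved
    # chunks into the prompt as "source knowledge".
    from langchain.vectorstores import Pinecone
    from langchain.schema import HumanMessage

    vectorstore = Pinecone(index, embed.embed_query, "text")  # "text" = metadata field holding the chunk

    def augment_prompt(query: str, k: int = 3) -> str:
        results = vectorstore.similarity_search(query, k=k)  # k most similar chunks
        source_knowledge = "\n".join(doc.page_content for doc in results)
        return (
            "Using the contexts below, answer the query.\n\n"
            f"Contexts:\n{source_knowledge}\n\n"
            f"Query: {query}"
        )

    messages.append(HumanMessage(content=augment_prompt("What is so special about Llama 2?")))
    print(chat(messages).content)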

Made with HARPA AI

niatro

Great video! Congrats 👏 You're an excellent speaker, keep it up!

francescomiliani

Thanks a lot, buddy. Keep up the hard work; you help me so much!

iddoshemtovv

Thank you for your videos. They are all awesome. Could you please also talk about how we can extract more granular metadata from our documents (like page numbers, for example), and how we can force an agent to add that metadata to the final answer it provides?

raminzandvakili

Nice video. Any tips for how to handle the "Hi, how are you?" case, where you don't want it to do the retrieval part? I'm thinking there must be a way to limit retrieval results based on a similarity score threshold or something. Thanks!

kevinb
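
On the thresholding idea in the comment above, a hedged sketch, assuming the Pinecone-backed LangChain vectorstore from the earlier sketches; similarity_search_with_score returns (Document, score) pairs, and the maybe_augment helper and 0.75 cutoff are hypothetical and would need tuning:

    # Gate retrieval on similarity score: skip augmentation for small talk.
    def maybe_augment(query: str, k: int = 3, threshold: float = 0.75) -> str:
        results = vectorstore.similarity_search_with_score(query, k=k)
        # For Pinecone's cosine metric, higher scores mean more similar;
        # some other stores return distances where lower is better.
        relevant = [doc for doc, score in results if score >= threshold]
        if not relevant:  # e.g. "Hi, how are you?" matches nothing well
            return query  # pass the query through unaugmented
        context = "\n".join(doc.page_content for doc in relevant)
        return (
            "Using the contexts below, answer the query.\n\n"
            f"Contexts:\n{context}\n\n"
            f"Query: {query}"
        )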

Hi, your video was great. I have a question: what are the advantages of using Pinecone over Faiss to store your vectors?

hellkaiser

Hey, wonderful explanation.
Could you please help me understand one thing:
at 30:35, Jupyter cell no. 36, the response from the augmented prompt (about what's special about Llama 2) isn't appended to the messages. How come, in the next run with the "safety measures" prompt, the model already has some information about Llama 2 when we haven't actually given it the chat history?

Taranggpt

Is there any technique that can be used to verify each sentence of the response against the given context? Other than feeding it to the LLM, since that could itself hallucinate and consumes OpenAI tokens.

aravindudupa

I think the initial answers from the AI about string theory were correct. I'm not an expert either, but it seems like they were. Also, string theory is widely covered in the media and popular culture, so it would make sense for the AI to be well informed about it :)

elenakusevska

Would this work with other LLMs from Hugging Face, and with other vector stores like Chroma?
If so, what LLM would you recommend?

mahmoudqahawish

Thanks for the video :) Can you update your vector database with a few lines (if you want to add data to your knowledge base) automatically, by running a Python script or something like that?

da-bbup

Question: once a vector store is loaded, how can we export a dataset from the store to be used for fine-tuning?

xspydazx

Would it be possible to use some form of RAG pipeline to help improve an LLM's ability to classify things in a niche domain? E.g., let's say we have 50 unusual domains (that weren't in the LLM's training data, or at least had little exposure). We create a vector store with 50 labels and a few thousand examples of how each label is expressed. Would that work?

therealsamho

Would SystemMessage and AIMessage work for Llama 2 too, since its syntax requires [INST] and [/INST]?

yuktier
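
On the [INST] question above: LangChain's message classes are model-agnostic containers, so for a raw Llama 2 chat model you would render them into Meta's [INST] template yourself. A hedged sketch (the to_llama2_prompt helper is hypothetical, and many inference wrappers already do this conversion for you):

    # Render a LangChain-style message list into Llama 2's chat prompt format.
    from langchain.schema import SystemMessage, HumanMessage, AIMessage

    def to_llama2_prompt(messages) -> str:
        system = next((m.content for m in messages if isinstance(m, SystemMessage)), "")
        prompt = f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n"
        first = True
        for m in messages:
            if isinstance(m, HumanMessage):
                # The first user turn shares its [INST] with the system block.
                prompt += f"{m.content} [/INST]" if first else f"<s>[INST] {m.content} [/INST]"
                first = False
            elif isinstance(m, AIMessage):
                prompt += f" {m.content} </s>"
        return prompt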

Great walkthrough as always. My question would be: is it better to use an LLM chain, like a conversational chain, or is a chatbot better for RAG? Also, what about RAG agents, although they run on a huge amount of tokens?

andriusem

Very helpful video. Is this doable with Llama 2 instead of using LangChain's ChatOpenAI() class? Basically, I am trying to do the same with Llama instead of OpenAI's GPT model.

dipayansarkar