Program a RAG LLM Chat App with LangChain, Streamlit, OpenAI and Anthropic APIs

Learn how to build and deploy a RAG web application using Python, Streamlit and LangChain, so multiple users can chat online with documents, websites and other custom data.

In this RAG LLM course, we will learn how to develop a Retrieval Augmented Generation (RAG) pipeline step by step, and how to integrate it into a Chat Web App, in Python, using LangChain and Streamlit.

As you probably already know, LLMs are trained on large amounts of public data up to a certain date. Any fact that is either not public, newer, or quite niche is essentially unknown to them. Although newer models tend to be better at recalling facts that were in the training set, they are still far from perfect. This can be a limiting factor for many tasks that, for one reason or another, require an LLM that has to know specific topics very precisely.

RAG consists of connecting a source of custom information to our LLM chat pipeline. Before sending a question to the model, we automatically retrieve the most relevant fragments of context from this database and place them next to the question, so the model has the precise details in the context itself. This way, the model knows exactly what we are talking about and where the information comes from, and we can update that information at almost no cost and without needing a GPU. We can use any already available LLM, like GPT-4o from the OpenAI API (now or soon even o1 and o1-mini!), Claude 3.5 from the Anthropic API, or even open-source models with their original weights, cheaply and efficiently. If a better model appears tomorrow, we can plug it into our RAG pipeline almost immediately and take advantage of it without having to fine-tune any LLM again.
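To make the idea concrete, here is a minimal, framework-free sketch of that retrieve-then-ask step. The word-overlap scoring is a toy stand-in for the real embedding similarity a vector store would use, and all names and example fragments are illustrative, not from the actual app:

```python
# Toy RAG prompt assembly: pick the most relevant stored fragments for a
# question and stuff them into the prompt before calling the LLM.

STOPWORDS = {"what", "is", "the", "a", "an", "of", "to", "on", "are"}

def words(text: str) -> set[str]:
    """Lowercased, punctuation-stripped content words of a text."""
    return {w.strip(".,?!").lower() for w in text.split()} - STOPWORDS

def score(question: str, fragment: str) -> int:
    """Toy relevance score: shared content words (stand-in for vector similarity)."""
    return len(words(question) & words(fragment))

def build_rag_prompt(question: str, fragments: list[str], k: int = 2) -> str:
    """Select the top-k fragments for the question and place them in the prompt."""
    top = sorted(fragments, key=lambda f: score(question, f), reverse=True)[:k]
    context = "\n".join(f"- {f}" for f in top)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

fragments = [
    "Our refund policy allows returns within 30 days.",
    "The office is closed on public holidays.",
    "Refunds are issued to the original payment method.",
]
prompt = build_rag_prompt("What is the refund policy?", fragments)
print(prompt)
```

The assembled prompt, rather than the bare question, is what gets sent to the model, which is why the model can answer precisely and why swapping in a better LLM later requires no retraining.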

In summary, this is an AI coding tutorial on how to use the LangChain chains create_history_aware_retriever, create_retrieval_chain and create_stuff_documents_chain to retrieve data from a Chroma DB vector store, where we will have stored our custom data embeddings using the OpenAI Embeddings model. This data will have been loaded with several different LangChain document loaders and split using RecursiveCharacterTextSplitter. What's more, you will see how to use the OpenAI API and the Anthropic API to make requests and get answers from their Large Language Models.
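As a rough illustration of what RecursiveCharacterTextSplitter does under the hood, here is a simplified, framework-free sketch (the real LangChain class additionally supports chunk overlap, regex separators and custom length functions; this version only shows the recursive idea):

```python
# Simplified recursive character splitting: try the coarsest separator
# first; any piece still longer than chunk_size is re-split with the
# next, finer separator, falling back to a hard cut at the end.

def recursive_split(text, chunk_size=100, separators=("\n\n", "\n", " ", "")):
    if len(text) <= chunk_size:
        return [text]
    sep, rest = separators[0], separators[1:]
    if sep == "":
        # Last resort: hard cut every chunk_size characters.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate          # keep merging small pieces
        else:
            if current:
                chunks.append(current)   # flush what we have so far
            if len(piece) <= chunk_size:
                current = piece
            else:
                # Piece is still too big: re-split with a finer separator.
                chunks.extend(recursive_split(piece, chunk_size, rest))
                current = ""
    if current:
        chunks.append(current)
    return chunks

text = (
    "LangChain splits long documents before embedding them.\n\n"
    "Each chunk is embedded and stored in the vector store, "
    "so the retriever can later return only the relevant pieces."
)
chunks = recursive_split(text, chunk_size=60)
print(chunks)
```

Each resulting chunk is what gets embedded and stored in the Chroma vector store, so the retriever can later return only the pieces relevant to a question instead of the whole document.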

💡 Make sure to follow me on Medium, YouTube and GitHub: in the next blog and video we will see how to deploy this app to Azure, using GPT-4o and GPT-4o mini through Azure OpenAI Service and adding SSO authentication in front of our app, so only authorized users under our Azure subscription (for example, your work colleagues) can access it; no one else will spend our resources or steal our data!

Sections:
00:00 - Intro
2:08 - What is RAG and why it's better than Fine Tuning
7:34 - RAG in Python with LangChain step by step
19:00 - Integrating RAG into an LLM Chat web app
37:20 - Deploy the RAG web app online for free!

Subscribe to see more AI and ML programming related content! 🚀🚀

-------------------------------------------------------------

#gpt #gpt4o #gpt4 #openai #promptengineering #langchain #o1 #openaio1 #chatgpto1 #openaistrawbery #chatgpt #openaiapi #python #streamlit #github #cloud #portfolio #agent #gpt #aiagents #automation #ai #streamlit #llm #copilot #chatgpt4o #omnichat #omnidata #howtochatgpt #github #git #vscode #gui #pythongui #stream #modelstream #streaming #llmstream #llmstreaming #openaistream #openaistreaming #rag #retrievalaugmentedgeneration #langchainclaude #anthropiclangchain #llamaindex #ollama #llamacpp
Comments

Thank you for the detailed explanation. Do you have any content on how to perform load testing for the Streamlit chat application?

joytheultimate

Getting an error creating vector_db: it says "The onnxruntime python package is not installed." But it is already installed. I'm using Python 3.11.

MdSaifuddinShaikh

Looks great. When loading a PDF I got "Error loading document sample.pdf: cryptography>=3.1 is required for AES algorithm".

Naejbert

Thank you very much for explaining so clearly. The app worked locally for me, and I am using Python 3.13. To deploy to Streamlit Cloud I am facing a few issues with versions. Do we need a Python version lower than 3.11 to deploy to Streamlit?

SivaK-pf

Could you please make a video on how to upload a vector store database to the cloud for reuse in the future?

phonglehoang

Is there any way to do it without giving my credit card to OpenAI? On the first try I got the "You exceeded your current quota" message.

nicobonder

Is there a reason why we're not using Google's Gemini?

손현수-co

Can you make it with Google Gemini models or other models? These are paid models.

sarveshudapurkar

We are waiting for continued updates in videos and the blog ❤.

sitheekmohamedarsath