Program a RAG LLM Chat App with LangChain, Streamlit, OpenAI and Anthropic APIs

Learn how to build and deploy a RAG web application using Python, Streamlit and LangChain, so multiple users can chat online with documents, websites and other custom data.

In this RAG LLM course, we will learn how to develop a Retrieval Augmented Generation (RAG) pipeline step by step, and how to integrate it into a Chat Web App, in Python, using LangChain and Streamlit.

As you probably already know, LLMs are trained on large amounts of public data up to a certain date. Any fact that is either not public, newer, or quite niche is essentially unknown to them. Although newer models tend to be better at recalling facts that were in the training set, they are still far from perfect. This can be a limiting factor for many tasks that, for one reason or another, require an LLM that has to know specific topics very precisely.

RAG consists of connecting a source of custom information to our LLM chat pipeline. Before sending a question to the model, we automatically retrieve the most relevant fragments of context from this database and place them next to the question, so the model has the precise details in the context itself. This way, the model knows exactly what we are talking about and where the information comes from, and we can update that information at almost no cost and without needing a GPU. We can use any already available LLM, like GPT-4o from the OpenAI API (now or soon even o1 and o1-mini!), Claude 3.5 from the Anthropic API, or even open-source models with their original weights, cheaply and efficiently. If a better model appears tomorrow, we can plug it into our RAG pipeline almost immediately and take advantage of it without having to fine-tune any LLM again.
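To make the idea concrete, here is a minimal, framework-free sketch of that retrieve-then-ask step. The word-overlap scoring is a toy stand-in for the real embedding similarity a vector store would use, and all names and example fragments are illustrative, not from the actual app:

```python
# Toy RAG prompt assembly: pick the most relevant stored fragments for a
# question and stuff them into the prompt before calling the LLM.

STOPWORDS = {"what", "is", "the", "a", "an", "of", "to", "on", "are"}

def words(text: str) -> set[str]:
    """Lowercased, punctuation-stripped content words of a text."""
    return {w.strip(".,?!").lower() for w in text.split()} - STOPWORDS

def score(question: str, fragment: str) -> int:
    """Toy relevance score: shared content words (stand-in for vector similarity)."""
    return len(words(question) & words(fragment))

def build_rag_prompt(question: str, fragments: list[str], k: int = 2) -> str:
    """Select the top-k fragments for the question and place them in the prompt."""
    top = sorted(fragments, key=lambda f: score(question, f), reverse=True)[:k]
    context = "\n".join(f"- {f}" for f in top)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

fragments = [
    "Our refund policy allows returns within 30 days.",
    "The office is closed on public holidays.",
    "Refunds are issued to the original payment method.",
]
prompt = build_rag_prompt("What is the refund policy?", fragments)
print(prompt)
```

The assembled prompt, rather than the bare question, is what gets sent to the model, which is why the model can answer precisely and why swapping in a better LLM later requires no retraining.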

In summary, this is an AI coding tutorial on how to use the LangChain chains create_history_aware_retriever, create_retrieval_chain and create_stuff_documents_chain to retrieve data from a Chroma DB vector store, where we will have stored our custom data embeddings using the OpenAI Embeddings model. This data will have been loaded with several different LangChain document loaders and split using RecursiveCharacterTextSplitter. What's more, you will see how to use the OpenAI API and the Anthropic API to make requests and get answers from their Large Language Models.
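As a rough illustration of what RecursiveCharacterTextSplitter does under the hood, here is a simplified, framework-free sketch (the real LangChain class additionally supports chunk overlap, regex separators and custom length functions; this version only shows the recursive idea):

```python
# Simplified recursive character splitting: try the coarsest separator
# first; any piece still longer than chunk_size is re-split with the
# next, finer separator, falling back to a hard cut at the end.

def recursive_split(text, chunk_size=100, separators=("\n\n", "\n", " ", "")):
    if len(text) <= chunk_size:
        return [text]
    sep, rest = separators[0], separators[1:]
    if sep == "":
        # Last resort: hard cut every chunk_size characters.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate          # keep merging small pieces
        else:
            if current:
                chunks.append(current)   # flush what we have so far
            if len(piece) <= chunk_size:
                current = piece
            else:
                # Piece is still too big: re-split with a finer separator.
                chunks.extend(recursive_split(piece, chunk_size, rest))
                current = ""
    if current:
        chunks.append(current)
    return chunks

text = (
    "LangChain splits long documents before embedding them.\n\n"
    "Each chunk is embedded and stored in the vector store, "
    "so the retriever can later return only the relevant pieces."
)
chunks = recursive_split(text, chunk_size=60)
print(chunks)
```

Each resulting chunk is what gets embedded and stored in the Chroma vector store, so the retriever can later return only the pieces relevant to a question instead of the whole document.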

💡 Make sure to follow me on Medium, YouTube and GitHub: in the next blog and video we will see how to deploy this app to Azure, using GPT-4o and GPT-4o mini through Azure OpenAI Service and adding SSO authentication in front of our app, so only authorized users under our Azure subscription (for example, your work colleagues) can access it; no one else will spend our resources or steal our data!

Sections:
00:00 - Intro
2:08 - What is RAG and why it's better than Fine Tuning
7:34 - RAG in Python with LangChain step by step
19:00 - Integrating RAG into an LLM Chat web app
37:20 - Deploy the RAG web app online for free!

Subscribe to see more AI and ML programming related content! 🚀🚀

-------------------------------------------------------------

#gpt #gpt4o #gpt4 #openai #promptengineering #langchain #o1 #openaio1 #chatgpto1 #openaistrawbery #chatgpt #openaiapi #python #streamlit #github #cloud #portfolio #agent #gpt #aiagents #automation #ai #streamlit #llm #copilot #chatgpt4o #omnichat #omnidata #howtochatgpt #github #git #vscode #gui #pythongui #stream #modelstream #streaming #llmstream #llmstreaming #openaistream #openaistreaming #rag #retrievalaugmentedgeneration #langchainclaude #anthropiclangchain #llamaindex #ollama #llamacpp
Comments

Thank you for the detailed explanation. Do you have any content on how to perform load testing for the Streamlit chat application?

joytheultimate

Getting an error creating vector_db: it says "The onnxruntime python package is not installed." But it is already installed. I'm using Python 3.11.

MdSaifuddinShaikh

Looks great. When loading a PDF I got "Error loading document sample.pdf: cryptography>=3.1 is required for AES algorithm".

Naejbert

Thank you very much for explaining so clearly. The app worked locally for me, and I am using Python 3.13. To deploy to Streamlit Cloud I am facing a few issues with versions. Do we need a Python version lower than 3.11 to deploy to Streamlit?

SivaK-pf

Could you please make a video on how to upload a vector store database to the cloud for reuse in the future?

phonglehoang

Is there any way to do it without giving my credit card to OpenAI? On the first try I got the "You exceeded your current quota" message.

nicobonder

Is there a reason why we're not using Google's Gemini?

손현수-co

Can you make it with Google Gemini models or other models? These are paid models.

sarveshudapurkar

We are waiting for continued updates in videos and the blog ❤.

sitheekmohamedarsath