Learn RAG From Scratch – Python AI Tutorial from a LangChain Engineer

Learn how to implement RAG (Retrieval Augmented Generation) from scratch, straight from a LangChain software engineer. This Python course teaches you how to use RAG to combine your own custom data with the power of Large Language Models (LLMs).

✏️ Course created by Lance Martin, PhD.

⭐️ Course Contents ⭐️
⌨️ (0:00:00) Overview
⌨️ (0:05:53) Indexing
⌨️ (0:10:40) Retrieval
⌨️ (0:15:52) Generation
⌨️ (0:22:14) Query Translation (Multi-Query)
⌨️ (0:28:20) Query Translation (RAG Fusion)
⌨️ (0:33:57) Query Translation (Decomposition)
⌨️ (0:40:31) Query Translation (Step Back)
⌨️ (0:47:24) Query Translation (HyDE)
⌨️ (0:52:07) Routing
⌨️ (0:59:08) Query Construction
⌨️ (1:05:05) Indexing (Multi Representation)
⌨️ (1:11:39) Indexing (RAPTOR)
⌨️ (1:19:19) Indexing (ColBERT)
⌨️ (1:26:32) CRAG
⌨️ (1:44:09) Adaptive RAG
⌨️ (2:12:02) The future of RAG

🎉 Thanks to our Champion and Sponsor supporters:
👾 davthecoder
👾 jedi-or-sith
👾 南宮千影
👾 Agustín Kussrow
👾 Nattira Maneerat
👾 Heather Wcislo
👾 Serhiy Kalinets
👾 Justin Hual
👾 Otis Morgan
👾 Oscar Rahnama

--

Comments

🎯 Key points for quick navigation:

00:00 *📚 Introduction to RAG by Lance Martin, a LangChain engineer.*
00:14 *💡 Explanation of how RAG combines custom data with LLMs.*
00:28 *🔍 Motivation: Most data is private, but LLMs are trained on public data.*
01:08 *🗃️ Context windows in LLMs are growing, allowing more private data to be input.*
01:48 *⚙️ Overview of RAG: Indexing, retrieval, and generation stages.*
02:54 *📊 RAG unites LLMs' processing with large-scale private data.*
03:24 *🧠 Breakdown of RAG components: Query translation, routing, construction, and more.*
04:46 *⭐ Methods for document retrieval and reranking in RAG.*
05:55 *💾 Indexing external documents and converting them to numerical representations.*
08:25 *🧩 Splitting documents for embedding due to context window limits.*
10:00 *🖥️ Computing fixed-length vectors for documents using embeddings.*
12:45 *🔍 Using k-nearest neighbors to find similar documents.*
15:59 *📝 Generating answers based on retrieved documents in RAG.*
17:07 *📝 Prompt templates for generating answers in LLMs.*
19:02 *🔗 Combining prompts, LLMs, and retrievers into chains.*
22:14 *🚀 Introduction to advanced query translation in RAG.*
23:07 *✔️ Importance of rewriting queries for effective retrieval.*
24:05 *🌐 Multi-query approach: Rewriting questions from different perspectives.*
25:38 *🚀 Indexed a blog post on agents in a vector store.*
26:19 *🔍 Split question into sub-questions and retrieve relevant documents.*
28:08 *🔧 Used LangSmith to trace intermediate and final steps.*
30:42 *🗂️ Built a consolidated list from multiple retrievals.*
35:02 *🧩 Discussed sub-question decomposition retrieval.*
36:23 *🔄 Combined answers to iterative sub-questions for final answer.*
38:18 *🔗 Connected question-answer pairs sequentially in prompts.*
41:02 *📚 Stepback prompting for generating more abstract questions.*
43:02 *🪜 Generated more generic questions to enhance context for retrieval.*
44:45 *🔄 Retrieval performed on both original and stepback questions.*
48:50 *🌐 HYDE involves converting questions into hypothetical documents for better alignment with document embeddings.*
49:43 *🔎 Generated hypothetical documents based on questions for more effective retrieval.*
51:15 *📝 Hypothetical Document: Demonstrated hypothetical document generation and retrieval process.*
51:44 *🌟 Performance: Using hypothetical document generation can improve retrieval performance.*
52:13 *🚦 Routing: Involves translating a query and routing it to appropriate data sources.*
53:48 *🔍 Semantic Routing: Embeds and compares questions to prompts for routing.*
56:08 *🔗 Routing Mechanism: Connects the intended data source to specific retriever chains.*
58:11 *🚀 Semantic Routing Example: Demonstrates choosing a prompt based on semantic similarity.*
59:47 *💬 Query Construction: Transforms natural language queries to structured queries for metadata filters.*
01:00:15 *🗓️ Example Query: Converts natural questions into structured queries with date filters and metadata.*
01:04:26 *📚 Query Optimization: Optimizes retrieval by translating natural language into data-querying domain-specific languages.*
01:11:48 *🗄️ Hierarchical Indexing: RAPTOR technique deals with questions needing detailed and broader information.*
01:12:57 *🧩 Hierarchical indexing helps in retrieving more relevant document chunks by clustering and summarizing documents recursively.*
01:14:08 *🤏 Summaries provide high-level semantic representations, while raw chunks offer detailed, document-specific insights.*
01:15:04 *🧪 Comprehensive studies indicate that hierarchical indexing enhances semantic search by offering better coverage across different question types.*
01:17:19 *📇 Process involved embedding, clustering, and recursive summarization to build a tree structure of document information.*
01:20:09 *🛠️ Code demonstration included creating a vector store, embedding documents, clustering, summarizing, and managing tokens.*
01:22:22 *🔍 ColBERT method enhances semantic search by generating embeddings for every token and computing maximum similarities between question and document tokens.*
01:24:57 *🧑‍💻 The RAGatouille library makes it easy to experiment with ColBERT, which shows good performance but requires evaluating production readiness due to possible latency issues.*
01:26:40 *🌐 ColBERT demonstrated through LangChain retriever integration, offering an efficient and unique indexing approach.*
01:28:10 *🗺️ LangGraph released for building more complex state machines and diverse logical flows in RAG applications.*
01:33:05 *🔍 Corrective RAG workflow improved retrieval by re-assessing document relevance and performing web searches for ambiguous results.*
01:35:06 *🧩 Functions for state modification in LangGraph illustrated how each state (node) in the flow modifies the document retrieval process.*
01:37:08 *🔍 Logical filtering: Use a grading chain to mark documents as relevant or not and perform actions based on the results.*
01:37:32 *🚦 Conditional routing: Based on the 'search' value, route the workflow to either transform the query for a web search or proceed to generate a response.*
01:39:13 *📑 Document relevance check: Filter documents for relevance before transforming the query and performing a web search.*
01:39:55 *🔄 Query transformation: Adjust the query based on information retrieved from a web search to improve relevance.*
01:40:52 *📊 Detailed node inspection: Use tools like LangSmith to inspect each node's output to ensure the logical flow is correct.*
01:42:26 *🚀 Moving from chains to flows: Transitioning from simple chains to complex flows offers cleaner and more sophisticated workflows.*
01:44:06 *🔧 Flow engineering: Flow engineering with LangGraph is intuitive and allows for sophisticated logical reasoning workflows.*
01:45:03 *🧩 Integrating ideas: Combining query analysis and adaptive flow engineering improves your RAG pipeline's efficiency.*
01:46:14 *📚 Corrective workflows: Use unit tests to ensure smooth corrective workflows during model inference.*
01:48:34 *💡 Command R: Uses Command R model with structured output, enabling binary yes/no responses for easier logical flow control.*
01:56:21 *⚙️ Binding functions to nodes: Bind each node in your graph to a specific function to handle different logical decisions and flows.*
01:58:24 *🔄 If tool calls are not in the response, a fallback mechanism is triggered to choose the next data source.*
01:59:18 *🔍 Different data sources (web search vs. Vector store) are used, and their outputs determine the subsequent nodes in the graph.*
02:00:25 *🧾 Conditional edges in the graph handle logic such as document relevance and hallucination checks.*
02:01:05 *📊 Functions are defined as nodes and edges in the graph, following a flow that matches a predefined diagram for logic.*
02:03:18 *🗂️ The flow diagram for the graph aligns with the logic drawn out earlier, ensuring consistent data routing and processing.*
02:05:10 *⏱️ The implemented RAG system processes questions quickly, demonstrating efficient retrieval and generation handling.*
02:07:15 *⚡ Command R model shows rapid performance and effective handling of relevance, hallucination, and answer usefulness checks within the RAG system.*
02:08:55 *🧠 LangGraph provides a reliable, less flexible solution compared to agents, suitable for defined flows and faster implementation.*
02:10:51 *🧩 Agents offer more flexibility for open-ended workflows at the cost of reliability, especially when working with smaller LLMs.*
02:11:46 *💻 Open-source models like Command R can be run locally, enabling fast inference and practical use for online applications.*
02:12:46 *🔧 Practical implementation of RAG systems combines LangGraph with Command R for a fast, reliable solution adaptable for various workflows.*
02:17:09 *📉 Tested GPT-4's ability to retrieve and reason over multiple facts within a large context window, showing degradation in performance as complexity and context length increase.*
02:18:27 *🧩 Observations included the difficulty of retrieving facts placed at the beginning of a large context window, potentially due to a recency bias.*
02:19:10 *🔄 Confirmed that adding reasoning tasks exacerbates retrieval difficulty, highlighting limits within LLMs without a retrieval augmentation system.*
02:19:52 *🚩 Be skeptical of single-needle retrievals as they often oversimplify the retrieval problem.*
02:21:00 *🎯 Focus on the retrieval of precise document chunks, but be cautious of over-engineering.*
02:22:48 *🏗️ Consider document-centric RAG over chunking to simplify retrieval and reduce complexity.*
02:26:30 *🧩 Clustering documents and summarizing clusters help to handle queries requiring multiple pieces of information.*
02:28:07 *🔍 Use long-context embedding models to embed full documents effectively.*
02:31:33 *🖥️ Using open-source models can make RAG systems more accessible and efficient, even on local machines.*
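The indexing, retrieval, and generation stages summarized above can be sketched in plain Python. This is a minimal sketch, not the course's code: the bag-of-words `embed` function is a toy stand-in for a real embedding model, and the prompt string stands in for an actual LLM call.

```python
from collections import Counter
import math

def embed(text):
    """Toy bag-of-words embedding; a real system would call an embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, docs, k=1):
    """k-nearest-neighbor search over document embeddings."""
    q = embed(question)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(question, context_docs):
    """Stuff retrieved context into a prompt template for the LLM."""
    context = "\n".join(context_docs)
    return f"Answer the question based only on this context:\n{context}\n\nQuestion: {question}"

docs = [
    "RAG combines retrieval with generation.",
    "Agents use tools and memory.",
    "Embeddings map text to fixed-length vectors.",
]
question = "What does RAG combine?"
prompt = build_prompt(question, retrieve(question, docs, k=1))
```

In a real pipeline the prompt would then be sent to an LLM; here it is only assembled.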
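The multi-query and RAG Fusion steps above consolidate several retrievals (one per question rewrite) into a single ranked list; reciprocal rank fusion is the standard way to do this. A minimal sketch, with illustrative document ids and the commonly used constant k=60:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one list.
    Each document scores 1 / (k + rank) for every list it appears in."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Three retrievals for three rewrites of the same question:
fused = reciprocal_rank_fusion([
    ["doc_a", "doc_b", "doc_c"],
    ["doc_b", "doc_a"],
    ["doc_b", "doc_d"],
])
```

Documents that appear high in several lists (here `doc_b`) rise to the top of the fused ranking.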
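The semantic-routing idea above (embed the question, compare it to each prompt, send it to the closest one) can be sketched with a toy similarity function. A real router compares embedding vectors rather than word overlap, and the two prompts here are hypothetical:

```python
def similarity(a, b):
    """Toy Jaccard word-overlap similarity; a real router embeds both texts
    and compares the vectors."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def route(question, prompts):
    """Semantic routing: pick the prompt most similar to the question."""
    return max(prompts, key=lambda name: similarity(question, prompts[name]))

prompts = {
    "physics": "You are a physics professor. Answer questions about physics concisely.",
    "math": "You are a mathematician. Answer questions about math step by step.",
}
chosen = route("Explain the physics of black holes", prompts)
```

The chosen prompt name would then select the retriever chain or system prompt used downstream.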
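The corrective-RAG grading and conditional-routing steps above (grade each retrieved document, then branch to query transformation plus web search or straight to generation) can be sketched as one decision function. The word-overlap `grade` is a toy stand-in for the LLM grading chain, and the node names are illustrative:

```python
def grade(question, doc):
    """Stand-in for an LLM grading chain: 'yes' iff any words overlap."""
    overlap = set(question.lower().split()) & set(doc.lower().split())
    return "yes" if overlap else "no"

def decide_next_node(question, docs):
    """Conditional edge from corrective RAG: if any retrieved document is
    graded irrelevant, route to query transformation and web search;
    otherwise proceed to generation. Returns (next_node, kept_docs)."""
    relevant = [d for d in docs if grade(question, d) == "yes"]
    if len(relevant) < len(docs):
        return "transform_query", relevant  # fall back to the web-search path
    return "generate", relevant

node, kept = decide_next_node(
    "langgraph state machines",
    ["langgraph builds state machines", "cooking pasta recipes"],
)
```

In a graph framework this function would sit on a conditional edge, with `"transform_query"` and `"generate"` naming the downstream nodes.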

Made with HARPA AI

BlakeGallagher-ih

The complete happenstance of the phrase "do rag" sounding like "durag" coming from this video was awesome. Sorry, totally unrelated...but it made me chuckle.

jplkid

Include more LangChain, LLM, and industry-level tutorials

sagarkeshave

This felt like a semester condensed into a few hours. This dude reads a lot. I learned about so many interesting things.

bhavyajain

Udemy created 50 accounts to dislike this video

nawaz_haider

Lance thank you for sharing your deep insights on the subject of RAG and taking the time to share this with the community.

Just a question about Query Construction, at 1:04:00 in the video. For the question:
"videos that are focused on the topic of chat langchain that are published before 2024"
should the result have been:
latest_publish_date: 2024-01-01 as opposed to earliest_publish_date: 2024-01-01

This would be more in line with the question:
"videos on chat langchain published in 2023"
where the results were:
earliest_publish_date: 2023-01-01
latest_publish_date: 2024-01-01

Thank you

jasonmuscat

I was waiting for this particular course. Thanks

faisalmushtaq

"Never stop improving" is enough for a successful life 🙏

ylrgfdd

I rarely say that a tutorial is good - but this is an amazing tutorial, extremely underrated!!!

AD-npsh

🎯 Key points for quick navigation:

02:21:13 *🔄 RAG Evolution*
02:22:20 *❓ Reconsider Chunking*
02:23:42 *📑 Document-Centric RAG*
02:25:20 *🔄 Multi-rep Indexing*
02:26:30 *📊 Utilize RAPTOR*
02:28:34 *🔄 Beyond Single-Shot*
02:30:23 *🧠 Enhance with Reasoning*
02:30:38 *🎯 Out-of-Scope Queries*

Made with HARPA AI

simrangupta

Love the teaching style! at 9:00 you mention that you've walked through the code previously. Is there another video to go with this one or did I miss something?

CookingWithGunnar

So much detail - I had to watch it twice to understand it. Just wow.

ArunKumar-bplo

Thanks for the excellent video! If your goal is to democratize gen AI to as diverse an audience as possible, I suggest you stop using OpenAI in these tutorials. In many parts of the world, having a credit card is not an option, and OpenAI quickly backs you into that corner. Use, promote, and support open-source alternatives instead. Thank you.

sanjaybhatikar

This is great content. Speaking of that 95% of private data, I guess a lot of practitioners are finding it hard to convince business people to share their data with an LLM provider. And of course the concerns are very much understandable. I guess people would feel more comfortable if a RAG application could clearly define a partition of data that it can work on for the benefit of the tool, and a partition that is either used in obfuscated form or simply never shared, not even by chance.

claudiodisalvo

love from India, keep doing the great work, Lance <3

bhuvanbharath

Please let us know when the blog post on adaptive RAG will be uploaded; Lance mentioned that he would be uploading it in a day or so. Also, a question for the general public: which one is better, state machines or guardrails? (In the context of creating complex flows using LLMs.)

yashtiwari

Great video! What software is used to create these nice diagrams ?

cristian_palau

Awesome video! it helped a great deal to explain the concept.

ComputingAndCoding

Is it just me, or are there always errors caused by changed libraries, even when you simply try to execute their code? It's really frustrating working with LangChain at this point.

riccardomanta

Tbh, this is not from scratch if you are using a heavily abstracted framework (LangChain). It's misleading.

ajaykumarreddy