RAG From Scratch: Part 2 (Indexing)

This is the second video in our series on RAG. The aim of this series is to build up an understanding of RAG from scratch, starting with the basics of indexing, retrieval, and generation. This video focuses on indexing, covering the process of document loading, splitting, and embedding.
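The splitting step can be sketched in a few lines. This is a minimal illustration, not the code used in the video: the function name `split_text`, the character-based chunking, and the `chunk_size` and `overlap` defaults are all assumptions chosen for the example.

```python
def split_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into character chunks that overlap by `overlap` characters.

    Hypothetical helper for illustration; real pipelines often split on
    tokens or sentence boundaries instead of raw characters.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for chunk in split_text("abcdefghij", chunk_size=4, overlap=1):
        print(chunk)
```

The overlap is a design choice: it keeps a sentence that straddles a chunk boundary at least partly visible in both neighbouring chunks.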

Code:

Slides:
Comments

For the very first time in my mechanical engineering life, I think I learned something in detail about software engineering! Thank you!

wjysmkd

🎯 Key points for quick navigation:

00:02 *📹 The second video in the RAG from Scratch series focuses on indexing, a crucial component of RAG pipelines.*
00:28 *🔍 The goal of indexing is to retrieve documents related to a given question using numerical representations of documents.*
00:53 *📊 Numerical representations of documents are used for easy comparison and search, with approaches including sparse vectors and machine learning-based embedding methods.*
01:08 *💡 Embedding methods compress documents into fixed-length vectors that capture their meaning, allowing for efficient search and retrieval.*
02:03 *📈 Documents are split into smaller chunks to accommodate embedding models' limited context windows, and each chunk is compressed into a vector representation.*
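The idea of comparing numerical representations can be sketched with the sparse approach mentioned above. This is a toy example under stated assumptions: the helper names `sparse_vector`, `cosine`, and `retrieve` are hypothetical, and word counts stand in for a learned embedding model, which would instead produce a dense fixed-length vector.

```python
import math
import re
from collections import Counter


def sparse_vector(text: str) -> Counter:
    """Map text to a sparse word-count vector (a stand-in for an embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = sparse_vector(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, sparse_vector(c)), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    chunks = [
        "Embedding models map text to fixed-length vectors.",
        "Bananas are rich in potassium.",
        "Splitting keeps chunks inside the context window.",
    ]
    print(retrieve("What do embedding models produce?", chunks, k=1))
```

Here every chunk is re-vectorized on each query; an index would compute the vectors once and store them, so only the question needs to be vectorized at search time.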

Made with HARPA AI

hxxzxtf

Thanks. Nice video, short and clear. But why do you need to store the embeddings in a vector DB?

hamzafarouk