Building long context RAG with RAPTOR from scratch
The rise of long context LLMs and embeddings will change RAG pipeline design. Instead of splitting docs and indexing doc chunks, it will become feasible to index full documents. RAG approaches will need to flexibly answer lower-level questions from single documents or higher-level questions that require information across many documents.
RAPTOR (Sarthi et al.) is one approach to tackle this by building a tree of document summaries: docs are clustered, and each cluster is summarized to capture higher-level information across similar docs.
This is repeated recursively, resulting in a tree of summaries that runs from individual docs as leaves, through intermediate summaries of related docs, up to high-level summaries of the full doc collection.
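The recursive cluster-then-summarize loop described above can be sketched as follows. This is a minimal illustration, not the code from the video or the paper: `build_summary_tree`, `pair_cluster`, and `join_summarize` are hypothetical names, and the two helpers are toy stand-ins. A real pipeline would cluster on document embeddings and call an LLM to write each summary.

```python
from typing import Callable, List, Sequence


def build_summary_tree(
    docs: Sequence[str],
    cluster: Callable[[List[str]], List[List[int]]],
    summarize: Callable[[List[str]], str],
    max_levels: int = 3,
) -> List[List[str]]:
    """Return the tree as a list of levels.

    levels[0] holds the raw docs (the leaves); each later level holds
    one summary per cluster of the level below it.
    """
    levels = [list(docs)]
    for _ in range(max_levels):
        current = levels[-1]
        if len(current) <= 1:
            break  # a single root summary: nothing left to cluster
        groups = cluster(current)
        if len(groups) >= len(current):
            break  # clustering no longer reduces the number of nodes
        levels.append([summarize([current[i] for i in g]) for g in groups])
    return levels


# Toy stand-in (assumption): group neighbours in pairs instead of
# clustering on embeddings.
def pair_cluster(texts: List[str]) -> List[List[int]]:
    idx = list(range(len(texts)))
    return [idx[i:i + 2] for i in range(0, len(idx), 2)]


# Toy stand-in (assumption): join the texts instead of asking an LLM
# for a summary.
def join_summarize(texts: List[str]) -> str:
    return " | ".join(texts)


if __name__ == "__main__":
    tree = build_summary_tree(["a", "b", "c", "d", "e"],
                              pair_cluster, join_summarize, max_levels=5)
    for depth, level in enumerate(tree):
        print(depth, level)
```

Passing the clustering and summarization steps in as callables keeps the recursion itself independent of any particular embedding model or LLM.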
In this video, we build RAPTOR from scratch and test it on 33 web pages of LangChain docs (each ranging from 2k to 12k tokens), using the recently released Claude 3 model from Anthropic to build the summarization tree. The pages and the tree of summaries are indexed together for RAG with Claude 3, enabling QA on lower-level questions or on higher-level concepts (captured in summaries that span related pages).
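Indexing the pages and the summaries together means a single search can return either a specific page or a cross-page summary, depending on the question. The sketch below illustrates that idea under loud assumptions: `build_index` and `retrieve` are hypothetical helpers, and a bag-of-words cosine similarity stands in for a real embedding model and vector store.

```python
import math
from collections import Counter
from typing import List, Tuple

Entry = Tuple[int, str, Counter]  # (tree level, text, bag-of-words vector)


def bow(text: str) -> Counter:
    """Toy 'embedding' (assumption): lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def build_index(levels: List[List[str]]) -> List[Entry]:
    """Flatten every tree level (pages and summaries) into one index."""
    return [(lvl, text, bow(text))
            for lvl, texts in enumerate(levels)
            for text in texts]


def retrieve(index: List[Entry], query: str, k: int = 2) -> List[Tuple[int, str]]:
    """Return the k entries most similar to the query, from any level."""
    q = bow(query)
    ranked = sorted(index, key=lambda e: cosine(q, e[2]), reverse=True)
    return [(lvl, text) for lvl, text, _ in ranked[:k]]


if __name__ == "__main__":
    # Made-up example texts: level 0 = pages, level 1 = a summary.
    levels = [
        ["retrievers fetch documents from a vector store",
         "agents call tools in a loop"],
        ["overview of retrievers and agents in the library"],
    ]
    index = build_index(levels)
    print(retrieve(index, "how do agents call tools", k=1))
    print(retrieve(index, "overview of the library", k=1))
```

With these made-up texts, the detailed question matches a leaf page while the broad question matches the summary, which is the behaviour the combined index is meant to provide.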
This idea can scale to large collections of documents or to documents of arbitrary size (up to the embedding model / LLM context window).
Code:
Paper: