Retrieval Augmented Generation (RAG)

Retrieval Augmented Generation, or RAG for short, became popular in 2024. RAG lets you interact with a large language model that can answer questions in the context of your own data, even when that data is new to the model, such as internal documents from your business. You take data hidden in your systems, put it in a vector store, and index it for easy retrieval so it can be supplied to the language model as needed.
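As a rough illustration of that flow, the sketch below chunks documents, embeds them, keeps the vectors in a small in-memory index, and retrieves the closest chunks as context for the model. The embed and generate functions are placeholders for whatever embedding model (for example NV-Embed-v2) and LLM (for example Llama 3.1) you actually call, and the fixed-size chunking is deliberately naive; a parser like OpenParse would do that step better.

```python
# Minimal sketch of the RAG flow described above: chunk documents, embed them,
# store the vectors in an in-memory index, then retrieve the chunks closest to a
# question and pass them to a language model as context.
# embed() and generate() are hypothetical stand-ins for your actual models.

import numpy as np

def embed(texts: list[str]) -> np.ndarray:
    """Placeholder: return one vector per text from your embedding model."""
    raise NotImplementedError("call your embedding model here")

def generate(prompt: str) -> str:
    """Placeholder: call your LLM with the assembled prompt."""
    raise NotImplementedError("call your language model here")

def chunk(document: str, size: int = 500) -> list[str]:
    """Naive fixed-size chunking; a document parser would split more sensibly."""
    return [document[i:i + size] for i in range(0, len(document), size)]

def build_index(documents: list[str]) -> tuple[list[str], np.ndarray]:
    """Chunk and embed every document; keep chunks and vectors side by side."""
    chunks = [c for doc in documents for c in chunk(doc)]
    vectors = embed(chunks)
    # Normalize so a dot product equals cosine similarity.
    vectors = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    return chunks, vectors

def answer(question: str, chunks: list[str], vectors: np.ndarray, k: int = 3) -> str:
    """Retrieve the k most similar chunks and ask the LLM to answer from them."""
    q = embed([question])[0]
    q = q / np.linalg.norm(q)
    top = np.argsort(vectors @ q)[::-1][:k]
    context = "\n\n".join(chunks[i] for i in top)
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return generate(prompt)
```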
When a user asks a question about your data, the RAG system can help find the right answer. RAG uses models like Meta's Llama 3.1, tools like OpenParse to break documents into chunks, and a database like Postgres (typically with the pgvector extension) to store and index the vectors. Having both vector and full-text search indexes is important, because vector search on its own can get fuzzy.
Combining it with keyword search makes the system both context-aware and precise. To build a RAG system, you need these parts: a capable language model, a tool like OpenParse to split documents, Postgres for indexing, an embedding model like NV-Embed-v2 for accurate vectorizing, and sometimes human-provided answers.
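Below is a hedged sketch of that hybrid retrieval against Postgres, assuming a chunks table with content, tsv (a tsvector column), and embedding (a pgvector column). The table and column names and the simple merge strategy are illustrative assumptions, not a fixed recipe; production systems often fuse or re-rank the two result lists more carefully.

```python
# Hedged sketch of hybrid (vector + keyword) retrieval in Postgres, assuming
# the pgvector extension and a table: chunks(id, content, tsv tsvector, embedding vector).
# Table and column names are hypothetical.

import psycopg2

def hybrid_search(conn, query_text: str, query_embedding: list[float], k: int = 5):
    """Run a vector similarity search and a full-text search, then merge the hits."""
    # pgvector accepts a bracketed string literal cast to ::vector.
    vec_literal = "[" + ",".join(str(x) for x in query_embedding) + "]"
    with conn.cursor() as cur:
        # Semantic search: nearest neighbours by cosine distance (pgvector's <=> operator).
        cur.execute(
            """SELECT id, content FROM chunks
               ORDER BY embedding <=> %s::vector
               LIMIT %s""",
            (vec_literal, k),
        )
        vector_hits = cur.fetchall()

        # Keyword search: Postgres full-text search, ranked by ts_rank.
        cur.execute(
            """SELECT id, content FROM chunks
               WHERE tsv @@ plainto_tsquery('english', %s)
               ORDER BY ts_rank(tsv, plainto_tsquery('english', %s)) DESC
               LIMIT %s""",
            (query_text, query_text, k),
        )
        keyword_hits = cur.fetchall()

    # Naive fusion: keep exact keyword matches first, then fill with vector matches.
    seen, merged = set(), []
    for row in keyword_hits + vector_hits:
        if row[0] not in seen:
            seen.add(row[0])
            merged.append(row)
    return merged[:k]
```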
What is Retrieval-Augmented Generation (RAG)?
Back to Basics: Understanding Retrieval Augmented Generation (RAG)
What is RAG? (Retrieval Augmented Generation)
A Helping Hand for LLMs (Retrieval Augmented Generation) - Computerphile
How to use Retrieval Augmented Generation (RAG)
RAG Explained
How to set up RAG - Retrieval Augmented Generation (demo)
What is Retrieval-Augmented Generation (RAG)?
Advanced RAG: Corrective RAG (with LlamaCloud)
How to build Multimodal Retrieval-Augmented Generation (RAG) with Gemini
Retrieval Augmented Generation (RAG) Explained in 8 Minutes!
L-7 RAG (Retrieval Augmented Generation)
Intro to RAG for AI (Retrieval Augmented Generation)
What is Retrieval Augmented Generation (RAG) - Augmenting LLMs with a memory
AI Explained - AI LLM Retrieval-Augmented Generation (RAG)
Retrieval Augmented Generation (RAG) Explained: Embedding, Sentence BERT, Vector Database (HNSW)
Setting up Retrieval Augmented Generation (RAG) in 3 Steps
RAG vs. Fine Tuning
Build a Retrieval-Augmented Generation Chatbot in 5 Minutes
Retrieval Augmented Generation (RAG)
Retrieval Augmented Generation (RAG) | Embedding Model, Vector Database, LangChain, LLM
Build a Large Language Model AI Chatbot using Retrieval Augmented Generation
When Do You Use Fine-Tuning Vs. Retrieval Augmented Generation (RAG)? (Guest: Harpreet Sahota)
Learn RAG From Scratch – Python AI Tutorial from a LangChain Engineer