Lessons Learned on LLM RAG Solutions

We’re going to do a technical deep dive into Retrieval Augmented Generation, or RAG, one of the most popular Generative AI projects. There is a ton of content about RAG applications with LLMs, but very little addresses the challenges associated with building practical applications. Today you’re going to get the inside scoop from some engineers with that experience.

LLMs can be used to convert documents such as emails or contracts into numeric vectors called embeddings. Embeddings make it possible to find pieces of text that are similar in meaning. The most common business applications are semantic search, which matches on meaning rather than keywords, and document Q&A. Each step presents unique challenges, and we're going to address them today.
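As a rough illustration of that idea, here is a minimal similarity lookup. The three-dimensional vectors are toy, hand-made stand-ins; a real system would get much higher-dimensional embeddings from an actual embedding model.

```python
import math

def cosine_similarity(a, b):
    # angle-based similarity between two embedding vectors
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings"; a real model would produce
# hundreds of dimensions per text.
documents = {
    "pet policy": [0.9, 0.1, 0.2],
    "expense report": [0.1, 0.8, 0.3],
}
query_embedding = [0.85, 0.15, 0.25]

# Semantic search = pick the document whose embedding is closest to the query's
best_match = max(documents, key=lambda d: cosine_similarity(query_embedding, documents[d]))
print(best_match)  # -> pet policy
```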
Comments

Very refreshing to see something about RAG that goes beyond surface level

sprobertson

🎯 Key Takeaways for quick navigation:

00:02 🧭 *RAG Understanding and Challenges*
- RAG, or Retrieval Augmented Generation, is a popular Generative AI project that faces challenges such as grounding responses in the right context.
02:10 🔄 *RAG in Action and Improvement*
- Demonstrates RAG with an example of an employee querying about bringing a dog to work, highlighting improved responses with retrieved policies.
03:05 📚 *Automatic Retrieval Application*
- Explains the application process of automatically retrieving relevant information for RAG, covering obtaining documents, chunking, embedding, and generating responses.
04:29 🛠️ *RAG as a Customization Tool*
- Discusses RAG as a practical way to customize language models, emphasizing its use cases and significance in adapting to diverse datasets.
07:39 📄 *Parsing Documents in Real-world Applications*
- Emphasizes the need to parse various document types, discussing challenges with tables and messy real-world data.
09:44 🧩 *Importance of Document Hierarchy*
- Explores preserving document hierarchy when producing meaningful embeddings, since chunks ultimately need a flat text representation.
15:26 🕵️ *Ensuring Relevant Retrieval*
- Emphasizes the retrieval step's importance in RAG applications and discusses the impact of incorrect retrieval on response accuracy.
18:54 🎯 *Investment Priority: Retrieval Over Generation*
- Advocates investing time in perfecting the retrieval step, acknowledging the complexities beyond choosing the right embedding model.
21:23 🤷‍♂️ *Challenges in Evaluating RAG Applications*
- Explores difficulties in evaluating RAG applications due to diverse implementation methods and emphasizes the need for comprehensive evaluation metrics.
21:57 📚 *Evaluation of RAG Applications*
- Evaluation involves assessing faithfulness, measuring alignment with evidence, and avoiding hallucinations. Challenges lie in nuanced evaluation given diverse user queries.
27:13 🤖 *Challenges in Evaluating RAG Applications*
- Multiple choice evaluations simplify the process but may introduce biases. The variation in user queries requires adaptive systems, highlighting the ambiguity in assessing intelligent systems.
28:17 🚀 *Techniques for Improving RAG Performance*
- Enhancing search capabilities involves using embeddings, metadata, rules, or heuristics. Summarization during retrieval, diversifying queries, and addressing varied inputs improve efficacy.
32:27 🔄 *Fine-tuning and Summarization in RAG*
- Fine-tuning components such as the embedding model, or adding adapters, tailors RAG to different applications. Summarization during retrieval coalesces information into fewer sentences and works best when the request gives specific directions.

Made with HARPA AI
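The pipeline in the takeaways above (chunk, embed, retrieve, prompt) can be sketched end to end. Everything here is a toy stand-in: bag-of-words counts instead of a real embedding model, and string formatting instead of an actual LLM call.

```python
import math

def chunk_text(text, size=120, overlap=30):
    # split a document into overlapping character windows
    chunks, step = [], size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break
    return chunks

def embed(text):
    # stand-in for a real embedding model: bag-of-words counts
    words = text.lower().split()
    return {w: words.count(w) for w in set(words)}

def similarity(a, b):
    # cosine similarity over sparse word-count vectors
    dot = sum(v * b.get(w, 0) for w, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, k=1):
    # rank chunks by similarity to the query, keep the top k
    q = embed(query)
    return sorted(chunks, key=lambda c: similarity(q, embed(c)), reverse=True)[:k]

def build_prompt(query, context_chunks):
    # the retrieved text becomes grounding context for the LLM call
    context = "\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

policy = (
    "Employees may bring a dog to the office on Fridays. "
    "Expense reports must be filed by the last day of each month."
)
top = retrieve("Can I bring my dog to work?", chunk_text(policy, size=60, overlap=10))
prompt = build_prompt("Can I bring my dog to work?", top)
```

The same shape applies with real components swapped in; only the retrieval quality changes, which is exactly why the talk argues for investing there first.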

humbertomejia

Just to mention: the summarisation technique they mention at the end is 'Chain of Density', which iteratively makes the summary more and more dense.
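A minimal sketch of that iterative densification loop, assuming a generic `llm(prompt) -> str` callable (the name and prompts are hypothetical; wire in whatever client you use):

```python
def chain_of_density(document, llm, rounds=3):
    """Iteratively densify a summary, in the spirit of Chain of Density:
    each round asks the model to fold missing salient entities into a
    rewrite of roughly the same length."""
    summary = llm(f"Summarize this document in one sentence:\n{document}")
    for _ in range(rounds):
        summary = llm(
            "Rewrite the summary below at the same length, adding one or two "
            "salient entities from the document that are currently missing.\n"
            f"Document:\n{document}\n\nCurrent summary:\n{summary}"
        )
    return summary
```

Each pass trades filler words for entities, so the summary stays short while covering more of the document.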

alexmolyneux

Thank you. More useful than most conversations on the topic. Heuristics are clearly still a major part of this next wave of AI.

joeaccent

I was wondering if you have suggestions on optimizing the documentation used for RAG. We're using RAG linked to our Notion wiki, and I want to implement guidelines for the info being added, to ensure it is 'AI friendly'.

Shishiranshoku

Thank you! Very practical and up-to-date discussion.

AlonAvramson

Fantastic presentation!

A question for Justin: when executing multiple queries with slight variations, what method do you use to aggregate or coalesce the responses into a unified result? Do you use an LLM as a judge for this aggregation?

rickrischter

My first YouTube Live! The beginning was a bit rough because I started hearing my own voice in the background. It turns out I had another Chrome tab with this page open and it started playing automatically in the background. I paused because I couldn't figure out what was happening. Lesson learned for next time: close your other tabs.

prolegoinc

Please link the video mentioned in the description and tag me when you get a chance. I'm just learning the RAG side, but I've already envisioned the application case I'd like to focus on. Thank you very much for the informative discussion!

LandingBusiness

Great discussion. Thanks for sharing this

carvalhoribeiro

Great information, some useful takeaways, thanks.

Have you experimented much with hybrid retrieval (vector search + keyword search) to retrieve accurate chunks?
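One common way to merge the two result lists in hybrid retrieval is Reciprocal Rank Fusion, which needs only the ranks, not the incomparable raw scores. A minimal sketch with made-up document ids and rankings:

```python
def rrf_fuse(rankings, k=60):
    # Reciprocal Rank Fusion: each list contributes 1/(k + rank) per doc,
    # so documents ranked well by several retrievers float to the top.
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical ranked doc ids from two retrievers over the same corpus
vector_ranking = ["d2", "d3", "d1"]   # semantic (embedding) order
keyword_ranking = ["d2", "d1", "d4"]  # keyword/BM25 order
fused = rrf_fuse([vector_ranking, keyword_ranking])
print(fused)  # -> ['d2', 'd1', 'd3', 'd4']
```

The constant `k` (60 is the conventional default) damps the gap between adjacent ranks so a single top-1 hit doesn't dominate the fusion.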

humbledev-mpzz