Best Practices for Building Production RAG - Part 1

🤔 Looking for the ultimate roadmap for implementing your RAG in production?

In this episode, join Angelina and Mehdi for a discussion of recommended approaches for each step of building your RAG system in production.

What You'll Learn:
🔎 Detailed discussion of different RAG components and techniques
🚀 Insights and recommendations from the paper on chunking, embedding, and vector databases
🛠 Emphasis on the need for a balanced and context-aware approach when implementing RAG in production

✏️ In This Episode:
00:00 Intro
00:55 Implementing production RAG is hard
02:20 Can we identify optimal RAG practices?
03:26 Approaches of this paper
05:09 The diagram of RAG production flow
05:19 Chunking
07:39 Chunk size
09:09 Faithfulness and relevancy
10:53 Chunking techniques

🖼️ Blogpost for today:
How to Choose the Right Vector Search System for Your RAG Application

Stay tuned for more content! 🎥 Thank you for watching! 🙌
Comments


We'd love to see you there! 🎉

In the course, you'll have the chance to connect directly with Professor Mehdi (just like I do 😉 in the videos), and you can even ask him your questions 1:1. Bring your real work projects, and during our office hours, we'll help you tackle your day-to-day challenges.

This course is for:
01 👇
𝗔𝗜 𝗘𝗻𝗴𝗶𝗻𝗲𝗲𝗿𝘀 & 𝗗𝗲𝘃𝗲𝗹𝗼𝗽𝗲𝗿𝘀: For AI engineers/developers looking to master production-ready RAG systems combining search with AI models.
02 👇
𝗗𝗮𝘁𝗮 𝗦𝗰𝗶𝗲𝗻𝘁𝗶𝘀𝘁𝘀: Ideal for data scientists seeking to expand into AI by learning hands-on RAG techniques for real-world applications.
03 👇
𝗧𝗲𝗰𝗵 𝗟𝗲𝗮𝗱𝘀 & 𝗣𝗿𝗼𝗱𝘂𝗰𝘁 𝗠𝗮𝗻𝗮𝗴𝗲𝗿𝘀: Perfect for tech leads/product managers wanting to guide teams in building and deploying scalable RAG systems.

TwoSetAI
Author

Interesting video, but I have issues with the paper. (1) Optimizing each step and assuming that will give the global optimum seems a bit naïve. (2) I'm surprised by the exclusion of chunking strategies like LangChain's recursive chunker. It's hard to see how simplistic token-count-based chunking could ever be better than chunking that takes paragraphs and similar boundaries into account (and the latter is probably faster than sentence-level chunking).

karlfimm
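
To make the contrast in the comment above concrete, here is a minimal sketch of the two chunking styles. It is not LangChain's implementation and not the paper's method. The function names, the separator list, and the use of whitespace-separated words as "tokens" are all assumptions made for illustration. A sentence-level separator could be added to the list in the same way.

```python
def fixed_size_chunks(text: str, max_tokens: int) -> list[str]:
    """Cut every max_tokens whitespace tokens, ignoring paragraph boundaries."""
    tokens = text.split()
    return [
        " ".join(tokens[i:i + max_tokens])
        for i in range(0, len(tokens), max_tokens)
    ]


def recursive_chunks(
    text: str,
    max_tokens: int,
    separators: tuple[str, ...] = ("\n\n", "\n", " "),  # assumed order: coarse to fine
) -> list[str]:
    """Split on the coarsest separator first; use finer ones only for oversized pieces."""
    text = text.strip()
    if not text:
        return []
    if len(text.split()) <= max_tokens:
        return [text]
    if not separators:
        # No structural separator left: fall back to a plain token-count split.
        return fixed_size_chunks(text, max_tokens)

    sep, finer = separators[0], separators[1:]
    chunks: list[str] = []
    current = ""
    for piece in text.split(sep):
        piece = piece.strip()
        if not piece:
            continue
        # Greedily merge neighbouring pieces while they still fit in one chunk.
        candidate = f"{current}{sep}{piece}" if current else piece
        if len(candidate.split()) <= max_tokens:
            current = candidate
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece.split()) <= max_tokens:
            current = piece
        else:
            # The piece alone is too large: retry it with the finer separators.
            chunks.extend(recursive_chunks(piece, max_tokens, finer))
    if current:
        chunks.append(current)
    return chunks


sample = (
    "Alpha beta gamma delta.\n\n"
    "Epsilon zeta eta theta iota kappa.\n\n"
    "Lambda mu."
)

for name, fn in (("fixed", fixed_size_chunks), ("recursive", recursive_chunks)):
    print(name, fn(sample, 6))
```

With a limit of six tokens, the fixed-size splitter cuts through the middle of the second paragraph, while the recursive splitter keeps each of the three paragraphs intact. Whether that translates into better retrieval quality is an empirical question that this sketch does not answer.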