Best Practices for Building Production RAG - Part 1


🤔 Looking for the ultimate roadmap for implementing your RAG in production?

In this episode, join Angelina and Mehdi for a discussion of recommended approaches for each step of building your RAG system in production.

What You'll Learn:
🔎 Detailed discussion of different RAG components and techniques
🚀 Insights and recommendations from the paper on chunking, embedding, and vector databases
🛠 Emphasis on the need for a balanced and context-aware approach when implementing RAG in production

✏️ In This Episode:
00:00 Intro
00:55 Implementing production RAG is hard
02:20 Can we identify optimal RAG practices?
03:26 Approaches of this paper
05:09 The diagram of RAG production flow
05:19 Chunking
07:39 Chunk size
09:09 Faithfulness and relevancy
10:53 Chunking techniques

🖼️ Blogpost for today:
How to Choose the Right Vector Search System for Your RAG Application

Stay tuned for more content! 🎥 Thank you for watching! 🙌

TwoSetAI


Comments

We'd love to see you there! 🎉

In the course, you'll have the chance to connect directly with Professor Mehdi (just like I do 😉 in the videos), and you can even ask him your questions 1:1. Bring your real work projects, and during our office hours, we'll help you tackle your day-to-day challenges.

This course is for:
01 👇
AI Engineers & Developers: For AI engineers/developers looking to master production-ready RAG systems combining search with AI models.
02 👇
Data Scientists: Ideal for data scientists seeking to expand into AI by learning hands-on RAG techniques for real-world applications.
03 👇
Tech Leads & Product Managers: Perfect for tech leads/product managers wanting to guide teams in building and deploying scalable RAG systems.

TwoSetAI

Interesting video, but I have issues with the paper. (1) Optimizing each step and assuming that will give the global optimum seems a bit naïve. (2) I'm surprised by the exclusion of chunking strategies like LangChain's recursive chunker. It seems hard to see how a simplistic token count based chunking could ever be better than one that takes into account paragraphs etc (and it's probably faster than sentence level chunking).

karlfimm
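
To make the contrast in the comment above concrete, here is a minimal sketch of the two chunking styles it compares. It is not LangChain's implementation and not the paper's method: the function names, the separator list, and the use of whitespace-separated words as "tokens" are all assumptions made for illustration.

```python
def fixed_size_chunks(text, max_tokens):
    """Token-count chunking: cut every max_tokens words, ignoring structure.

    Assumption: a "token" here is just a whitespace-separated word.
    """
    tokens = text.split()
    return [
        " ".join(tokens[i:i + max_tokens])
        for i in range(0, len(tokens), max_tokens)
    ]


def recursive_chunks(text, max_tokens, separators=("\n\n", "\n", ". ", " ")):
    """Structure-aware chunking: try coarse separators first (paragraphs),
    and only fall back to finer ones for pieces that are still too long.

    Hypothetical helper; separators at chunk boundaries are dropped.
    """
    if len(text.split()) <= max_tokens:
        return [text.strip()] if text.strip() else []
    if not separators:
        # Nothing left to split on: fall back to plain token-count cuts.
        return fixed_size_chunks(text, max_tokens)

    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate.split()) <= max_tokens:
            # Merge neighbouring pieces while they still fit the budget.
            current = candidate
            continue
        if current.strip():
            chunks.append(current.strip())
        current = ""
        if len(piece.split()) <= max_tokens:
            current = piece
        else:
            # The piece alone is too long: split it with finer separators.
            chunks.extend(recursive_chunks(piece, max_tokens, finer))
    if current.strip():
        chunks.append(current.strip())
    return chunks
```

On a two-paragraph text with a budget of five words, `fixed_size_chunks` cuts across the paragraph break, while `recursive_chunks` returns one chunk per paragraph. Whether that translates into better retrieval quality is an empirical question that this sketch does not answer.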
