Private RAG with Open Source and Custom LLMs 🚀 | BentoML | OpenLLM

In this session, Chaoyu Yang, Founder and CEO of BentoML, discussed the practical considerations of building private Retrieval-Augmented Generation (RAG) applications using a mix of open source and custom LLMs.

Topics that were covered:

✅ The benefits of self-hosting open source LLMs or embedding models for RAG.

✅ Common best practices for optimizing inference performance in RAG.

✅ BentoML for building RAG as a service, seamlessly chaining language models with various components, including text and multi-modal embedding, OCR pipelines, semantic chunking, classification models, and reranking models.
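To make the "chaining components" idea concrete, here is a minimal, self-contained sketch of a RAG retrieval pipeline that chains semantic chunking, embedding, and reranking. Every function here is a hypothetical toy stand-in (word-bounded chunking, bag-of-words "embeddings", term-overlap reranking), not BentoML's API; in a real BentoML deployment each stage would be backed by a self-hosted model service.

```python
# Hypothetical stand-ins for the pipeline stages discussed in the talk.
# None of this is BentoML code; it only illustrates how the stages chain.

def semantic_chunk(text, max_words=20):
    """Split text into word-bounded chunks (stand-in for semantic chunking)."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

def embed(text):
    """Toy embedding: bag-of-words term counts (stand-in for an embedding model)."""
    vec = {}
    for w in text.lower().split():
        vec[w] = vec.get(w, 0) + 1
    return vec

def rerank(query, chunks, top_k=2):
    """Score chunks by term overlap with the query (stand-in for a reranker)."""
    q = embed(query)
    scored = []
    for chunk in chunks:
        c = embed(chunk)
        score = sum(min(q[w], c.get(w, 0)) for w in q)
        scored.append((score, chunk))
    scored.sort(key=lambda t: t[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]

docs = ("BentoML serves models as APIs. Reranking improves retrieval quality. "
        "Self-hosting keeps data private.")
chunks = semantic_chunk(docs, max_words=6)
top = rerank("how does reranking help retrieval", chunks)
```

A production version would swap each stand-in for a dedicated model service (e.g. a text or multi-modal embedding model and a cross-encoder reranker) and expose the chained pipeline behind a single API.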

About LLMOps Space -

LLMOps.Space is a global community for LLM practitioners. 💡📚
The community focuses on content, discussions, and events around topics related to deploying LLMs into production. 🚀

Comments

Thanks for putting the recording and sending an email notification about it!

tatvafnu

Good interview, I really enjoyed it. One suggestion: fewer interruptions between the slides Chaoyu was presenting would be great; I think they distract the audience. It's only constructive criticism. Keep it going, amazing content!

nachoeigu