How to Improve LLMs with RAG (Overview + Python Code)


In this video, I give a beginner-friendly introduction to retrieval augmented generation (RAG) and show how to use it to improve a fine-tuned model from a previous video in this LLM series.

Resources

--

Socials

The Data Entrepreneurs

Support ❤️

Intro - 0:00
Background - 0:53
2 Limitations - 1:45
What is RAG? - 2:51
How RAG works - 5:03
Text Embeddings + Retrieval - 5:35
Creating Knowledge Base - 7:37
Example Code: Improving YouTube Comment Responder with RAG - 9:34
What's next? - 20:58
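The embedding-and-retrieval flow outlined above (embed the chunks, embed the question, retrieve the nearest chunks, and put them in the prompt) can be sketched in a few lines. This is a minimal illustration, not the video's actual code: the hard-coded vectors stand in for a real embedding model, and the chunk texts are made up.

```python
from math import sqrt

# Toy 4-dim vectors stand in for a real embedding model; the chunk texts
# are made-up examples, not the video's actual knowledge base.
knowledge_base = {
    "Fine-tuning adapts a model's weights to a task.": [0.9, 0.1, 0.0, 0.1],
    "RAG injects retrieved context into the prompt.":  [0.1, 0.9, 0.2, 0.0],
    "Embeddings map text to dense vectors.":           [0.2, 0.3, 0.9, 0.1],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

def retrieve(query_vec, kb, k=2):
    """Return the k chunk texts most similar to the query embedding."""
    ranked = sorted(kb, key=lambda text: cosine(query_vec, kb[text]), reverse=True)
    return ranked[:k]

query_vec = [0.15, 0.85, 0.1, 0.0]  # pretend embedding of the user's question
context = retrieve(query_vec, knowledge_base)
prompt = "Answer using this context:\n" + "\n".join(context) + "\n\nQuestion: ..."
```

In a real pipeline the vectors would come from an embedding model and the chunks from your documents; only the retrieve-then-prompt shape carries over.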
Comments

Check out more videos in this series 👇


--

Resources

ShawhinTalebi

Thank you Talebi. No one explains the subject like you

saadowain

This is so helpful! Thanks Shaw, you never miss!

ifycadeau

Very nice, thank you for explaining in detail.

jagtapjaidip

Awesome video, thanks! I'm wondering: instead of using the top_k documents/chunks, could one define a similarity threshold/distance to decide which chunks get used?

firespark
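A threshold-based variant of the retrieval step, as the comment above asks about, might look like the following sketch. The 2-dim vectors and the 0.75 cutoff are made-up illustrations; a real system would use model-generated embeddings and a threshold tuned on your own data.

```python
from math import sqrt

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

# Made-up 2-dim embeddings for illustration only.
chunks = {
    "relevant chunk":   [0.9, 0.1],
    "borderline chunk": [0.6, 0.8],
    "off-topic chunk":  [0.0, 1.0],
}

def retrieve_above(query_vec, kb, min_sim=0.75):
    """Keep every chunk whose similarity clears min_sim (a variable number
    of chunks), instead of always taking a fixed top-k count."""
    scored = [(cosine(query_vec, vec), text) for text, vec in kb.items()]
    return [text for sim, text in sorted(scored, reverse=True) if sim >= min_sim]

query_vec = [1.0, 0.0]
kept = retrieve_above(query_vec, chunks)               # only "relevant chunk" clears 0.75
more = retrieve_above(query_vec, chunks, min_sim=0.5)  # "borderline chunk" now included
```

The trade-off: a threshold can return zero chunks for off-topic questions (often desirable) or many chunks for broad ones, so the prompt length becomes variable.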

Thank you so much, becoming a fan of yours!
Please do a video on RAG with LlamaIndex + Llama 3, if it's free and not paid.

zahrahameed

Great video as always 👍
Does a reranker improve the quality of the output in a RAG approach? Could we then take the chunks directly from the reranker's output? What is your experience with rerankers?

nistelbergerkurt
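On the reranker question above: a reranker is typically a second stage that reorders the first-stage candidates; you then feed the top reranked chunks to the LLM rather than taking any "output" from the reranker itself. A toy sketch of that two-stage shape, where a word-overlap score stands in for a real cross-encoder model:

```python
def toy_rerank_score(query, chunk):
    # Stand-in for a real cross-encoder, which would score the (query, chunk)
    # pair jointly with a model; here we just use word overlap.
    q_words, c_words = set(query.lower().split()), set(chunk.lower().split())
    return len(q_words & c_words) / len(q_words)

def rerank(query, candidates, top_n=2):
    """Second stage: reorder first-stage retrieval candidates by the
    (query, chunk) relevance score, then keep the best few for the prompt."""
    return sorted(candidates, key=lambda c: toy_rerank_score(query, c), reverse=True)[:top_n]

candidates = [  # e.g., the top hits from a first-stage vector search
    "embedding models map text to vectors",
    "a reranker scores query and chunk jointly",
    "fine-tuning changes model weights",
]
best = rerank("how does a reranker score a chunk", candidates)
```

Rerankers often help precision because they read query and chunk together, but they are slower than vector search, which is why they usually run only on a small candidate set.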

Happy Nowruz, kheyli khoob! Question: how would you propose evaluating a document against a set of guidelines? I mean, checking how far it complies with the guidelines or regulations for writing a certain kind of document. Is RAG any good for this? Should we just embed the guidelines in the prompt right before writing, or store the guidelines as a separate document and do RAG? Or something else?

Pythonology

Hey Shaw, thanks so much for such a helpful video.
I'd love to seek your advice on something :)

Currently we are using OpenAI to build out a bunch of insights that are refreshed from business data (e.g., X users land on your page, Y make a purchase).
Right now we do a lot of data preparation and feed the specific numbers into the user/system prompt before passing it to OpenAI, but we have had issues with output consistency and incorrect numbers.

Would you recommend a fine-tuning approach for this? Or RAG? Or is the context small enough to fit into the context window, given it's a very small dataset we are adding to the prompt?
Thanks in advance 🙂

candidlyvivian

Nice video! Any ideas for doing this on PowerPoints? I want to build a kind of knowledge base from previous projects, but the graphics are a problem. Even GPT-4V doesn't always interpret them correctly. 😢

TheLordSocke

Hi Talebi, thanks for all you show us. One question: I ran your code with my own database, without the fine-tuning, and it works, with very quick answers but poor content. Is that the point of fine-tuning, to get better answers?

edsleite

So we get the top 3 most similar chunks from retrieval, right? And we add those 3 chunks to the prompt template?

vamsitharunkumarsunku
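The step asked about above (placing the top-k retrieved chunks into the prompt template ahead of the question) can be sketched as follows. The template wording is illustrative, not the video's exact prompt:

```python
def build_prompt(question, chunks):
    """Insert the retrieved chunks into the prompt template ahead of the
    user's question; this is the only way the model sees the chunks."""
    context = "\n---\n".join(chunks)
    return (
        "Use the context below to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

top_3 = ["chunk one", "chunk two", "chunk three"]  # e.g., the 3 nearest neighbors
prompt = build_prompt("What is RAG?", top_3)
```

Separators between chunks (here `---`) help the model treat them as distinct passages rather than one run-on document.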

Any recommendations or experience on which embedding database to use?

halle

RAG is great when the knowledge base is static or semi-static content, but what approach do you use for dynamic, time-sensitive data, like current sales figures from a database?

TheRcfrias
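One common answer to the question above is not to embed dynamic values at all, but to query the live source at request time and inject the result into the prompt. A minimal sketch, using an in-memory SQLite table as a stand-in for a real sales database:

```python
import sqlite3

# An in-memory table stands in for the live sales database mentioned above.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sales (day TEXT, amount REAL)")
db.executemany("INSERT INTO sales VALUES (?, ?)",
               [("2024-06-01", 120.0), ("2024-06-02", 80.5)])

def fresh_context():
    """Query the live source at request time instead of embedding it,
    so the prompt always reflects the current numbers."""
    (total,) = db.execute("SELECT SUM(amount) FROM sales").fetchone()
    return f"Current total sales: {total}"

prompt = fresh_context() + "\n\nQuestion: How are sales trending?"
```

Because the lookup happens per request, the numbers are never stale; embeddings remain useful for the static documents around them.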

Hello, do you have a video showing how to make a dataset and upload it to Hugging Face?

jjen

What do you mean by "not to scale"? Isn't the book the size of the earth?

CppExpedition

How do you protect a company's information when using this technology?

JavierTorres-stgt

Vector retrieval alone is often not accurate enough, trust me. To improve retrieval accuracy, you need to combine multiple methods.

yameen
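One common way to combine multiple retrieval methods, as the comment above suggests, is Reciprocal Rank Fusion (RRF): merge the ranked lists produced by different retrievers (say, keyword search and vector search) into a single ranking. A minimal sketch with hypothetical document IDs:

```python
def rrf(rankings, k=60):
    """Reciprocal Rank Fusion: merge several ranked result lists (e.g., one
    from BM25 keyword search, one from vector search) into one ranking.
    Each document's score is the sum of 1/(k + rank) across the lists."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["doc_a", "doc_c", "doc_b"]  # hypothetical BM25 ranking
vector_hits  = ["doc_b", "doc_a", "doc_d"]  # hypothetical embedding ranking
fused = rrf([keyword_hits, vector_hits])    # docs ranked well by both rise to the top
```

RRF needs no score calibration between the retrievers, only their rank order, which is why it is a popular default for hybrid search.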