Mastering Retrieval for LLMs - BM25, Fine-tuned Embeddings, and Re-Rankers

VIDEO RESOURCES:

TIMESTAMPS:
0:00 Mastering Retrieval (RAG) for LLMs
0:44 Video Overview
13:19 Baseline Performance with No Retrieval
17:29 Document Chunking - Naive vs Sentence based
24:34 BM25
33:20 Semantic / Vector / Embeddings Search
39:59 Cosine vs Dot Product Similarity
43:21 Generating Chunks and Embeddings
50:50 Running BM25 and Similarity Retrieval
55:22 Performance with BM25 vs Similarity
58:36 Fine-tuning embeddings / encoders
1:04:00 Preparing fine-tuning datasets
1:14:54 Embeddings Training Continued
1:22:00 Performance after Fine-tuning
1:25:58 Re-rankers
1:27:10 Cross-encoders
1:30:47 LLM re-rankers
1:36:11 Re-ranking performance
1:48:50 Final Tips

COMMENTS:

Thank you, Trellis!!! Awesome video as always, probably one of the best technical channels right now. Best!

sergialbert

Amazing, this video is a treasure! Thanks a lot for explaining everything in depth. Great job!

MortaAriyano

Amazing video, really the best explanation of the RAG pipeline I've seen on YT. Great job!

TheMariolino

You are the best. Can't wait to try this out over the weekend!

KopikoArepo

Hey Trellis, you may only have about 10k subs, but I really appreciate all your videos. I personally learn and benefit a lot from them, and I always recommend your videos to a friend of mine whenever a detailed explanation is needed. I do have some questions about this video, which is probably my favourite so far, and I'm trying to understand it in every possible way.

1) How can we know whether a model was trained using dot product or cosine similarity?
2) Can you explain whether the dot product used when calculating cosine similarity is the same dot product you were comparing with earlier? Also, could you give an example of normalizing, and could they standardize instead of normalize? I've always been confused about those terms and I'm not sure whether they are related in this case. "In terms of computation, it's quicker to do dot products, because for cosine you're finding the angle between two vectors, which means first doing a dot product and then normalizing."
3) Regarding the retrieval performance, is there any reason why you picked the top 12 chunks? Also, does that mean that if I tried the top 20 chunks I could achieve near 100% accuracy?

seththunder
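
On the cosine vs. dot product question above: cosine similarity is the dot product of the two vectors after each has been L2-normalised (scaled to unit length), which is a different operation from statistical standardisation (subtracting the mean and dividing by the standard deviation). A minimal numpy sketch with toy vectors, purely illustrative:

import numpy as np

a = np.array([3.0, 4.0])   # toy "query" embedding
b = np.array([1.0, 2.0])   # toy "chunk" embedding

dot = np.dot(a, b)   # raw dot product: 11.0 (vector magnitude matters)
cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))   # cosine similarity: ~0.984

a_unit = a / np.linalg.norm(a)   # L2 normalisation to unit length
b_unit = b / np.linalg.norm(b)
dot_of_units = np.dot(a_unit, b_unit)   # equals the cosine similarity: ~0.984

print(dot, cos, dot_of_units)

So if an embedding model outputs already-normalised vectors, dot product and cosine similarity give identical rankings; they only differ when vector length carries information.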

Loved the depth you went into, really enjoyed the video! Quick one: if you were to apply RAG to a dataframe, how would you go about it? Converting to strings and then embedding each row as a chunk feels clunky, but maybe that's the way to go? I guess with the context lengths available at this stage we could almost just convert entire dataframes to strings and feed them in.

SeánCarmody-yp
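
On the dataframe question above, one hedged sketch of the row-as-chunk approach (assumes pandas and sentence-transformers are installed; the model name is just an illustrative choice):

import pandas as pd
from sentence_transformers import SentenceTransformer

df = pd.DataFrame({"player": ["A", "B"], "points": [31, 12]})   # toy dataframe

# Serialise each row as "column: value" pairs so the text keeps its column context.
row_texts = df.apply(
    lambda row: "; ".join(f"{col}: {row[col]}" for col in df.columns), axis=1
).tolist()

model = SentenceTransformer("all-MiniLM-L6-v2")   # illustrative embedding model
row_embeddings = model.encode(row_texts, normalize_embeddings=True)
print(row_texts, row_embeddings.shape)

Whether to embed per row or dump the whole dataframe into the prompt mostly comes down to table size and how precise the retrieval needs to be.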

Most interesting! Thank you for sharing this video with us. I would be most interested if you could try something like LLMLingua to compress the context. Actually, I was wondering about using that on the chunks to make them more efficient. Also, to get a response that could be checked against the knowledge source of the RAG system, I'd be interested in an LLM that can give citations of the relevant source chunks (assigning IDs when chunking, before any compression). Do you have any experience with that? How hard would it be to fine-tune a model for RAG with citations? Thanks!

testcomptetest
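
On the citations idea above, a minimal sketch of assigning chunk IDs at chunking time and asking the model to cite them; the prompt wording here is an assumption, not something from the video:

# Tag each chunk with a stable ID when chunking (before any compression).
chunks = ["The grant covers travel costs.", "Claims must be filed within 30 days."]
tagged = [f"[chunk-{i}] {text}" for i, text in enumerate(chunks)]

context = "\n".join(tagged)
prompt = (
    "Answer using only the context below, and cite the chunk IDs you relied on, "
    "e.g. (chunk-1).\n\n"
    f"Context:\n{context}\n\n"
    "Question: How long do I have to file a claim?"
)
print(prompt)   # send this to whichever LLM you are using

Because the IDs travel with the chunks, any citation in the answer can be checked against the original source text.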

Have you tried your pipeline on a different dataset for the test data? Maybe something like basketball rules instead.

cheeyuanng

Thanks for the great video! How does this solution scale? I can see the benefit of fine-tuning the embeddings for smaller data corpora, but does it do as well for large corpora that have thousands of documents across different domains of knowledge? And does the fine-tuning still help if more documents from a different domain are added later?

unshadowlabs

Hi. Nice video, but I didn't get how to prepare a dataset. How do I get a comprehensive list of questions and answers about my document?

VerdonTrigance
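
On the dataset-preparation question above, one common approach (not necessarily the exact one used in the video) is to have an LLM write a question for each chunk, giving (question, positive chunk) pairs to fine-tune the embedding model on. A sketch using the OpenAI client; the model name and prompt are assumptions:

from openai import OpenAI

client = OpenAI()   # assumes OPENAI_API_KEY is set in the environment
chunks = ["Players may substitute at any dead ball.", "Each quarter lasts 12 minutes."]

pairs = []
for chunk in chunks:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",   # illustrative model choice
        messages=[{
            "role": "user",
            "content": f"Write one short question that this passage answers:\n\n{chunk}",
        }],
    )
    question = resp.choices[0].message.content.strip()
    pairs.append({"question": question, "positive": chunk})

print(pairs)   # use these pairs as training data for the embedding model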

Hey bro, have you experimented with GraphRAG? Appreciate the video. Learning more about RAG every day.

awakenwithoutcoffee

Thanks for this!
What changes would you implement if there were a large number (50+) of PDFs (100+ pages each, with embedded images and text)?

jeevanable

Can you discuss GraphRAG, recently released by Microsoft?

wryltxw

- If the re-ranker is only good for similarity results, why not apply it to those only and then add the BM25 results afterwards?
- Why not also fine-tune the re-ranker?

loicbaconnier
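
On the first question above, a sketch of one possible arrangement (not necessarily what the video recommends): re-rank only the vector-search candidates with a cross-encoder, then union the result with the BM25 hits. The cross-encoder model name is an illustrative choice:

from sentence_transformers import CrossEncoder

query = "How long is each quarter?"
vector_hits = ["Each quarter lasts 12 minutes.", "Players may substitute at any dead ball."]
bm25_hits = ["Overtime periods last 5 minutes.", "Each quarter lasts 12 minutes."]

# The cross-encoder scores each (query, document) pair jointly.
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
scores = reranker.predict([(query, doc) for doc in vector_hits])
reranked = [doc for _, doc in sorted(zip(scores, vector_hits), reverse=True)]

# Merge: re-ranked semantic hits first, then any BM25 hits not already present.
merged = reranked + [doc for doc in bm25_hits if doc not in reranked]
print(merged)

A cross-encoder will happily score BM25 hits too, so re-ranking the combined candidate pool is the other obvious variant to test.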

Any advantage in using BM25 instead of scikit-learn's TF-IDF?

Bragheto
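
For context on the question above, a small sketch contrasting the two scorers; BM25 (here via the rank_bm25 package) adds term-frequency saturation and document-length normalisation on top of what a plain TF-IDF cosine score gives you, which is why it usually does better on keyword-style retrieval:

from rank_bm25 import BM25Okapi
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = ["the cat sat on the mat", "dogs chase cats", "the stock market fell"]
query = "cat on a mat"

# BM25 scores (higher is better)
bm25 = BM25Okapi([doc.split() for doc in corpus])
bm25_scores = bm25.get_scores(query.split())

# TF-IDF cosine scores for the same query
vectorizer = TfidfVectorizer()
doc_vecs = vectorizer.fit_transform(corpus)
query_vec = vectorizer.transform([query])
tfidf_scores = cosine_similarity(query_vec, doc_vecs)[0]

print(bm25_scores, tfidf_scores)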

Trellis, I have one problem. I am working on an NL-to-SQL task. I have written column descriptions for each column of every table in my database, then converted those descriptions into embeddings and stored them. When a user question comes in, I convert it into an embedding and multiply it with the embedding of each column description created earlier, then select the top 20 columns by cosine similarity score. But I usually miss one or two relevant columns this way. The question is just one line, so it doesn't include many details for retrieving the relevant columns, and sometimes irrelevant columns get a higher cosine score than the relevant ones. Do you have any idea how I can approach this? The only solution I see is increasing the number of columns I select, but that increases the size of the prompt I give the LLM, and, as you know, LLMs have a limited context window.

TemporaryForstudy
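
One direction for the column-selection problem above: keep the cheap embedding pass but over-retrieve (say the top 50 columns), then let a cross-encoder re-score the question against each candidate description and keep a smaller top set. This is a sketch, not a recommendation from the video; the model names and candidate counts are illustrative assumptions:

from sentence_transformers import SentenceTransformer, CrossEncoder

question = "Which customers placed orders last month?"
column_descriptions = {
    "customers.name": "Full name of the customer",
    "orders.order_date": "Date the order was placed",
    "orders.customer_id": "Customer who placed the order",
    "products.weight_kg": "Shipping weight of the product",
}
cols = list(column_descriptions)
descs = [column_descriptions[c] for c in cols]

# Stage 1: embedding similarity builds a generous candidate set.
embedder = SentenceTransformer("all-MiniLM-L6-v2")
sims = (embedder.encode([question], normalize_embeddings=True)
        @ embedder.encode(descs, normalize_embeddings=True).T)[0]
candidates = [c for _, c in sorted(zip(sims, cols), reverse=True)][:50]

# Stage 2: a cross-encoder re-scores the question against each candidate description.
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
scores = reranker.predict([(question, column_descriptions[c]) for c in candidates])
top_columns = [c for _, c in sorted(zip(scores, candidates), reverse=True)][:10]
print(top_columns)

Rewriting or expanding the one-line question before the embedding pass (for example, asking an LLM to list the entities and filters it implies) is another cheap way to recover columns that pure cosine similarity misses.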