RAG But Better: Rerankers with Cohere AI

Rerankers have been a common component of retrieval pipelines for many years. They add a final "reranking" step to a retrieval pipeline — like with Retrieval Augmented Generation (RAG) — that can dramatically improve its accuracy.

In this video we'll learn about rerankers, how they compare to the more common embedding-only retrieval setup, and how we can build retrieval pipelines with reranking using Cohere's reranking model. We'll also be using the (more typical) OpenAI text-embedding-ada-002 model with the Pinecone Vector Database.
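The two-stage retrieve-then-rerank flow described above can be sketched with toy data. Here, plain-Python cosine similarity stands in for the text-embedding-ada-002 + Pinecone retrieval stage, and hard-coded relevance scores stand in for Cohere's rerank model; in a real pipeline both stages would be API calls, and the vectors and scores below are made up for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Stage 1: fast vector retrieval (stand-in for ada-002 embeddings + Pinecone).
corpus = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.7, 0.6, 0.2],
    "doc_c": [0.1, 0.9, 0.3],
}
query_vec = [0.8, 0.5, 0.1]
candidates = sorted(corpus, key=lambda d: cosine(query_vec, corpus[d]),
                    reverse=True)[:2]
print(candidates)  # retrieval order by embedding similarity

# Stage 2: rerank only the retrieved candidates with a cross-encoder-style
# relevance score (stand-in for Cohere's rerank model, which scores each
# (query, document) pair jointly).
cross_scores = {"doc_a": 0.95, "doc_b": 0.40}
reranked = sorted(candidates, key=lambda d: cross_scores[d], reverse=True)
print(reranked)  # final order after reranking
```

Note the design: the cheap similarity search narrows the corpus to a shortlist, and the slower, more accurate reranker only has to score that shortlist, which can flip the final ordering.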

📌 Code (08:32):

📚 Article:

🌲 Subscribe for Latest Articles and Videos:

👋🏼 AI Consulting:

👾 Discord:

00:00 RAG and Rerankers
01:25 Problems of Retrieval Only
04:32 How Embedding Models Work
06:34 How Rerankers Work
08:20 Implementing Reranking in Python
13:11 Testing Retrieval without Reranking
15:21 Retrieval with Cohere Reranking
21:54 Tips for Reranking

#artificialintelligence #nlp #ai #openai
Comments

Cohere just released Rerank 3 and it worked incredibly well with OpenAI's embedding 3 model; thanks for your kind intro

slayermm

You’ve got top-notch editing + technical explanations, and none of that is easy. The amount of work it takes to create a 20-minute video and be cohesive on such a topic is amazing. Thanks! 🔥 All your videos are so helpful and just interesting to watch and learn from.

justinwlin

My approach is to let the LLM summarize the user's input first. The prompt could be written as: "Summarize the user's request to semantically search relevant documents, in English." The output of the LLM's summarization can then be embedded and used to query the vector database. This approach may potentially increase retrieval accuracy.
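[Editor's note: the summarize-then-query idea above can be sketched like this. A toy filler-word filter stands in for the LLM summarization call; the function name, word list, and example query are illustrative only, not from the video.]

```python
# Toy stand-in for an LLM summarization step. In practice this would be a
# chat-completion call with a prompt like: "Summarize the user's request
# to semantically search relevant documents, in English."
FILLER = {"hi", "could", "you", "please", "me", "find", "about", "with"}

def summarize_query(user_input: str) -> str:
    """Condense a chatty user request into a terse search query."""
    tokens = [w.strip(",.?!").lower() for w in user_input.split()]
    return " ".join(w for w in tokens if w and w not in FILLER)

raw = "Hi, could you please find me docs about reranking with Cohere?"
condensed = summarize_query(raw)
print(condensed)  # the condensed query is then embedded and sent to the vector DB
```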

real-ethan

I learned a lot from this, thank you. You say you plan a series, and you mentioned other topics for it, but those topics weren't about rerankers. I noted that this video treats rerankers as black boxes, so you could even expand the series. I for sure would be interested in: what are the most recent reranking models, how do rerankers work, and is it feasible to build a reranker yourself, or does that require, just like a transformer, scraping the entire language/internet? In other words, this video was very interesting, but now that I know about rerankers I have lots and lots of questions about them.

jantuitman

Each video gets better! Thank you for your work!

gitmaxd

Thanks man. You are improving my hobby projects in real time.

adityavd

My god, thank you 🙏 As someone who only rebuilds the wheel, your content is very much appreciated.

Cdaprod

Thank you for this video. I've been stuck in the RAG realm with LlamaIndex and not satisfied. I was thinking of doing similar reranking manually; I'll try Cohere today instead.

narayangopalmaharjan

Any benchmarks? Otherwise it's kind of empirical and just seems like a sponsored video by Cohere.

frazuppi

Can anyone explain this? Why are we using a reranker to rank? Isn't that the job of the retriever (to rank on the basis of cosine similarity or something else, and return the relevant chunks)?

Ishant

Top notch material, James. Much appreciated 🎉🎉 Really curious to see what kind of difference this makes in my projects. Thanks!

jellederijke

I've been doing this with transformers. I think there's a lot to think about in doing this efficiently, but it does get the best results!

Jandodev

Hi, thank you so much for this content! Do you think that parameters like document chunk size and overlap are important for RAG accuracy? Should we fine-tune them in some way?

matteomarjanovic

Thank you for making this. Fascinating.

Shaunmcdonogh-shaunsurfing

Great video, looking forward to more on this!

samwilletts

Why can't we just select relevant records according to the index? Do we need to select all records from top to bottom every time?

Youtube_premiumhquq

I don't get the part where you feed both documents into the same transformer. If the transformer outputs only one array, what are you comparing it to? You have only one array to compare against... nothing? What did I miss?

hetnon

I don’t understand how re-ranking adds anything. You’re giving it the same query again, and the documents have already been matched with vector similarity. What additional information is it using to improve the ranking? Thanks!

dato

You mentioned some better approaches than reranking. Any hints as to what those might be? (Curious to know if they involve fine-tuning the LLM with the data too.)

Shaunmcdonogh-shaunsurfing

Any chance you can show examples with open-source reranking, like JinaAI-v2-base-en, for example?

LoVeRSaMa