RAG But Better: Rerankers with Cohere AI

Rerankers have been a common component of retrieval pipelines for many years. They add a final "reranking" step to a retrieval pipeline — like with Retrieval Augmented Generation (RAG) — that can dramatically improve its accuracy.

In this video we'll learn about rerankers, how they compare to the more common embedding-only retrieval setup, and how we can build retrieval pipelines with reranking using Cohere's reranking model. We'll also be using the (more typical) OpenAI text-embedding-ada-002 model with the Pinecone Vector Database.
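The two-stage retrieve-then-rerank flow described above can be sketched with toy data. Here, plain-Python cosine similarity stands in for the text-embedding-ada-002 + Pinecone retrieval stage, and hard-coded relevance scores stand in for Cohere's rerank model; in a real pipeline both stages would be API calls, and the vectors and scores below are made up for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Stage 1: fast vector retrieval (stand-in for ada-002 embeddings + Pinecone).
corpus = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.7, 0.6, 0.2],
    "doc_c": [0.1, 0.9, 0.3],
}
query_vec = [0.8, 0.5, 0.1]
candidates = sorted(corpus, key=lambda d: cosine(query_vec, corpus[d]),
                    reverse=True)[:2]
print(candidates)  # retrieval order by embedding similarity

# Stage 2: rerank only the retrieved candidates with a cross-encoder-style
# relevance score (stand-in for Cohere's rerank model, which scores each
# (query, document) pair jointly).
cross_scores = {"doc_a": 0.95, "doc_b": 0.40}
reranked = sorted(candidates, key=lambda d: cross_scores[d], reverse=True)
print(reranked)  # final order after reranking
```

Note the design: the cheap similarity search narrows the corpus to a shortlist, and the slower, more accurate reranker only has to score that shortlist, which can flip the final ordering.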

📌 Code (08:32):

📚 Article:

🌲 Subscribe for Latest Articles and Videos:

👋🏼 AI Consulting:

👾 Discord:

00:00 RAG and Rerankers
01:25 Problems of Retrieval Only
04:32 How Embedding Models Work
06:34 How Rerankers Work
08:20 Implementing Reranking in Python
13:11 Testing Retrieval without Reranking
15:21 Retrieval with Cohere Reranking
21:54 Tips for Reranking

#artificialintelligence #nlp #ai #openai
Comments

Cohere just released Rerank 3 and it worked incredibly well with OpenAI's embedding 3 model; thanks for your kind intro

slayermm

You’ve got top-notch editing + technical explanations, and none of that is easy. The amount of work it takes to create a 20-minute video and be cohesive on such a topic is amazing. Thanks! 🔥 All your videos are so helpful and just interesting to watch and learn from.

justinwlin

My approach is to let the LLM summarize the user's input first. The prompt could be written as: "Summarize the user's request to semantically search relevant documents, in English." The output of the LLM's summarization can then be embedded and used to query the vector database. This approach may potentially increase retrieval accuracy.
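[Editor's note: the summarize-then-query idea above can be sketched like this. A toy filler-word filter stands in for the LLM summarization call; the function name, word list, and example query are illustrative only, not from the video.]

```python
# Toy stand-in for an LLM summarization step. In practice this would be a
# chat-completion call with a prompt like: "Summarize the user's request
# to semantically search relevant documents, in English."
FILLER = {"hi", "could", "you", "please", "me", "find", "about", "with"}

def summarize_query(user_input: str) -> str:
    """Condense a chatty user request into a terse search query."""
    tokens = [w.strip(",.?!").lower() for w in user_input.split()]
    return " ".join(w for w in tokens if w and w not in FILLER)

raw = "Hi, could you please find me docs about reranking with Cohere?"
condensed = summarize_query(raw)
print(condensed)  # the condensed query is then embedded and sent to the vector DB
```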

real-ethan

I learned a lot from this, thank you. You say you plan a series, and you mentioned other topics for it, but those topics weren't about rerankers. I noted that this video treats rerankers as black boxes, so you could even expand the series. I for sure would be interested in: what are the most recent reranking models, how do rerankers work, and is it feasible to build a reranker yourself, or does that require, just like a transformer, scraping the entire language/internet? In other words, this video was very interesting, but now that I know about rerankers I have lots and lots of questions about them.

jantuitman

Each video gets better! Thank you for your work!

gitmaxd

Thanks man. You are improving my hobby projects in real time.

adityavd

My god, thank you 🙏 As someone who only rebuilds the wheel, your content is very much appreciated.

Cdaprod

Thank you for this video. I've been stuck in the RAG realm with LlamaIndex and not satisfied. I was thinking of doing similar reranking manually; I'll try Cohere today instead.

narayangopalmaharjan

Any benchmarks? Otherwise it's kind of empirical and just seems like a sponsored video by Cohere.

frazuppi

Can anyone explain this? Why are we using a reranker to rank? Isn't that the job of the retriever (to rank on the basis of cosine similarity or something else, and return the relevant chunks)?

Ishant

Top notch material, James. Much appreciated 🎉🎉 Really curious to see what kind of difference this makes in my projects. Thanks!

jellederijke

I've been doing this with transformers. I think there's a lot to think about in doing this efficiently, but it does get the best results!

Jandodev

Hi, thank you so much for this content! Do you think that parameters like document chunk size and overlap are important for RAG accuracy? Should we fine-tune them in some way?

matteomarjanovic

Thank you for making this. Fascinating.

Shaunmcdonogh-shaunsurfing

Great video, looking forward to more on this!

samwilletts

Why can't we just select relevant records according to the index? Do we need to select all records from top to bottom every time?

Youtube_premiumhquq

I don't get the part where you feed both documents into the same transformer. If the transformer outputs only one array, what are you comparing it to? You have only one array to compare against... nothing? What did I miss?

hetnon

I don’t understand how re-ranking adds anything. You’re giving it the same query again, and the documents have already been matched with vector similarity. What additional information is it using to improve the ranking? Thanks!

dato

You mentioned some better approaches than reranking. Any hints as to what those might be? (Curious to know if they involve fine-tuning the LLM with the data too.)

Shaunmcdonogh-shaunsurfing

Any chance you can show examples with open-source reranking, like JinaAI-v2-base-en, for example?

LoVeRSaMa