A Helping Hand for LLMs (Retrieval Augmented Generation) - Computerphile

Показать описание

Mike Pound discusses how Retrieval Augmented Generation can improve the performance of Large Language Models.

Mike is based at the University of Nottingham's School of Computer Science.

This video was filmed and edited by Sean Riley.

Рекомендации по теме

Комментарии

I made a program last year that uses RAG to help me study for an exam in refrigeration. I had 4 textbooks that were each 500+ pages. Finding the right page for a topic in my course to learn a specific concept was a nightmare. the textbooks chapters were all over the place. I converted the books to embeddings and stored them in a database. I then would ask a question about my refrigeration concept that I wanted to learn, create an embedding for my query and using a comparison algorithm I would retrieve 10 or 15 of the most mathematically similar textbook pages. After retrieving the textbook pages, I fed the text from those pages and my query into an LLM and an answer would spit out for me. It was a great way to learn my niche subject of refrigeration and helped me pass my exam. asking the same question to the LLM alone without the retrieved textbook pages to assist in the context was not giving me reliable answers.

shutton

The presented example wasn't quite RAG. You're just putting more text into the context window. This method quickly falls short if you need to process a big set of reference data, like an entire PDF documentation. Real RAG is a bit more complicated and involves an additional step of converting the reference data to tokens that can be stored, then during inference you first convert the query to tokens, then find best matches with stored data, then use that search to generate excerpts from the original data to feed into your final inference window.

mikoaj

The problem with RAG and LLM's are the same. The risk is that the user takes what is said at face value.
Where RAG really can improve the situation is if the source is provided.
If you have a group of formal documents (such as documents for company procedure) then you should always state the source of that document.
This not only improves the trust of the model, but also narrows down where the user needs to look.

If it is just a black box, it can be hard for the user to know whether the RAG worked or whether it was hallucinating.

penfold-

The word "Strawberry" actually has two R's. I apologize for any confusion caused earlier. - Chat GPT

dukestt

Out of all the people they have Mike is the best (IMO) it would be awesome to do a segment with him on how models like Stable Video Diffusion Image-to-Video work

mscotty

8:12 "Langchain does a lot of other stuff that I'm not using"...langchain in a nutshell

mokopa

I worked on a RAG to make product recommendations, but eventually I was supplying it with too much data as context and it wouldn't work.

I settled on a neat solution: use GPT's ability to call functions and tell it something like, "when the user asks for a recommendation, call the get_recommendations function with a summary of the user's query". It's cool that it gave me a summary because the embeddings are much better than those of a whole sentence or paragraph. So I could take that embedding and look up products based on semantic similarity to the user's query, while it was still generating a response, and then pass the top 10 back to GPT for it to show the user

alastairzotos

It's surprisingly bare bones as an approach. I was expecting something more sophisticated than just sticking the context as part of a promt and literally telling the model to use it in the answer. Reminds me of "promp engineers" sticking a _"and please don't lie"_ at the end of a prompt to decrease hallucinations 😂

Tomyb

I remember having a whole box of the green printer paper. A family friend worked at the state and gave it to me for drawing, etc. some of it had phone numbers and addresses. Long ago in the city dump now.

iabnrk

Feels illegal to be this early to Prof. Pound's lectures

KylerChin

Mike, if when you're finished your career in academia and if you find yourself bored, please considering starting a Youtube channel explaining literally anything vaguely related to computing!

neongensis

More than a few people have been saying recently that yes, most of the LLM output for generic questions is pretty rubbish, but now imagine what happens when most of what they are trained upon is also LLM output. Almost certainly the quality of any results is going to get exponentially worse, no?

BytebroUK

Doesn't RAG make an LLM more susceptible to prompt injection hijacking. If I can get an LLM to grab data that I control that itself includes prompt injection attacks, then RAG is giving me a way to possibly bypass some of the prompt sanitation that is built into the LLM general interface. The hacker WunderWuzzi seems to leverage these edge cases in a lot of his recent AI security research.

jimjones

Given that LLMs hold knowledge in their weights, but are also taking in knowledge through RAG I wonder how those things interact if the sources of knowledge conflict... Specifically what happens if an LLMs learned knowledge falls behind compared to stuff like Wikipedia that's updated constantly? Or on the opposite end, can you poison a model using RAG by deliberately feeding in bad knowledge as part of that additional context...

Imperial_Squid

Funny. just had to do this in a hackathon last week :)

garcipat

I'm a simple person.
I see Mike Pound, I click on the video.

frankbucciantini

It would be interesting to get a sense of how much the context helps. What -would- the answer have been without it? If I really did have context about things the model itself could not have learned, how does it do?

jameshiggins-thomas

9:22 Something about writing the initial prompt to the LLM in second person has always rubbed me the wrong way. Wouldn't it play a lot better to the strengths of an LLM to write a prompt like "below is a transcript of a conversation where a chatbot successfully answers a user's question" rather than a prompt like "You are an AI assistant who answers questions"?

I understand that instruction-tuned models are tuned to handle these second-person prompts, but it seems like a weird stopgap.

HeroOfHyla

I am JUST now studying this. This very uncanny computerphile

thecompanioncube

The fact this channel doesnt have daily uploads is sad af

Xjaychax

A Helping Hand for LLMs (Retrieval Augmented Generation) - Computerphile

A Helping Hand for LLMs (Retrieval Augmented Generation) - Computerphile

Mano: Your LLM-powered helping hand

LLM Explained | What is LLM

AI as a helping hand, not the final answer: Promoting responsible usage in research

'Top 5 Business Applications of LLM Agents! 🚀 AI-Powered Super Assistants'

EfficientML.ai Lecture 13 - LLM Deployment Techniques (MIT 6.5940, Fall 2024)

Are AI Coding Assistants Helping or Hurting You? my thoughts on LLM coding and AI assistants.

Jay Alammar on LLMs, RAG, and AI Engineering

Can LLMs Generate Novel Research Ideas? A Large Scale Human Study with 100+ NLP Researchers 2409

Lessons From A Year Building With LLMs

Best 12 AI Tools in 2023

Hands-On with Large Language Models: A Must-Read #llms

'Catching up on the weird world of LLMs' - Simon Willison (North Bay Python 2023)

Retrieval Augmented Generation explained for Beginners | RAG in LLMs

All Indian LLMs explained in 10 minutes

EfficientML.ai Lecture 14 - LLM Post-Training (MIT 6.5940, Fall 2024)

EfficientML.ai Lecture 13 - LLM Deployment Techniques (MIT 6.5940, Fall 2024, Zoom Recording)

Hands on LLM and RAG

Building with Instruction-Tuned LLMs: A Step-by-Step Guide

Generative AI with Large Language Models: Hands-On Training feat. Hugging Face and PyTorch Lightning

EfficientML.ai Lecture 14 - LLM Post-Training (MIT 6.5940, Fall 2024, Zoom Recording)

Mirror, mirror: LLMs and the illusion of humanity - Jodie Burchell - NDC Oslo 2024

Arize AI Phoenix: Open-Source Tracing & Evaluation for AI (LLM/RAG/Agent)

Running Microsoft Phi-3 LLM using Hugging Face | Hands-on deployment