A Helping Hand for LLMs (Retrieval Augmented Generation) - Computerphile

preview_player
Показать описание
Mike Pound discusses how Retrieval Augmented Generation can improve the performance of Large Language Models.

Mike is based at the University of Nottingham's School of Computer Science.

This video was filmed and edited by Sean Riley.

Рекомендации по теме
Комментарии
Автор

I made a program last year that uses RAG to help me study for an exam in refrigeration. I had 4 textbooks that were each 500+ pages. Finding the right page for a topic in my course to learn a specific concept was a nightmare. the textbooks chapters were all over the place. I converted the books to embeddings and stored them in a database. I then would ask a question about my refrigeration concept that I wanted to learn, create an embedding for my query and using a comparison algorithm I would retrieve 10 or 15 of the most mathematically similar textbook pages. After retrieving the textbook pages, I fed the text from those pages and my query into an LLM and an answer would spit out for me. It was a great way to learn my niche subject of refrigeration and helped me pass my exam. asking the same question to the LLM alone without the retrieved textbook pages to assist in the context was not giving me reliable answers.

shutton
Автор

The presented example wasn't quite RAG. You're just putting more text into the context window. This method quickly falls short if you need to process a big set of reference data, like an entire PDF documentation. Real RAG is a bit more complicated and involves an additional step of converting the reference data to tokens that can be stored, then during inference you first convert the query to tokens, then find best matches with stored data, then use that search to generate excerpts from the original data to feed into your final inference window.

mikoaj
Автор

The problem with RAG and LLM's are the same. The risk is that the user takes what is said at face value.
Where RAG really can improve the situation is if the source is provided.
If you have a group of formal documents (such as documents for company procedure) then you should always state the source of that document.
This not only improves the trust of the model, but also narrows down where the user needs to look.

If it is just a black box, it can be hard for the user to know whether the RAG worked or whether it was hallucinating.

penfold-
Автор

The word "Strawberry" actually has two R's. I apologize for any confusion caused earlier. - Chat GPT

dukestt
Автор

Out of all the people they have Mike is the best (IMO) it would be awesome to do a segment with him on how models like Stable Video Diffusion Image-to-Video work

mscotty
Автор

8:12 "Langchain does a lot of other stuff that I'm not using"...langchain in a nutshell

mokopa
Автор

I worked on a RAG to make product recommendations, but eventually I was supplying it with too much data as context and it wouldn't work.

I settled on a neat solution: use GPT's ability to call functions and tell it something like, "when the user asks for a recommendation, call the get_recommendations function with a summary of the user's query". It's cool that it gave me a summary because the embeddings are much better than those of a whole sentence or paragraph. So I could take that embedding and look up products based on semantic similarity to the user's query, while it was still generating a response, and then pass the top 10 back to GPT for it to show the user

alastairzotos
Автор

It's surprisingly bare bones as an approach. I was expecting something more sophisticated than just sticking the context as part of a promt and literally telling the model to use it in the answer. Reminds me of "promp engineers" sticking a _"and please don't lie"_ at the end of a prompt to decrease hallucinations 😂

Tomyb
Автор

I remember having a whole box of the green printer paper. A family friend worked at the state and gave it to me for drawing, etc. some of it had phone numbers and addresses. Long ago in the city dump now.

iabnrk
Автор

Feels illegal to be this early to Prof. Pound's lectures

KylerChin
Автор

Mike, if when you're finished your career in academia and if you find yourself bored, please considering starting a Youtube channel explaining literally anything vaguely related to computing!

neongensis
Автор

More than a few people have been saying recently that yes, most of the LLM output for generic questions is pretty rubbish, but now imagine what happens when most of what they are trained upon is also LLM output. Almost certainly the quality of any results is going to get exponentially worse, no?

BytebroUK
Автор

Doesn't RAG make an LLM more susceptible to prompt injection hijacking. If I can get an LLM to grab data that I control that itself includes prompt injection attacks, then RAG is giving me a way to possibly bypass some of the prompt sanitation that is built into the LLM general interface. The hacker WunderWuzzi seems to leverage these edge cases in a lot of his recent AI security research.

jimjones
Автор

Given that LLMs hold knowledge in their weights, but are also taking in knowledge through RAG I wonder how those things interact if the sources of knowledge conflict... Specifically what happens if an LLMs learned knowledge falls behind compared to stuff like Wikipedia that's updated constantly? Or on the opposite end, can you poison a model using RAG by deliberately feeding in bad knowledge as part of that additional context...

Imperial_Squid
Автор

Funny. just had to do this in a hackathon last week :)

garcipat
Автор

I'm a simple person.
I see Mike Pound, I click on the video.

frankbucciantini
Автор

It would be interesting to get a sense of how much the context helps. What -would- the answer have been without it? If I really did have context about things the model itself could not have learned, how does it do?

jameshiggins-thomas
Автор

9:22 Something about writing the initial prompt to the LLM in second person has always rubbed me the wrong way. Wouldn't it play a lot better to the strengths of an LLM to write a prompt like "below is a transcript of a conversation where a chatbot successfully answers a user's question" rather than a prompt like "You are an AI assistant who answers questions"?

I understand that instruction-tuned models are tuned to handle these second-person prompts, but it seems like a weird stopgap.

HeroOfHyla
Автор

I am JUST now studying this. This very uncanny computerphile

thecompanioncube
Автор

The fact this channel doesnt have daily uploads is sad af

Xjaychax