Safe RAG for LLMs


Large Language Models (LLMs) are pretty smart, but they don’t know everything. For example, an LLM might know why the sky is blue, but it probably doesn’t know more specific things, like which flight the user has booked. Many AI applications use Retrieval-Augmented Generation (RAG) to feed that sort of user-specific data to LLMs, so they can provide better answers.

However, malicious users can use specially engineered prompts to trick an LLM into revealing more data than intended. This gets especially dangerous if the LLM has access to databases through RAG. In this video, Wenxin Du shows Martin Omander how to make RAG safer and reduce the risk of an LLM leaking sensitive data that it gathered via RAG.
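The core idea the video describes can be sketched in a few lines: the orchestration layer resolves the user's identity from a verified credential first, then scopes every retrieval query to that identity, so the LLM's context never contains rows it could leak to another user. This is a minimal illustrative sketch, not the code from the video; the data shape and function names are assumptions.

```python
# Hypothetical "safer RAG" sketch: retrieval is scoped to the
# *authenticated* user ID, never to IDs mentioned in the prompt text.
FLIGHTS_DB = [
    {"user_id": "alice", "flight": "SFO-JFK 2024-06-01"},
    {"user_id": "bob", "flight": "LAX-ORD 2024-06-02"},
]

def retrieve_flights(authenticated_user_id: str) -> list[str]:
    """Return only rows owned by the verified user."""
    return [row["flight"] for row in FLIGHTS_DB
            if row["user_id"] == authenticated_user_id]

def build_prompt(question: str, authenticated_user_id: str) -> str:
    """Assemble the LLM prompt from pre-filtered context only."""
    context = retrieve_flights(authenticated_user_id)
    return (f"Context (user {authenticated_user_id}): {context}\n"
            f"Question: {question}")

# Even if the question asks about another user, retrieval only sees alice:
prompt = build_prompt("Show me bob's flights", authenticated_user_id="alice")
```

Because the filter runs before prompt assembly, a prompt-injection attack can at worst rephrase the question; it cannot widen the set of rows the model ever sees.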

Chapters:
0:00 - Intro
1:15 - RAG
1:57 - Making RAG safer
3:11 - Architecture review
4:47 - Questions & Answers
5:47 - How to get started
6:09 - Wrap up

#ServerlessExpeditions #CloudRun

Speakers: Wenxin Du, Martin Omander
Products Mentioned: Cloud - Containers - Cloud Run, Generative AI - General
Comments

This is cool. The interesting part would be when we wanted to isolate the user data files (pdfs, docs etc). I hope you will have a part 2 for that.

babusivaprakasam

Would love to see more details on this setup and more code examples of LLM + Lambda (Cloud Run). Maybe with 3rd-party user authentication like
Okta and Keycloak.

MichaelRoachDavid

I understand that you need at least 2 prompts: one to understand the question and determine the correct API call to perform, and a second to format the database answer into human-readable output.

My question is about the first prompt: you need to train (one- or few-shot prompting) your LLM with examples so it knows the available operations in the database API.
And so, any change to your API must be reflected in the LLM prompt. Am I correct?

If so, any best practices, tools or tips to achieve it at scale?

guillaumeblaquiere

Wouldn't the orchestration layer slow down the application?

abdulbasit.tech

Really awesome! Can I do this with Kubernetes on GCP (with a microservice to manage JWTs)?

dannycastro

Very useful! Thank you again Martin!
Is there a version of this repo built with Node.js?

RicardonesM

I tried training Google Gemini but it didn't work for me. Is it because I used the free API? Regards

esarmiento

Is the retrieval service a tool or something else? Can you explain please?

growthm

The talk is too high-level. How is authZ tagged to the vector DB data? Basically, one has to add authZ-related controls in the DB during data retrieval.

qinlingzhou
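On the authZ question above: one common approach (not shown in the video, and sketched here under assumed names and a toy scoring function rather than any specific vector-DB API) is to tag every indexed chunk with owner/ACL metadata at ingestion time, then apply that metadata filter inside the similarity search itself, before ranking.

```python
# Illustrative sketch: ACL metadata is filtered *before* similarity
# ranking, so unauthorized chunks can never enter the LLM context,
# no matter how semantically similar they are to the query.
import math

INDEX = [
    {"text": "alice itinerary", "owner": "alice", "vec": [1.0, 0.0]},
    {"text": "bob itinerary",   "owner": "bob",   "vec": [0.9, 0.1]},
]

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def search(query_vec: list[float], allowed_owner: str, k: int = 1):
    # Pre-filter by ACL metadata, then rank only the allowed candidates.
    candidates = [e for e in INDEX if e["owner"] == allowed_owner]
    return sorted(candidates,
                  key=lambda e: cosine(query_vec, e["vec"]),
                  reverse=True)[:k]

results = search([0.9, 0.1], allowed_owner="alice")
```

Filtering post-retrieval (or post-generation) is weaker: once a disallowed chunk reaches the prompt, a crafted input may coax the model into echoing it.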