Safe RAG for LLMs


Large Language Models (LLMs) are pretty smart, but they don’t know everything. For example, an LLM might know why the sky is blue, but it probably doesn’t know more specific things, like which flight the user has booked. Many AI applications use Retrieval-Augmented Generation (RAG) to feed that sort of user-specific data to LLMs, so they can provide better answers.

However, malicious users can use specially engineered prompts to trick an LLM into revealing more data than intended. This gets especially dangerous if the LLM has access to databases through RAG. In this video, Wenxin Du shows Martin Omander how to make RAG safer and reduce the risk of an LLM leaking sensitive data that it gathered via RAG.
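The core idea the video describes can be sketched in a few lines: the orchestration layer resolves the user's identity from a verified credential first, then scopes every retrieval query to that identity, so the LLM's context never contains rows it could leak to another user. This is a minimal illustrative sketch, not the code from the video; the data shape and function names are assumptions.

```python
# Hypothetical "safer RAG" sketch: retrieval is scoped to the
# *authenticated* user ID, never to IDs mentioned in the prompt text.
FLIGHTS_DB = [
    {"user_id": "alice", "flight": "SFO-JFK 2024-06-01"},
    {"user_id": "bob", "flight": "LAX-ORD 2024-06-02"},
]

def retrieve_flights(authenticated_user_id: str) -> list[str]:
    """Return only rows owned by the verified user."""
    return [row["flight"] for row in FLIGHTS_DB
            if row["user_id"] == authenticated_user_id]

def build_prompt(question: str, authenticated_user_id: str) -> str:
    """Assemble the LLM prompt from pre-filtered context only."""
    context = retrieve_flights(authenticated_user_id)
    return (f"Context (user {authenticated_user_id}): {context}\n"
            f"Question: {question}")

# Even if the question asks about another user, retrieval only sees alice:
prompt = build_prompt("Show me bob's flights", authenticated_user_id="alice")
```

Because the filter runs before prompt assembly, a prompt-injection attack can at worst rephrase the question; it cannot widen the set of rows the model ever sees.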

Chapters:
0:00 - Intro
1:15 - RAG
1:57 - Making RAG safer
3:11 - Architecture review
4:47 - Questions & Answers
5:47 - How to get started
6:09 - Wrap up

#ServerlessExpeditions #CloudRun

Speakers: Wenxin Du, Martin Omander
Products Mentioned: Cloud - Containers - Cloud Run, Generative AI - General
Comments

This is cool. The interesting part would be when we wanted to isolate the user data files (pdfs, docs etc). I hope you will have a part 2 for that.

babusivaprakasam

Would love to see more details on this setup and more code examples of LLM + Lambda (Cloud Run). Maybe with 3rd-party user authentication like
Okta and Keycloak.

MichaelRoachDavid

I understand that you need at least 2 prompts: one to understand the question and determine the correct API call to perform, and a second to format the database answer into human-readable output.

My question is about the first prompt: you need to train (one- or few-shot prompting) your LLM with examples so it knows the available operations in the database API.
And so, any change to your API must be reflected in the LLM prompt. Am I correct?

If so, any best practices, tools or tips to achieve it at scale?

guillaumeblaquiere

Wouldn't the orchestration layer slow down the application?

abdulbasit.tech

Really awesome! Can I do this with Kubernetes on GCP (with a microservice to manage JWTs)?

dannycastro

Very useful! Thank you again Martin!
Is there a version of this repo built with Node.js?

RicardonesM

I tried training Google Gemini but it didn't work for me. Is it because I used the free API? Regards

esarmiento

Is the retrieval service a tool or something else? Can you explain please?

growthm

The talk is too high-level. How is authZ tagged to the vector DB data? Basically, one has to add authZ-related controls in the DB during data retrieval.

qinlingzhou
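On the authZ question above: one common approach (not shown in the video, and sketched here under assumed names and a toy scoring function rather than any specific vector-DB API) is to tag every indexed chunk with owner/ACL metadata at ingestion time, then apply that metadata filter inside the similarity search itself, before ranking.

```python
# Illustrative sketch: ACL metadata is filtered *before* similarity
# ranking, so unauthorized chunks can never enter the LLM context,
# no matter how semantically similar they are to the query.
import math

INDEX = [
    {"text": "alice itinerary", "owner": "alice", "vec": [1.0, 0.0]},
    {"text": "bob itinerary",   "owner": "bob",   "vec": [0.9, 0.1]},
]

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def search(query_vec: list[float], allowed_owner: str, k: int = 1):
    # Pre-filter by ACL metadata, then rank only the allowed candidates.
    candidates = [e for e in INDEX if e["owner"] == allowed_owner]
    return sorted(candidates,
                  key=lambda e: cosine(query_vec, e["vec"]),
                  reverse=True)[:k]

results = search([0.9, 0.1], allowed_owner="alice")
```

Filtering post-retrieval (or post-generation) is weaker: once a disallowed chunk reaches the prompt, a crafted input may coax the model into echoing it.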