5 LLM Security Threats- The Future of Hacking?

preview_player
Показать описание
5 LLM Security Threats- The Future of Hacking?

👊 Become a member and get access to GitHub:

Get a FREE 45+ ChatGPT Prompts PDF here:
📧 Join the newsletter:

🌐 My website:

Andrej K:

Today we check what could be the future of hacking and LLM attacks with Jailbreaks and Prompt Injections on LLMs and Multimodal Models

00:00 LLM Attacks Intro
00:18 Prompt Injection Attacks
07:39 Jailbreak Attacks
Рекомендации по теме
Комментарии
Автор

00:28 🛡 Prompt Injection Attack: A technique for large language models (LLMs) allowing attackers to manipulate model output via carefully crafted prompts, potentially accessing sensitive data or executing unauthorized functions.
01:39 🌐 Prompt Injection Example: Demonstrates injecting hidden instructions into web content, manipulating the model's output when interacting with scraped data.
03:42 🖼 Image-based Prompt Injection: Embedding instructions within an image, prompting the model to generate specific responses when processing visual content.
04:47 🔍 Hidden Instructions in Images: Obscuring prompts within images, exploiting the model's response to generate unexpected links or content.
06:22 📰 Prompt Injection via Search Results: Demonstrates how search engine responses can carry manipulated instructions, potentially leading to malicious actions.
07:43 🛠 Jailbreaks on LLMs: Techniques involve manipulating or redirecting the initial prompts of LLMs to generate unintended content, either through prompt or token level jailbreaks.
08:38 🕵‍♂ Token-based Jailbreak Example: Exploiting Base64 encoding to manipulate prompts and generate unexpected responses from the model.
09:49 🐟 Fishing Email Jailbreak: Using encoded prompts to coax the model into generating potentially malicious email content, exploiting its response.
11:37 🐼 Image-based Jailbreak: Demonstrating how carefully designed noise patterns in images can prompt the model to generate unintended responses, posing a new attack surface.
13:29 🔒 Growing Security Concerns: Highlighting the potential escalation of security threats as reliance on LLMs and multimodal models increases, emphasizing the need for a robust security approach.

dameanvil
Автор

🎯 Key Takeaways for quick navigation:

00:00 🧐 *Prompt injection attack is a new technique for manipulating large language models (LLMs) using carefully crafted prompts to make them ignore instructions or perform unintended actions, potentially revealing sensitive data or executing unauthorized functions.*
01:24 📝 *Examples of prompt injection include manipulating websites to execute specific instructions and crafting images or text to influence LLM responses, potentially leading to malicious actions.*
05:25 🚧 *Prompt injection can also involve hiding instructions in images, leading to unexpected behaviors when processed by LLMs, posing security risks.*
07:43 🔒 *Jailbreak attacks manipulate or hijack LLMs' initial prompts to direct them towards malicious actions, including prompt-level and token-level jailbreaks.*
10:03 💻 *Base64 encoding can be used to create malicious prompts that manipulate LLM responses, even when the model is not supposed to provide such information, potentially posing security threats.*
11:37 🐼 *Jailbreaks can involve introducing noise patterns into images, leading to unexpected LLM responses and posing new attack surfaces on multimodal models, such as those handling images and text.*

Made with HARPA AI

zight
Автор

Thanks for keeping us up to date with understandable examples

robboerman
Автор

Until Sunday, what should I do? Okay, I'll soak up this stuff for now. Thanks Kris

GrigoriyMa
Автор

Greate video!

Where do you found this scrapping python tool? did you created?

EricoPanazzolo
Автор

Prompt injection: if anyone develops a website and implement code or content that is use to query or generate an output in the front end. They should not be writing code. That’s like putting and hiding sql in or API keys in the front end.

bladestarX
Автор

can you please make a video about hands-on comparing the new Gemini Pro(Bard) vs GPT3.5 vs GPT4? I am looking for a straight up comparison with real examples but everyone just uses the edited hand picked marketing material which is useless

orbedus
Автор

Heya man, was wondering if you could please do an updated whisper tutorial, please? just one on getting full transcripts with the python code 😀

silentphil
Автор

Thanks for keeping us up to date with understandable examples

enkor