Building secure LLM applications


When LLM-based systems work well and are deployed securely, they generate a lot of value for users and businesses. If they are compromised, however, they can quickly turn into PR and legal disasters.

Knowing the major attack vectors for hacking LLMs, and the remedies that address them, can help you create more robust systems and deploy with confidence.

In the second installment of our series on working with LLMs, we explain 27 of the most successful attack techniques, based on a global prompt hacking competition. We also step through strategies for evaluating data privacy risks and red teaming your systems. Finally, we explore remedies such as prompt-based defenses, detectors, and guardrails.

00:00 Why understanding risks and remedies is important
01:21 Prompt hacking
02:37 Prompt injection
03:51 Jailbreaking
05:45 Prompt hacking techniques
23:41 Data privacy
24:16 Discoverable memorisation
26:20 Extractable memorisation
27:42 Implications - discoverable and extractable
29:10 Remedies
29:21 Red teaming
32:43 Prompt-based defenses
35:15 Detectors
36:32 Guardrails and supervision
39:00 Conclusion