Securing AI: Red Teaming & Attack Strategies for Machine Learning Systems

Показать описание

Welcome to "MLSecOps Connect: Ask the Experts," an educational live stream series from the MLSecOps Community where attendees have the opportunity to hear their own questions answered by a variety of insightful guest speakers.

This is a recording of the session we held on October 17, 2024 with Johann Rehberger. During this session, Johann answered questions from the MLSecOps Community related to securing AI and machine learning (ML) systems, focusing on red teaming and attack strategies.

Explore with us as Johann answers:

- The big question we all want to know: how does Johann define the term “AI Red Teaming;” a term that's been highly debated within the industry recently?
- How can "traditional" red teamers & penetration testers adapt some of their current processes to the world of cybersecurity for AI/ML?
- How are LLMs uniquely challenging to red teamers, compared to conventional AI/ML models? Are there specific red teaming strategies you recommend for LLMs?
- Can you walk us through some of the more creative or less-discussed attack vectors that you've encountered while testing ML systems/LLMs?
- Do you have any predictions about how the threat of prompt injection will evolve as LLMs become more widely adopted?
- Since prompt filters don't work well on semantic attacks or epistemological attacks on models, What are ways to deal with these types of Zero Day threats?
- Have you seen Homoglyphs used in the wild or used Homoglyphs in your research to test limits?
- Have you noticed any advancements in adversarial attacks recently? If so, how can we better prepare for them?
- How would you (comparatively) view the frequency of tests against models as opposed to surrounding systems (for example RAG architectures, ...)?
- What are the most common vulnerabilities you find in AI and ML systems?
- In your experience, how frequently do attacks designed for one ML model successfully transfer to other models in the same environment? Any related precautions you’d recommend that organizations take?
- What kind of assessments have you already done?
- What monitoring strategy do you recommend?
- Is it possible to have a reliable real-time monitoring strategy at a reasonable cost?
- How do you carry out the evaluations for this?
- How do you feel about assessing AI risks (models and systems) with existing methods like CVSS?
- In security, firewalls have been known to have lots of false alarms, how do you see the AI guardrails/firewall working in the case of modern day agentic AI applications, where the real attacks are actually chained rather than a single-point prompt injection?
- What resources have you used to progress in this field/what resources would you recommend to the audience?
- Plus much more!

Thanks for joining us!