Are threat actors, or Malicious Rogue AI, targeting your AI systems to create subverted Rogue AI? Are they targeting your enterprise in general? And are they using your resources, their own, or a proxy whose AI has been subverted? There has been no example to date of attackers installing malicious AI systems in target environments, although it's surely only a matter of time: as organizations begin adopting agentic AI, so will threat actors. Fortunately, only sophisticated actors can currently subvert AI systems for their specific goals, but the fact that they're already checking for access to such systems should be concerning.

MITRE ATLAS extends the ATT&CK framework to AI systems, but it doesn't address Rogue AI directly. However, Prompt Injection, Jailbreak, and Model Poisoning, which are all ATLAS TTPs, can be used to subvert AI systems and thereby create Rogue AI. The truth is that these subverted Rogue AI systems are themselves TTPs: agentic systems can carry out any of the ATT&CK tactics and techniques (e.g., Reconnaissance, Resource Development, Initial Access, ML Model Access, Execution) for any Impact. So while MITRE ATLAS and ATT&CK deal with Subverted Rogues, they do not yet address Malicious Rogue AI.

Rogue AI is related to all of the Top 10 large language model (LLM) risks highlighted by OWASP, except perhaps for LLM10: Model Theft, which signifies "unauthorized access, copying, or exfiltration of proprietary LLM models." There is also no vulnerability associated with "misalignment," i.e., when an AI has been compromised or is behaving in an unintended manner. OWASP does well in the Top 10 at suggesting mitigations for Rogue AI (a subject we'll return to), but it doesn't deal with causality, i.e., whether an attack is intentional or not. Intent is particularly useful in understanding Rogue AI, although it's only covered elsewhere, in the OWASP Security and Governance Checklist.

Intent is key here: there are plenty of ways for accidental Rogue AI to cause harm with no attacker present. Who the risk is caused by can also be helpful in analyzing Rogue AI threats. Humans and AI systems can both accidentally cause Rogue AI, while Malicious Rogues are, by design, intended to attack. Malicious Rogues could theoretically also try to subvert existing AI systems to go rogue, or be designed to produce "offspring," although at present we consider humans to be the main intentional cause of Rogue AI. Accidental risk often stems from a weakness rather than a MITRE ATLAS attack technique or an OWASP vulnerability. MIT divides AI risks into seven key groups and 23 subgroups, with Rogue AI directly addressed in the "AI System Safety, Failures and Limitations" domain.

While there's some good work going on in the security community to better profile these threats, what's missing for Rogue AI is an approach that includes both causality and attack context. Risk models should be updated to take account of the threat from Rogue AI. That means pre- and post-deployment evaluation of systems, along with alignment checks, to catch malicious, subverted, or accidental Rogue AIs. By addressing this gap, we can start to plan for and mitigate Rogue AI risk comprehensively.
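To make the causality-plus-context idea concrete, here is a minimal sketch in Python of how an incident record might capture both who caused a Rogue AI and the attack context around it. All of the names, fields, and categories below are illustrative assumptions for this post, not part of MITRE ATLAS, OWASP, or the MIT taxonomy.

    from dataclasses import dataclass
    from enum import Enum

    class Cause(Enum):
        ACCIDENTAL = "accidental"   # a weakness or mistake, no attacker present
        SUBVERTED = "subverted"     # an existing AI system turned rogue by an attacker
        MALICIOUS = "malicious"     # an AI system installed or designed to attack

    @dataclass
    class RogueAIIncident:
        system: str                 # the AI system involved
        cause: Cause                # who or what caused the rogue behavior
        attacker_present: bool      # attack context: is there an adversary?
        resources_used: str         # "victim", "attacker", or "proxy"
        techniques: list            # ATLAS-style techniques observed, if any

    def triage(incident: RogueAIIncident) -> str:
        """Route an incident based on its cause and attack context."""
        if incident.cause is Cause.ACCIDENTAL:
            # No attacker: treat as a weakness and re-check alignment.
            return "remediate the weakness and re-run alignment checks"
        if incident.cause is Cause.SUBVERTED:
            # Attacker used techniques such as prompt injection or model poisoning.
            return "contain the system and investigate the subversion technique"
        return "treat as an active intrusion by a malicious rogue"

    # Example: a support agent subverted through prompt injection, using victim resources.
    incident = RogueAIIncident(
        system="support-agent",
        cause=Cause.SUBVERTED,
        attacker_present=True,
        resources_used="victim",
        techniques=["prompt injection"],
    )
    print(triage(incident))

The point of the triage function is simply that an accidental rogue with no attacker present is handled as a weakness, while subverted and malicious rogues are handled as intrusions.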
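Similarly, a pre- and post-deployment alignment check can be sketched as a simple deployment gate that compares an agent's declared goal and planned tool calls against an approved policy. Again, this is only a hedged illustration: the policy sets, function name, and tool names are hypothetical, not drawn from any framework cited above.

    # Hypothetical deployment gate; the policy sets and names are illustrative only.
    APPROVED_GOALS = {"customer_support"}
    APPROVED_TOOLS = {"search_docs", "summarize", "create_ticket"}

    def alignment_check(declared_goal, planned_tool_calls):
        """Return a list of findings; an empty list means the check passed."""
        findings = []
        if declared_goal not in APPROVED_GOALS:
            findings.append(f"goal '{declared_goal}' is not approved")
        for tool in planned_tool_calls:
            if tool not in APPROVED_TOOLS:
                findings.append(f"tool '{tool}' is outside the approved set")
        return findings

    # Pre-deployment: evaluate the agent's declared goal and planned behavior.
    print(alignment_check("customer_support", ["search_docs", "summarize"]))

    # Post-deployment: run the same check on observed behavior to catch drift or
    # subversion, e.g. an injected instruction that adds an unapproved tool call.
    print(alignment_check("customer_support", ["search_docs", "exfiltrate_db"]))

Running the same check on observed behavior after deployment is what lets it catch accidental drift as well as subversion, which is why it applies to malicious, subverted, and accidental Rogue AIs alike.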