There is currently no foolproof method to safeguard AI against misdirection, partly because the datasets needed to train an AI are far too large for humans to monitor and filter effectively.
Computer scientists at the National Institute of Standards and Technology (NIST) and their collaborators have identified these and other vulnerabilities in AI systems, along with mitigation measures for the attacks that target them.
The new report outlines the types of attacks AI systems could face and accompanying mitigation strategies, with the aim of supporting the developer community.
It also classifies them based on various characteristics, including the attacker's goals and objectives, capabilities, and knowledge.
Attackers using evasion techniques try to modify an input to affect how an AI system reacts to it after deployment.
Some examples would be creating confusing lane markings to cause an autonomous car to veer off the road or adding markings to stop signs to cause them to be mistakenly read as speed limit signs.
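To make the idea concrete, the sketch below shows one well-known way such an input modification can be computed, in the style of the fast gradient sign method. The toy classifier, random input, and perturbation budget are illustrative assumptions, not details from the NIST report.

```python
# A minimal evasion sketch in the style of the fast gradient sign method.
# The classifier, input, and perturbation budget are toy placeholders,
# not anything taken from the NIST report.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for a deployed classifier: 2 classes over 8-dimensional inputs.
model = nn.Linear(8, 2)
model.eval()

x = torch.randn(1, 8)            # a benign input
true_label = torch.tensor([0])

# The attacker computes the gradient of the loss with respect to the input...
x_adv = x.clone().requires_grad_(True)
loss = nn.functional.cross_entropy(model(x_adv), true_label)
loss.backward()

# ...then nudges the input in the direction that increases the loss.
epsilon = 0.5                    # perturbation budget, exaggerated for a toy model
perturbed = x + epsilon * x_adv.grad.sign()

print("prediction on original input :", model(x).argmax(dim=1).item())
print("prediction on perturbed input:", model(perturbed).argmax(dim=1).item())
```

The same principle is what makes physical perturbations such as stickers on a stop sign or misleading lane markings effective against deployed vision systems.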
Poisoning attacks take place when corrupted data is injected during the training process.
For example, slipping numerous instances of inappropriate language into the conversation records used for training could trick a chatbot into believing the language is common enough to use in real customer interactions.
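The following sketch illustrates the mechanics with an invented toy "acceptable reply" filter and made-up phrases: repeated, mislabeled attacker records are enough to change what the model treats as normal language. None of it comes from the report itself.

```python
# A toy poisoning sketch: an "acceptable reply" filter is retrained on
# conversation records into which an attacker has slipped repeated
# inappropriate phrases labeled as acceptable. All phrases and labels
# here are invented for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

clean_texts = [
    "happy to help with your order",
    "let me check that for you",
    "thanks for contacting support",
    "that phrasing is offensive",
    "this message contains abuse",
]
clean_labels = [1, 1, 1, 0, 0]   # 1 = acceptable reply, 0 = blocked

# Attacker-controlled records: the same inappropriate phrase, mislabeled
# as acceptable and repeated until it looks prevalent in the training data.
poison_texts = ["you are a worthless customer"] * 20
poison_labels = [1] * 20

texts = clean_texts + poison_texts
labels = clean_labels + poison_labels

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)
poisoned_model = MultinomialNB().fit(X, labels)

probe = vectorizer.transform(["you are a worthless customer"])
print("poisoned model treats the phrase as acceptable:",
      poisoned_model.predict(probe)[0] == 1)
```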
Privacy attacks during deployment are attempts to obtain sensitive information about the AI, or about the data it was trained on, in order to misuse it.
An adversary can pose many legitimate questions to a chatbot and then use the responses to reverse engineer the model, identify its weak spots, or guess at its sources.
An adversary could also add undesirable examples to the online sources the AI draws on, causing it to behave badly, and it can be challenging to get the AI to unlearn those particular examples after the fact.
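A rough sketch of how such query-based reverse engineering could look is below; the "victim" model, the query distribution, and the surrogate are assumptions made for illustration, not details from the report.

```python
# A rough model-extraction sketch: the attacker can only query the "victim"
# model, but by recording answers to many ordinary-looking queries it can
# train a local surrogate and probe that copy offline. The victim, query
# distribution, and surrogate are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Victim model: trained on private data the attacker never sees.
X_private = rng.normal(size=(500, 5))
y_private = (X_private[:, 0] + X_private[:, 1] > 0).astype(int)
victim = LogisticRegression().fit(X_private, y_private)

# Attacker's side: send queries, record the victim's answers.
queries = rng.normal(size=(2000, 5))
answers = victim.predict(queries)

# Train a surrogate purely from query/answer pairs.
surrogate = DecisionTreeClassifier(max_depth=5).fit(queries, answers)
agreement = (surrogate.predict(queries) == answers).mean()
print(f"surrogate matches the victim on {agreement:.0%} of the queries")
```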
In an abuse attack, incorrect information is inserted into a source, such as a webpage or online document, which an AI then absorbs.
Abuse attacks aim to feed the AI false information from a legitimate but compromised source in order to repurpose the AI system away from its intended use.
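The sketch below illustrates why this works for systems that ground their answers in retrieved web content: whatever the source currently says is pasted into the model's context, so a quiet edit to the source redirects the output. The fetch_source and build_prompt helpers and the example URL are hypothetical.

```python
# An illustrative sketch of why corrupting a trusted source redirects a
# retrieval-style assistant: whatever the source currently says is pasted
# into the model's context verbatim. fetch_source, build_prompt, and the
# example URL are hypothetical helpers invented for this sketch.
def fetch_source(documents: dict, url: str) -> str:
    """Stand-in for downloading a page the assistant treats as authoritative."""
    return documents[url]

def build_prompt(question: str, source_text: str) -> str:
    # Retrieved text flows straight into the prompt, trusted as-is.
    return f"Answer using this reference:\n{source_text}\n\nQuestion: {question}"

documents = {"https://example.com/policy": "Refunds are available within 30 days."}
question = "What is the refund window?"
print(build_prompt(question, fetch_source(documents, "https://example.com/policy")))

# The attacker edits the page; the assistant's grounding changes with it.
documents["https://example.com/policy"] = "Refunds are never available."
print(build_prompt(question, fetch_source(documents, "https://example.com/policy")))
```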
Most of these attacks are relatively easy to launch, requiring little to no prior knowledge of the AI system and only limited adversarial capabilities.