NIST has published a report on adversarial machine learning attacks and mitigations, and cautioned that there is no silver bullet for these types of threats.
Adversarial machine learning, or AML, involves extracting information about the characteristics and behavior of a machine learning system, and manipulating inputs in order to obtain a desired outcome.
NIST has published guidance documenting the various types of attacks that can be used to target artificial intelligence systems, warning AI developers and users that there is currently no foolproof method for protecting such systems.
The agency has encouraged the community to attempt to find better defenses.
The report, titled 'Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations', covers both predictive and generative AI. The former focuses on creating new content, while the latter uses historical data to forecast future outcomes.
NIST's report, authored in collaboration with representatives of Northeastern University and Robust Intelligence Inc, focuses on four main types of attacks: evasion, poisoning, privacy, and abuse.
In the case of evasion attacks, which involve altering an input to change the system's response, NIST provides an attack on autonomous vehicles as an example, such as creating confusing lane markings that could cause a car to veer off the road. In a poisoning attack, the attacker attempts to introduce corrupted data during the AI's training.
Getting a chatbot to use inappropriate language by planting numerous instances of such language into conversation records in an effort to get the AI to believe that it's common parlance.
Attackers can also attempt to compromise legitimate training data sources in what NIST describes as abuse attacks.
In privacy attacks, threat actors attempt to obtain valuable data about the AI or its training data by asking the chatbot numerous questions and using the provided answers to reverse engineer the model and find weaknesses.
This Cyber News was published on www.securityweek.com. Publication date: Mon, 08 Jan 2024 14:43:05 +0000