How AI can be hacked with prompt injection: NIST report

As AI proliferates, so does the discovery and exploitation of AI cybersecurity vulnerabilities.
Prompt injection is one such vulnerability that specifically attacks generative AI. In Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations, NIST defines various adversarial machine learning tactics and cyberattacks, like prompt injection, and advises users on how to mitigate and manage them.
AML tactics extract information about how machine learning systems behave to discover how they can be manipulated.
That information is used to attack AI and its large language models to circumvent security, bypass safeguards and open paths to exploit.
NIST defines two prompt injection attack types: direct and indirect.
With direct prompt injection, a user enters a text prompt that causes the LLM to perform unintended or unauthorized actions.
An indirect prompt injection is when an attacker poisons or degrades the data that an LLM draws from.
One of the best-known direct prompt injection methods is DAN, Do Anything Now, a prompt injection used against ChatGPT. DAN uses roleplay to circumvent moderation filters.
In its first iteration, prompts instructed ChatGPT that it was now DAN. DAN could do anything it wanted and should pretend, for example, to help a nefarious person create and detonate explosives.
Indirect prompt injection, as NIST notes, depends on an attacker being able to provide sources that a generative AI model would ingest, like a PDF, document, web page or even audio files used to generate fake voices.
Indirect prompt injection is widely believed to be generative AI's greatest security flaw, without simple ways to find and fix these attacks.
Explore AI cybersecurity solutions How to stop prompt injection attacks.
These attacks tend to be well hidden, which makes them both effective and hard to stop.
For model creators, NIST suggests ensuring training datasets are carefully curated.
They also suggest training the model on what types of inputs signal a prompt injection attempt and training on how to identify adversarial prompts.
For indirect prompt injection, NIST suggests human involvement to fine-tune models, known as reinforcement learning from human feedback.
RLHF helps models align better with human values that prevent unwanted behaviors.
NIST further suggests using LLM moderators to help detect attacks that don't rely on retrieved sources to execute.
Finally, NIST proposes interpretability-based solutions.
Learn more about how IBM Security delivers AI cybersecurity solutions that strengthen security defenses.


This Cyber News was published on securityintelligence.com. Publication date: Tue, 19 Mar 2024 14:13:05 +0000


Cyber News related to How AI can be hacked with prompt injection: NIST report

How AI can be hacked with prompt injection: NIST report - As AI proliferates, so does the discovery and exploitation of AI cybersecurity vulnerabilities. Prompt injection is one such vulnerability that specifically attacks generative AI. In Adversarial Machine Learning: A Taxonomy and Terminology of Attacks ...
8 months ago Securityintelligence.com
Forget Deepfakes or Phishing: Prompt Injection is GenAI's Biggest Problem - Cybersecurity professionals and technology innovators need to be thinking less about the threats from GenAI and more about the threats to GenAI from attackers who know how to pick apart the design weaknesses and flaws in these systems. Chief among ...
9 months ago Darkreading.com
Accelerating Safe and Secure AI Adoption with ATO for AI: stackArmor Comments on OMB AI Memo - We appreciate the opportunity to comment on the proposed Memo on Agency Use of Artificial Intelligence. Ensuring agencies have access to adequate IT infrastructure,. We base our remarks on our experience helping US Federal agencies transform their ...
11 months ago Securityboulevard.com
CMMC v2.0 vs NIST 800-171: Understanding the Differences - The NIST SP 800-171 lays out the requirements for any non-federal agency that handles controlled unclassified information, or other sensitive federal information. DFARS does not address the CMMC at all but a new clause is currently being drafted for ...
10 months ago Securityboulevard.com
NIST Fortifies Chatbots and Self-Driving Cars Against Digital Threats - In a landmark move, the US National Institute of Standards and Technology has taken a new step in developing strategies to fight against cyber-threats that target AI-powered chatbots and self-driving cars. The Institute released a new paper on ...
10 months ago Infosecurity-magazine.com
What is the NIST Cybersecurity Framework? Definition from SearchSecurity - The NIST Cybersecurity Framework provides guidance on how to manage and reduce IT infrastructure security risk. NIST created the CSF to help private sector organizations in the United States develop a roadmap for critical infrastructure ...
10 months ago Techtarget.com
US SEC's X account hacked to announce fake Bitcoin ETF approval - The X account for the U.S. Securities and Exchange Commission was hacked today to issue a fake announcement on the approval of Bitcoin ETFs on security exchanges. The announcement came this afternoon in a now-deleted tweet from the SEC's hacked X ...
10 months ago Bleepingcomputer.com
NIST: No Silver Bullet Against Adversarial Machine Learning Attacks - NIST has published a report on adversarial machine learning attacks and mitigations, and cautioned that there is no silver bullet for these types of threats. Adversarial machine learning, or AML, involves extracting information about the ...
10 months ago Securityweek.com
OWASP Top 10 for LLM Applications: A Quick Guide - Even still, the expertise and insights provided, including prevention and mitigation techniques, are highly valuable to anyone building or interfacing with LLM applications. Prompt injections are maliciously crafted inputs that lead to an LLM ...
7 months ago Securityboulevard.com
SEC confirms X account was hacked in SIM swapping attack - The U.S. Securities and Exchange Commission confirmed today that its X account was hacked through a SIM-swapping attack on the cell phone number associated with the account. Earlier this month, the SEC's X account was hacked to issue a fake ...
10 months ago Bleepingcomputer.com
What's new in the MSRC Report Abuse Portal and API - The Microsoft Security Response Center has always been at the forefront of addressing cyber threats, privacy issues, and abuse arising from Microsoft Online Services. Building on our commitment, we have introduced several key updates to the Report ...
4 months ago Msrc.microsoft.com
Mandiant's X account hacked by crypto Drainer-as-a-Service gang - The threat actor who took over Mandiant's X social media account used it to share links, redirecting the company's over 123,000 followers to a phishing page to steal cryptocurrency. As Mandiant found during a follow-up investigation into the ...
10 months ago Bleepingcomputer.com
Preparing for Q-Day as NIST nears approval of PQC standards - Q-Day-the day when a cryptographically relevant quantum computer can break most forms of modern encryption-is fast approaching, leaving the complex systems our societies rely on vulnerable to a new wave of cyberattacks. While estimates just a few ...
4 months ago Helpnetsecurity.com
Preparing for Q-Day as NIST nears approval of PQC standards - Q-Day-the day when a cryptographically relevant quantum computer can break most forms of modern encryption-is fast approaching, leaving the complex systems our societies rely on vulnerable to a new wave of cyberattacks. While estimates just a few ...
4 months ago Helpnetsecurity.com
The US National Institute of Standards and Technology Announces the Successful Encryption Algorithm for Securing Internet of Things Data - The National Institute of Standards and Technology (NIST) recently announced that ASCON was the winning bid for its Lightweight Cryptography Program. This program was designed to find the best algorithm to protect small Internet of Things (IoT) ...
1 year ago Bleepingcomputer.com
NIST Confusion Continues as Cyber Pros Complain CVE Uploads Stopped - A recent rise in software vulnerability exploits has come as the US National Vulnerability Database, the world's most comprehensive vulnerability database, experiences its most significant crisis in history. After experiencing a vulnerability ...
6 months ago Infosecurity-magazine.com
Google Cloud Report Spotlights 2024 Cybersecurity Challenges - As the New Year dawns, a cybersecurity report from Google Cloud suggests that while there are many challenges ahead, it will also become simpler for cybersecurity teams to leverage artificial intelligence to better defend IT environments. John ...
10 months ago Securityboulevard.com
How the New NIST 2.0 Guidelines Help Detect SaaS Threats - The SaaS ecosystem has exploded in the six years since the National Institute of Standards and Technology's cybersecurity framework 1.1 was released. Back in 2016-2017, when version 1.1 was initially drafted, SaaS held a small but significant place ...
8 months ago Bleepingcomputer.com
LLMs Open to Manipulation Using Doctored Images, Audio - Such attacks could become a major issue as LLMs become increasingly multimodal or are capable of responding contextually to inputs that combine text, audio, pictures, and even video. Hiding Instructions in Images and Audio At Black Hat Europe 2023 ...
11 months ago Darkreading.com
The impact of prompt injection in LLM agents - This risk is particularly alarming when LLMs are turned into agents that interact directly with the external world, utilizing tools to fetch data or execute actions. Malicious actors can leverage prompt injection techniques to generate unintended and ...
11 months ago Helpnetsecurity.com
Vanta announces new offerings to meet the needs of modern GRC and security leaders - Vanta announced a number of new and upcoming product launches enabling customers to accelerate innovation and strengthen security. The new offerings include advanced Reporting to help security professionals measure the success of their security ...
11 months ago Helpnetsecurity.com
UAC Bypass: 3 Methods Used Malware In Windows 11 in 2024 - User Account Control is one of the security measures introduced by Microsoft to prevent malicious software from executing without the user's knowledge. Modern malware has found effective ways to bypass this barrier and ensure silent deployment on the ...
5 months ago Cybersecuritynews.com
5 Lessons Learned from Windows Remote Desktop Honeypot Report - Recently, the SANS Institute released their annual Windows Remote Desktop Honeypot Report, providing comprehensive insights into the nature of malicious activity in a Windows environment. In order to understand how your own Windows network can be ...
1 year ago Bleepingcomputer.com
Microsoft SFI progress report elicits cautious optimism | TechTarget - "After a year, it looks like Microsoft has made some smart and substantive initial progress in elevating security across the whole organization: investment in security-focused head count, inclusion of security into performance reports across the ...
1 month ago Techtarget.com
CVE-2023-22499 - Deno is a runtime for JavaScript and TypeScript that uses V8 and is built in Rust. Multi-threaded programs were able to spoof interactive permission prompt by rewriting the prompt to suggest that program is waiting on user confirmation to unrelated ...
1 year ago

Latest Cyber News


Cyber Trends (last 7 days)


Trending Cyber News (last 7 days)