In the rush to build AI apps, don't leave security behind

There are countless models, libraries, algorithms, pre-built tools, and packages to play with, and progress is relentless.
You'll typically glue together libraries, packages, training data, models, and custom source code to perform inference tasks.
Code components available from public repositories can contain hidden backdoors or data exfiltrators, and pre-built models and datasets can be poisoned to cause apps to behave unexpectedly or inappropriately.
Some models can contain malware that is executed when their contents are deserialized unsafely.
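If that sounds abstract, a minimal Python sketch shows the underlying mechanism; nothing here is specific to any particular model hub, and the payload is a harmless stand-in:

```python
import os
import pickle

# Minimal illustration of why Pickle is unsafe: __reduce__ lets any
# pickled object tell the unpickler to call an arbitrary function.
class NotAModel:
    def __reduce__(self):
        # Harmless stand-in payload; a real attack could run anything here.
        return (os.system, ("echo payload executed on load",))

blob = pickle.dumps(NotAModel())
pickle.loads(blob)  # Runs the payload; no model weights required.
```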
Bad packages could compromise developers' workstations, leading to damaging intrusions into corporate networks, and tampered-with models and training datasets could cause applications to wrongly classify things, offend users, and so on.
Backdoored or malware-spiked libraries and models, if incorporated into shipped software, could leave users of those apps open to attack as well.
The AI supply chain has numerous points of entry for criminals, who can use things like typosquatting to trick developers into using malicious copies of otherwise legit libraries, allowing the crooks to steal sensitive data and corporate credentials, hijack servers running the code, and more, it's argued.
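On the typosquatting front, one cheap, partial defence is to screen declared dependencies for near-miss names against an allowlist of packages you actually intend to use. A rough sketch, with an illustrative allowlist and similarity threshold rather than anything from a real tool:

```python
from difflib import SequenceMatcher

# Illustrative allowlist; a real one would come from your org's approved deps.
KNOWN_GOOD = {"requests", "numpy", "torch", "transformers"}

def typosquat_suspects(dependencies):
    """Flag names suspiciously close to, but not matching, known packages."""
    hits = []
    for dep in dependencies:
        for good in KNOWN_GOOD:
            score = SequenceMatcher(None, dep, good).ratio()
            if dep != good and score > 0.85:  # threshold chosen for illustration
                hits.append((dep, good, round(score, 2)))
    return hits

print(typosquat_suspects(["requets", "numpy", "trasformers"]))
# [('requets', 'requests', 0.93), ('trasformers', 'transformers', 0.96)]
```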
To illustrate the potential danger, HiddenLayer the other week highlighted what it strongly believes is a security issue with an online service provided by Hugging Face that converts models in the unsafe Pickle format to the more secure Safetensors format, also developed by Hugging Face.
Pickle models can contain malware and other arbitrary code that could be silently and unexpectedly executed when deserialized, which is not great.
Safetensors was created as a safer alternative: Models using that format should not end up running embedded code when deserialized.
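To see the difference in practice, here's a short sketch using the safetensors library's PyTorch helpers (the file name is just an example): the format stores a JSON header describing shapes and dtypes plus the raw tensor bytes, so loading is pure data parsing.

```python
import torch
from safetensors.torch import save_file, load_file

# Safetensors holds tensor metadata in a JSON header and the weights as
# raw bytes, so there is no code path to execute during loading.
weights = {"linear.weight": torch.randn(4, 4), "linear.bias": torch.zeros(4)}
save_file(weights, "example.safetensors")

restored = load_file("example.safetensors")  # Parses data; runs nothing.
print(restored["linear.weight"].shape)  # torch.Size([4, 4])
```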
For those who don't know, Hugging Face hosts hundreds of thousands of neural network models, datasets, and bits of code developers can download and use with just a few clicks or commands.
The Safetensors converter runs on Hugging Face infrastructure, and can be instructed to convert a PyTorch Pickle model hosted by Hugging Face to a copy in the Safetensors format.
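The catch is that any such conversion has to deserialize the Pickle file before it can re-save the tensors. A minimal sketch of the core steps a converter of this kind performs, with hypothetical local file names:

```python
import torch
from safetensors.torch import save_file

# Step 1 is the dangerous one: torch.load() unpickles the checkpoint, and
# any code embedded in it executes here. On recent PyTorch versions,
# weights_only=True refuses to unpickle arbitrary objects and mitigates this.
state_dict = torch.load("pytorch_model.bin", map_location="cpu")

# Step 2 is safe: write the tensors back out in the Safetensors format.
save_file({name: t.contiguous() for name, t in state_dict.items()},
          "model.safetensors")
```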
HiddenLayer researchers said they found they could submit a conversion request for a malicious Pickle model containing arbitrary code, and during the transformation process, that code would be executed on Hugging Face's systems, allowing someone to start messing with the converter bot and its users.
We're told the converter bot's credentials could be accessed and leaked by code stashed in a Pickle model, allowing someone to masquerade as the bot and open pull requests for changes to other repositories.
This is more than a theoretical threat: Devops shop JFrog said it found malicious code hiding in 100 models hosted on Hugging Face.
There are, in truth, various ways to hide harmful payloads in model files that, depending on the file format, are executed when the neural networks are loaded and parsed, allowing miscreants to gain access to people's machines.
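Short of detonating untrusted files in a sandbox, one cheap screen for the Pickle case is to disassemble the stream without executing it, which Python's standard-library pickletools supports. The opcode set below is a heuristic, not a complete scanner:

```python
import pickle
import pickletools

# Opcodes that import or invoke callables; benign weight dumps made of
# plain dicts, lists, and numbers don't need them. Heuristic, not exhaustive.
SUSPECT_OPS = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ"}

def suspicious_opcodes(blob: bytes):
    """Disassemble without executing and return any suspect opcode names."""
    return sorted({op.name for op, arg, pos in pickletools.genops(blob)
                   if op.name in SUSPECT_OPS})

print(suspicious_opcodes(pickle.dumps({"w": [0.1, 0.2]})))  # [] -> plain data
```

Note that legitimate PyTorch checkpoints do use some of these opcodes to rebuild tensor objects, which is why production scanners work from allowlists of known-safe imports rather than a simple blocklist like this.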
HiddenLayer's Tom Bonner also warned that code-suggesting models trained on older source can reproduce flaws that were later patched: those security updates may not have made their way into the datasets used to train large language models to program, he lamented.
Bonner urged the AI community to start implementing supply-chain security practices, such as requiring developers to digitally prove they are who they say they are when making changes to public code repositories, which would reassure users that new versions of things were produced by legitimate developers rather than being malicious tampering.
Trying to beef up security in the AI supply chain is tricky, and with so many tools and models being built and released, it's difficult to keep up.

