Security flaws found in both Hugging Face and GitHub repositories exposed almost 1,700 API tokens, opening AI developers up to supply chain and other attacks and putting a brighter spotlight on the need to ensure that security keeps pace with the accelerating innovation in AI and large language models.
In a report today, researchers with startup Lasso Security found more than 1,500 exposed API tokens on the Hugging Face platform - essentially GitHub for the AI set - that allowed them to access the accounts of 723 organizations, including such companies as Microsoft, Google, Meta, and VMware.
Among those accounts, 655 users' tokens had write permissions - 77 of them to different organizations - granting the researchers full control over the repositories of prominent companies.
Along with the supply-chain threat the exposed tokens represented, they also opened up the possibility of bad actors poisoning training data.
Lasso researchers obtained access to 14 datasets that see hundreds of thousands of downloads a month.
The researchers could have stolen more than 10,000 private AI models that were linked to more than 2,500 datasets.
Lanyado told Security Boulevard that he expected the company's research would return some vulnerabilities, but the results surprised him.
"We were able to access nearly all of the top technology companies' tokens and gain full control over some of them," he said.
Major companies like Meta, Microsoft, and Google take pride in their security capabilities but still were unaware of the significant third-party risk, he added.
"This awareness ensures that these technological strides align with the company's business [and] also security objectives."
The company focuses on cybersecurity for LLMs. A key Hugging Face asset is its open-source Transformers library, he wrote, which holds more than 500,000 AI models and 250,000 datasets, including the Meta-Llama, Bloom, and Pythia models.
The researchers ran into some roadblocks when they started searching for exposed tokens in both Hugging Face and GitHub, but were able to dig deeper through increasingly detailed searches.
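The report doesn't detail the exact queries used, but the general approach of scanning text for token-shaped strings can be sketched in a few lines. This is a minimal illustration assuming Hugging Face's current `hf_`-prefixed user access token format; the regex and `find_candidate_tokens` helper are hypothetical, not the researchers' actual tooling.

```python
import re

# Hugging Face user access tokens currently begin with an "hf_" prefix
# followed by an alphanumeric string; this pattern is an assumption based
# on that format and would miss older token schemes.
TOKEN_PATTERN = re.compile(r"\bhf_[A-Za-z0-9]{30,}\b")

def find_candidate_tokens(text: str) -> list[str]:
    """Return substrings of `text` that look like Hugging Face tokens."""
    return TOKEN_PATTERN.findall(text)

# Example: a hard-coded token accidentally committed to a repository.
sample = 'api_token = "hf_abcdefghijklmnopqrstuvwxyz0123456789"'
print(find_candidate_tokens(sample))
```

Any match found this way is only a candidate; it still has to be validated against the platform, which is where the `whoami` step described next comes in.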
They then used the Hugging Face whoami API to verify each token's validity and retrieve details such as the token's owner, the owner's email and organization memberships, and the token's permissions and privileges, Lanyado wrote.
Hugging Face had announced that org_api tokens were deprecated and had blocked them in its Python library.
While the write functionality didn't work, in some instances the read functionality did, and the researchers were able to download private models with an exposed org_api token, as was the case with Microsoft.
Lasso contacted Hugging Face and all the organizations and users involved after completing the research.
Hugging Face fixed the vulnerability while many of the companies - including Meta, Google, Microsoft, and VMware - revoked the vulnerable tokens and removed the public access token code.
He also said developers should understand that Hugging Face and similar platforms aren't secure enough, so responsibility for security will fall on developers and other users.
They also shouldn't work with hard-coded tokens; that avoids having to verify with every commit that no tokens or sensitive information are being pushed to the repositories.
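The usual alternative to hard-coding is reading the token from the environment at runtime. A minimal sketch, assuming the `HF_TOKEN` variable name that Hugging Face's own tooling conventionally uses (`get_hf_token` is a hypothetical helper):

```python
import os

def get_hf_token() -> str:
    """Read the Hugging Face token from the environment, never from source."""
    token = os.environ.get("HF_TOKEN")  # conventional variable name
    if not token:
        raise RuntimeError("Set the HF_TOKEN environment variable")
    return token

# Simulated value for demonstration only; a real token never appears
# in the source tree or the commit history.
os.environ["HF_TOKEN"] = "hf_example_only"
print(get_hf_token())
```

Because the token never appears in the source tree, nothing sensitive can land in a commit in the first place, which is the point Lanyado was making.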
Lanyado pointed to other reports about security problems with AI, including Nvidia discovering three flaws in LangChain chains and Rezilion finding dangerous workflow patterns in the LLM open-source ecosystem.
This Cyber News was published on securityboulevard.com. Publication date: Mon, 04 Dec 2023 19:13:16 +0000