DeepSeek Data Leak - 12,000 Hardcoded Live API keys and Passwords Exposed

A recent analysis uncovered 11,908 live DeepSeek API keys, passwords, and authentication tokens embedded in publicly scraped web data. According to cybersecurity firm Truffle Security, the study highlights how AI models trained on unfiltered internet snapshots risk internalizing and potentially reproducing insecure coding patterns.

Truffle Security scanned 400 terabytes of Common Crawl’s December 2024 dataset, comprising 2.67 billion web pages from 47.5 million hosts. The dataset, stored in 90,000 WARC files, preserves raw HTML, JavaScript, and server responses from crawled sites. To process the archive, Truffle Security deployed a 20-node AWS cluster, splitting files using awk and scanning each segment with TruffleHog’s verification engine. The tool differentiated live secrets (those that authenticated against their services) from inert strings, a critical step given that LLMs cannot discern valid credentials during training.

Notably, the dataset included high-risk exposures like AWS root keys in front-end HTML and 17 unique Slack webhooks hardcoded into a single webpage’s chat feature. Truffle Security warns that developers who reuse API keys across client projects face heightened risks. The team prioritized ethical disclosure by collaborating with vendors like Mailchimp to revoke thousands of keys, avoiding spam-like outreach to individual website owners.

The study underscores a growing dilemma: LLMs trained on publicly accessible data inherit its security flaws. While models like DeepSeek utilize additional safeguards (fine-tuning, alignment techniques, and prompt constraints), the prevalence of hardcoded secrets in training corpora risks normalizing unsafe practices. Non-functional credentials (e.g., placeholder tokens) contribute to this issue, as LLMs cannot contextually evaluate their validity during code generation. The findings follow earlier revelations that LLMs frequently suggest hardcoding credentials in codebases, raising questions about the role of training data in reinforcing these behaviors.

Suggested mitigations include integrating security guardrails into AI coding tools via platforms like GitHub Copilot’s Custom Instructions, which can enforce policies against hardcoding secrets; expanding secret-scanning programs to include archived web data, as historical leaks resurface in training datasets; and adopting Constitutional AI techniques to align models with security best practices, reducing inadvertent exposure of sensitive patterns.

With LLMs increasingly shaping software development, securing their training data is no longer optional; it is foundational to building a safer digital future.

Gurubaran is a co-founder of Cyber Security News and GBHackers On Security. He has 10+ years of experience as a Security Consultant, Editor, and Analyst in cybersecurity, technology, and communications.
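
As a rough illustration of the scanning step described in the article, the sketch below searches a chunk of page text for strings shaped like credentials and flags obvious placeholders. It is not TruffleHog's detection logic: the three regular expressions are simplified approximations of commonly documented key formats, and the names `CANDIDATE_PATTERNS`, `PLACEHOLDER_HINTS`, and `find_candidate_secrets` are hypothetical. It also stops short of the verification step the article describes, since confirming a key is live means authenticating against the issuing service.

```python
import re

# Illustrative patterns only: rough approximations of well-known key shapes,
# not the detectors TruffleHog actually ships.
CANDIDATE_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "slack_webhook": re.compile(
        r"https://hooks\.slack\.com/services/T[A-Z0-9]+/B[A-Z0-9]+/[A-Za-z0-9]+"
    ),
    "mailchimp_api_key": re.compile(r"\b[0-9a-f]{32}-us[0-9]{1,2}\b"),
}

# Substrings that suggest a documentation placeholder rather than a real key.
# This is a heuristic assumption, not a reliable test of validity.
PLACEHOLDER_HINTS = ("EXAMPLE", "XXXX")


def find_candidate_secrets(text):
    """Return (kind, value, looks_like_placeholder) for each pattern match in text."""
    findings = []
    for kind, pattern in CANDIDATE_PATTERNS.items():
        for match in pattern.finditer(text):
            value = match.group(0)
            is_placeholder = any(hint in value.upper() for hint in PLACEHOLDER_HINTS)
            findings.append((kind, value, is_placeholder))
    return findings


if __name__ == "__main__":
    # The key below is the placeholder value used in AWS documentation.
    page = '<script>var key = "AKIAIOSFODNN7EXAMPLE";</script>'
    for kind, value, is_placeholder in find_candidate_secrets(page):
        status = "placeholder" if is_placeholder else "needs verification"
        print(kind, value, status)
```

A pattern match on its own says nothing about whether a credential is live, which is why the article treats verification as the critical step. The placeholder flag here is only a cheap pre-filter.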

This Cyber News was published on cybersecuritynews.com. Publication date: Fri, 28 Feb 2025 03:40:22 +0000


