DeepSeek Data Leak - 12,000 Hardcoded Live API keys and Passwords Exposed

According to cybersecurity firm Truffle Security, the study highlights how AI models trained on unfiltered internet snapshots risk internalizing and potentially reproducing insecure coding patterns. The tool differentiated live secrets (authenticated against their services) from inert strings—a critical step given that LLMs cannot discern valid credentials during training. The study underscores a growing dilemma: LLMs trained on publicly accessible data inherit its security flaws. While models like DeepSeek utilize additional safeguards fine-tuning, alignment techniques, and prompt constraints—the prevalence of hardcoded secrets in training corpora risks normalizing unsafe practices. The findings follow earlier revelations that LLMs frequently suggest hardcoding credentials in codebases, raising questions about the role of training data in reinforcing these behaviors. Truffle Security scanned 400 terabytes of Common Crawl’s December 2024 dataset, comprising 2.67 billion web pages from 47.5 million hosts. Truffle Security warns that developers who reuse API keys across client projects face heightened risks. Cyber Security News is a Dedicated News Platform For Cyber News, Cyber Attack News, Hacking News & Vulnerability Analysis. Truffle Security deployed a 20-node AWS cluster to process the archive, splitting files using awk and scanning each segment with TruffleHog’s verification engine. Integrating security guardrails into AI coding tools via platforms like GitHub Copilot’s Custom Instructions, which can enforce policies against hardcoding secrets. Adopting Constitutional AI techniques to align models with security best practices, reducing inadvertent exposure of sensitive patterns. With LLMs increasingly shaping software development, securing their training data is no longer optional—it’s foundational to building a safer digital future. A recent analysis uncovered 11,908 live DeepSeek API keys, passwords, and authentication tokens embedded in publicly scraped web data. Gurubaran is a co-founder of Cyber Security News and GBHackers On Security. Notably, the dataset included high-risk exposures like AWS root keys in front-end HTML and 17 unique Slack webhooks hardcoded into a single webpage’s chat feature. Despite these challenges, the team prioritized ethical disclosure by collaborating with vendors like Mailchimp to revoke thousands of keys, avoiding spam-like outreach to individual website owners. Expanding secret-scanning programs to include archived web data as historical leaks resurface in training datasets. Common Crawl’s dataset, stored in 90,000 WARC files, preserves raw HTML, JavaScript, and server responses from crawled sites. Non-functional credentials (e.g., placeholder tokens) contribute to this issue, as LLMs cannot contextually evaluate their validity during code generation. He has 10+ years of experience as a Security Consultant, Editor, and Analyst in cybersecurity, technology, and communications.

This Cyber News was published on cybersecuritynews.com. Publication date: Fri, 28 Feb 2025 03:40:22 +0000


Cyber News related to DeepSeek Data Leak - 12,000 Hardcoded Live API keys and Passwords Exposed

How to perform a proof of concept for automated discovery using Amazon Macie | AWS Security Blog - After reviewing the managed data identifiers provided by Macie and creating the custom data identifiers needed for your POC, it’s time to stage data sets that will help demonstrate the capabilities of these identifiers and better understand how ...
4 months ago Aws.amazon.com
DeepSeek Data Leak - 12,000 Hardcoded Live API keys and Passwords Exposed - According to cybersecurity firm Truffle Security, the study highlights how AI models trained on unfiltered internet snapshots risk internalizing and potentially reproducing insecure coding patterns. The tool differentiated live secrets (authenticated ...
4 hours ago Cybersecuritynews.com
Threat Actors Exploiting DeepSeek's Popularity To Deploy Malware - To safely navigate AI models like DeepSeek while minimizing phishing and malware risks, users should utilize Criminal IP’s IP analysis service to verify server locations and network security. Cyber attackers have been creating phishing websites ...
2 weeks ago Cybersecuritynews.com
South Korea Confirm DeepSeek Sending Data Chinese ByteDance Servers - The findings follow a technical audit revealing critical security flaws, including unencrypted data transfers, deprecated encryption protocols, and deliberate bypassing of Apple’s App Transport Security (ATS) safeguards. Data Sovereignty Concerns: ...
1 week ago Cybersecuritynews.com
Defining Good: A Strategic Approach to API Risk Reduction - A good API security strategy starts with a well thought out API security posture governance program that spans from design to deployment. That standard, if communicated and enforced effectively, will not only positively affect how a developer designs ...
1 year ago Securityboulevard.com
Salt Security Delivers API Posture Governance Engine - PRESS RELEASE. PALO ALTO, Calif., Jan. 17, 2024 /PRNewswire/ - Salt Security, the leading API security company, today announced multiple advancements in discovery, posture management and AI-based threat protection to the industry leading Salt ...
1 year ago Darkreading.com
CVE-2023-38291 - An issue was discovered in a third-party component related to ro.boot.wifimacaddr, shipped on devices from multiple device manufacturers. Various software builds for the following TCL devices (30Z and 10L) and Motorola devices (Moto G Pure and Moto G ...
10 months ago
Imperva Named an Overall Leader in the KuppingerCole Leadership Compass: API Security and Management Report - We're thrilled to share that Imperva has achieved the prestigious status of Overall Leader in the KuppingerCole Leadership Compass: API Security and Management report. A notable achievement is being recognized as one of the few non-gateway-first ...
1 year ago Imperva.com
Enzoic for AD Lite Data Shows Increase in Crucial Risk Factors - The 2023 data from Enzoic for Active Directory Lite data from 2023 offers a revealing glimpse into the current state of cybersecurity, highlighting a significant increase in risk factors that lead to data breaches. The free password auditor has been ...
1 year ago Securityboulevard.com
San Francisco Police's Live Surveillance Yields Almost 200 Hours of Spying-Including of Music Festivals - A new report reveals that in just three months, from July 1 to September 30, 2023, the San Francisco Police Department racked up 193 hours and 19 minutes of live access to non-city surveillance cameras. That means for the equivalent of 8 days, police ...
1 year ago Eff.org
That time I broke into an API and became a billionaire - This included an internal API with a dependency on a third-party banking API. We'll get to the banking API later in this story. That's all thanks to developers embracing agile development, microservices, and API gateway redirection that exposed ...
1 year ago Securityboulevard.com
Unified API Protection - A massive segment of organizations' digital footprint today is built around internal and external APIs. As more IT leaders realize and acknowledge the size of APIs' influence, it's become clear that new methods are needed to secure those APIs. While ...
2 years ago Cequence.ai
Fake Ledger Live app in Microsoft Store steals $768,000 in crypto - Microsoft has recently removed from its store a fraudulent Ledger Live app for cryptocurrency management after multiple users lost at least $768,000 worth of cryptocurrency assets. Published with the name Ledger Live Web3, the fake application ...
1 year ago Bleepingcomputer.com
CVE-2023-38297 - An issue was discovered in a third-party com.factory.mmigroup component, shipped on devices from multiple device manufacturers. Certain software builds for various Android devices contain a vulnerable pre-installed app with a package name of ...
10 months ago
Trello API abused to link email addresses to 15 million accounts - An exposed Trello API allows linking private email addresses with Trello accounts, enabling the creation of millions of data profiles containing both public and private information. Trello is an online project management tool owned by Atlassian that ...
1 year ago Bleepingcomputer.com
Hugging Face API tokens exposed, major projects vulnerable The Register - The API tokens of tech giants Meta, Microsoft, Google, VMware, and more have been found exposed on Hugging Face, opening them up to potential supply chain attacks. Researchers at Lasso Security found more than 1,500 exposed API tokens on the open ...
1 year ago Go.theregister.com
CVE-2023-38298 - Various software builds for the following TCL devices (30Z, A3X, 20XE, 10L) leak the device IMEI to a system property that can be accessed by any local app on the device without any permissions or special privileges. Google restricted third-party ...
10 months ago
The most popular passwords of 2023 are easy to guess and crack - Each year, analysts at various Internet security companies release lists of the most used passwords. ADVERTISEMENT. The passwords that are on these lists may act as a warning for any Internet and electronic device user. Some common passwords have ...
1 year ago Ghacks.net
CVE-2023-38301 - An issue was discovered in a third-party component related to vendor.gsm.serial, shipped on devices from multiple device manufacturers. Various software builds for the BLU View 2, Boost Mobile Celero 5G, Sharp Rouvo V, Motorola Moto G Pure, Motorola ...
10 months ago
Exposed Hugging Face API tokens jeopardized GenAI models - Lasso Security researchers discovered 1,681 Hugging Face API tokens exposed in code repositories, which left vendors such as Google, Meta, Microsoft and VMware open to potential supply chain attacks. In a blog post published Monday, Lasso Security ...
1 year ago Techtarget.com
Over 12 million auth secrets and keys leaked on GitHub in 2023 - GitHub users accidentally exposed 12.8 million authentication and sensitive secrets in over 3 million public repositories during 2023, with the vast majority remaining valid after five days. The exposed secrets include account passwords, API keys, ...
11 months ago Bleepingcomputer.com
7 Essential Practices for Secure API Development - The necessity for API security cannot be overstated. Authentication and Authorization Authentication and authorization form the cornerstone of secure API interactions. In the world of API security, managing identities accurately ensures that only ...
11 months ago Feeds.dzone.com
What's Coming to Cisco Live Europe 2024 for the Data Center Developer? - In just a week or so, Cisco Live EMEA, 2024 will be ready to sizzle at the RAI Amsterdam. From a Cisco Cloud Networking standpoint, Cisco Nexus Dashboard, Cisco ACI, and Nexus 9000 Series switches are showing up in a big way. Read on to learn what ...
1 year ago Feedpress.me
DeepSeek Unveils FlashMLA, A Decoding Kernel That’s Make Things Blazingly Fast - DeepSeek has launched FlashMLA, a groundbreaking Multi-head Latent Attention (MLA) decoding kernel optimized for NVIDIA’s Hopper GPU architecture, marking the first major release of its Open Source Week initiative. This innovative tool achieves ...
4 days ago Cybersecuritynews.com
What do CISOs need to know about API security in 2024? - According to Postman's 2023 State of the API Report, roughly 66% of participants indicated that their APIs contribute to generating revenue. A recent ESG survey on API security showed that 92% of organisations using APIs have experienced a breach in ...
1 year ago Cybersecurity-insiders.com

Cyber Trends (last 7 days)


Trending Cyber News (last 7 days)