Google Researchers Find ChatGPT Queries Collect Personal Data

The LLMs are evolving rapidly with continuous advancements in their research and applications. Recently, cybersecurity researchers at Google discovered how threat actors can exploit ChatGPT queries to collect personal data. StorageGuard scans, detects, and fixes security misconfigurations and vulnerabilities across hundreds of storage and backup devices. Cybersecurity analysts developed a scalable method that detects memorization in trillions of tokens, analyzing open-source and semi-open models. Researchers identified that the larger and more capable models are vulnerable to data extraction attacks. GPT-3.5-turbo shows minimal memorization due to alignment as a helpful chat assistant. Using a new prompting strategy, the model diverges from chatbot-style responses, resembling a base language model. Researchers test its output against a nine-terabyte web-scale dataset, recovering over ten thousand training examples at a $200 query cost, with the potential for extracting 10× more data. Security analysts assess past extraction attacks in a controlled setting, focusing on open-source models with publicly available training data. 's method, they downloaded 108 bytes from Wikipedia, generating prompts by sampling continuous 5-token blocks. Unlike prior methods, they directly query the model's open-source training data to evaluate attack efficacy, eliminating the need for manual internet searches. Researchers tested their attack on 9 open-source models tailored for scientific research, providing access to their complete training, pipeline, and dataset for study. Semi-closed models have downloadable parameters but undisclosed training datasets and algorithms. Despite generating outputs similarly, establishing 'ground truth' for extractable memorization requires experts due to inaccessible training datasets. While extracting the data from ChatGPT, researchers found two major challenges, and here below, we have mentioned those challenges:-. Researchers extract training data from ChatGPT through a divergent attack, but it lacks generalizability to other models. Despite limitations in testing for memorization, they use known samples from the extracted training set to measure discoverable memorization. For the 1,000 longest memorized examples, they prompt ChatGPT with the first N−50 tokens and generate a 50-token completion to assess discoverable memorization. ChatGPT is highly susceptible to data extraction attacks due to over-training for extreme-scale, high-speed inference. The trend of over-training on vast amounts of data poses a trade-off between privacy and inference efficiency. Speculation arises about ChatGPT's multiple-epoch training, potentially amplifying memorization and allowing easy extraction of training data. Experience how StorageGuard eliminates the security blind spots in your storage systems by trying a 14-day free trial.

This Cyber News was published on cybersecuritynews.com. Publication date: Thu, 30 Nov 2023 21:55:08 +0000

Cyber News related to Google Researchers Find ChatGPT Queries Collect Personal Data

Researchers Uncover Simple Technique to Extract ChatGPT Training Data - Can getting ChatGPT to repeat the same word over and over again cause it to regurgitate large amounts of its training data, including personally identifiable information and other data scraped from the Web? The answer is an emphatic yes, according to ...
1 year ago Darkreading.com

XSS Marks the Spot: Digging Up Vulnerabilities in ChatGPT - With its widespread use among businesses and individual users, ChatGPT is a prime target for attackers looking to access sensitive information. In this blog post, I'll walk you through my discovery of two cross-site scripting vulnerabilities in ...
1 year ago Imperva.com

How to perform a proof of concept for automated discovery using Amazon Macie | AWS Security Blog - After reviewing the managed data identifiers provided by Macie and creating the custom data identifiers needed for your POC, it’s time to stage data sets that will help demonstrate the capabilities of these identifiers and better understand how ...
10 months ago Aws.amazon.com

Google Researchers' Attack Prompts ChatGPT to Reveal Its Training Data - A team of researchers primarily from Google's DeepMind systematically convinced ChatGPT to reveal snippets of the data it was trained on using a new type of attack prompt which asked a production model of the chatbot to repeat specific words forever. ...
1 year ago 404media.co

Privacy Policy 2024 - Personal information is any information that identifies you or would enable someone to contact you, which may include your name, email address, phone number and other non-public information that is associated with such information. Information We ...
1 year ago Bitsight.com

How enterprises are using gen AI to protect against ChatGPT leaks - ChatGPT is the new DNA of shadow IT, exposing organizations to new risks no one anticipated. Enterprise workers are gaining a 40% performance boost thanks to ChatGPT based on a recent Harvard University study. A second study from MIT discovered that ...
1 year ago Venturebeat.com

ChatGPT Clone Apps Collecting Personal Data on iOS, Play Store - On Android devices, one of the apps analyzed by researchers has more than 100,000 downloads, tracks, and shares location data with ByteDance and Amazon, etc. ChatGPT, the AI software, has already taken the Internet by storm, and that is why ...
2 years ago Hackread.com Everest

Google Researchers Find ChatGPT Queries Collect Personal Data - The LLMs are evolving rapidly with continuous advancements in their research and applications. Recently, cybersecurity researchers at Google discovered how threat actors can exploit ChatGPT queries to collect personal data. StorageGuard scans, ...
1 year ago Cybersecuritynews.com

ChatGPT Extensions Could be Exploited to Steal Data and Sensitive Information - API security professionals Salt Security have released new threat research from Salt Labs highlighting critical security flaws within ChatGPT plugins, presenting a new risk for enterprises. Plugins provide AI chatbots like ChatGPT access and ...
1 year ago Itsecurityguru.org

How Are Security Professionals Managing the Good, The Bad and The Ugly of ChatGPT? - ChatGPT has emerged as a shining light in this regard. Already we're seeing the platform being integrated into corporate systems, supporting in areas such as customer success or technical support. The bad: The risks surrounding ChatGPT. Of course, ...
1 year ago Cyberdefensemagazine.com

Google to Announce Chat-GPT Rival On February 8 Event - There seems to be a lot of consternation on Google's part at the prospect of a showdown with ChatGPT on the February 8 event. The search giant has been making moves that suggest it is preparing to enter the market for large language models, where ...
2 years ago Cybersecuritynews.com

Are you sure you want to share that with ChatGPT? How Metomic helps stop data leaks - Open AI's ChatGPT is one of the most powerful tools to come along in a lifetime, set to revolutionize the way many of us work. Workers aren't content to wait until organizations work this question out, however: Many are already using ChatGPT and ...
1 year ago Venturebeat.com

Data De-Identification: Balancing Privacy, Efficacy & Cybersecurity - COMMENTARY. Global data privacy laws were created to address growing consumer concerns about individual privacy. These laws include several best practices for businesses about storing and using consumers' personal data so that the exposure of ...
1 year ago Darkreading.com

Locking Down ChatGPT: A User's Guide to Strengthening Account Security - OpenAI officials said that the user who reported his ChatGPT history was a victim of a compromised ChatGPT account, which resulted in the unauthorized logins. OpenAI has confirmed that the unauthorized logins originate from Sri Lanka, according to an ...
1 year ago Cysecurity.news

Researchers Claim Design Flaw in Google Workspace Puts Organizations at Risk - Google is disputing a security vendor's report this week about an apparent design weakness in Google Workspace that puts users at risk of data theft and other potential security issues. According to Hunters Security, a flaw in Google Workspace's ...
1 year ago Darkreading.com Hunters

Foreign states already using ChatGPT maliciously, UK IT leaders believe - Most UK IT leaders believe that foreign states are already using the ChatGPT chatbot for malicious purposes against other nations. That's according to a new study from BlackBerry, which surveyed 500 UK IT decision makers revealing that, while 60% of ...
2 years ago Csoonline.com

The Emergence of AI In the Enterprise: Know the Security Risks - As is often the case with any new, emerging technology, using AI comes with security risks, and it's essential to understand them and impose the proper guardrails around them to protect company, customer, and employee data. There are real, tangible ...
1 year ago Cyberdefensemagazine.com

Google Cloud Next 2024: New Data Center Chip Joins Ecosystem - Google Cloud announced a new enterprise subscription for Chrome and a bevy of generative AI add-ons for Google Workspace during the Cloud Next '24 conference, held in Las Vegas from April 9 - 11. Overall, Google Cloud is putting its Gemini generative ...
1 year ago Techrepublic.com

Google DeepMind Researchers Uncover ChatGPT Vulnerabilities - Scientists at Google DeepMind, leading a research team, have adeptly utilized a cunning approach to uncover phone numbers and email addresses via OpenAI's ChatGPT, according to a report from 404 Media. This discovery prompts apprehensions regarding ...
1 year ago Cysecurity.news

ChatGPT Spills Secrets in Novel PoC Attack - A team of researchers from Google DeepMind, Open AI, ETH Zurich, McGill University, and the University of Washington have developed a new attack for extracting key architectural information from proprietary large language models such as ChatGPT and ...
1 year ago Darkreading.com

Ahead of Regulatory Wave: Google's Pivotal Announcement for EU Users - Users in the European Union will be able to prevent Google services from sharing their data across different services if they do not wish to share their data. Google and five other large technology companies must comply with the EU's Digital Markets ...
1 year ago Cysecurity.news

ChatGPT Conversations are Being Indexed by Search Engines! - On August 1, 2025, the company’s Chief Information Security Officer, Dane Stuckey, announced the removal of the discoverable feature: “We just removed a feature from ChatGPT that allowed users to make their conversations discoverable by ...
1 week ago Cybersecuritynews.com

OpenAI rolls out imperfect fix for ChatGPT data leak flaw - OpenAI has mitigated a data exfiltration bug in ChatGPT that could potentially leak conversation details to an external URL. According to the researcher who discovered the flaw, the mitigation isn't perfect, so attackers can still exploit it under ...
1 year ago Bleepingcomputer.com

New ChatGPT's Premium Features Subscription Phishing Attack Steal Logins - A typical message includes the subject line “Action Required: Secure Continued Access to ChatGPT with a $24 Monthly Subscription” and spoofs the sender address as noreply@chatgpt-auth[.]net—a domain registered through PrivacyGuardian.org just ...
5 months ago Cybersecuritynews.com

Denmark orders schools to stop sending student data to Google - The Danish data protection authority has issued an injunction regarding student data being funneled to Google through the use of Chromebooks and Google Workspace services in the country's schools. The matter was brought to the agency's attention ...
1 year ago Bleepingcomputer.com