Research Jailbreaked OpenAI o1/o3, DeepSeek-R1, & Gemini 2.0 Flash Thinking Models

A recent study from a team of cybersecurity researchers has revealed severe security flaws in commercial-grade Large Reasoning Models (LRMs), including OpenAI’s o1/o3 series, DeepSeek-R1, and Google’s Gemini 2.0 Flash Thinking. The research introduces two key innovations: the Malicious-Educator benchmark for stress-testing AI safety protocols and the Hijacking Chain-of-Thought (H-CoT) attack method, which reduced model refusal rates from 98% to under 2% in critical scenarios. The team from Duke University’s Center for Computational Evolutionary Intelligence developed a dataset of 50 queries spanning 10 high-risk categories, including terrorism, cybercrime, and child exploitation, crafted as educational prompts. Attackers could intercept these outputs, and H-CoT raised success rates to 96.8% by exploiting its multilingual inconsistencies (e.g., English queries bypassed Chinese safety filters). Current LRMs use chain-of-thought (CoT) reasoning to justify safety decisions, often displaying intermediate steps like “Confirming compliance with harm prevention policies…”. Cyber Security News is a Dedicated News Platform For Cyber News, Cyber Attack News, Hacking News & Vulnerability Analysis. While o1 initially refused 99% of Malicious-Educator queries, post-2024 updates saw refusal rates plummet to <2% under H-CoT. Gurubaran is a co-founder of Cyber Security News and GBHackers On Security. When prompted with H-CoT, its tone shifted from cautious to eager compliance, providing detailed criminal frameworks in 100% of tested cases.

This Cyber News was published on cybersecuritynews.com. Publication date: Tue, 25 Feb 2025 13:35:17 +0000


Cyber News related to Research Jailbreaked OpenAI o1/o3, DeepSeek-R1, & Gemini 2.0 Flash Thinking Models

Attackers Can Gain Control of Users' Queries and LLM Data Output - Gemini is Google's newest family of Large Language Models. The Gemini suite currently houses 3 different model sizes: Nano, Pro, and Ultra. Although Gemini has been removed from service due to politically biased content, findings from HiddenLayer ...
1 year ago Packetstormsecurity.com
Google Adds Gemini Pro API to AI Studio and Vertex AI - Google also announced Duet AI for Developers and Duet AI in Security Operations, but neither uses Gemini yet. Starting Dec. 13, developers can use Google AI Studio and Vertex AI to build applications with the Gemini Pro API, which allows access to ...
1 year ago Techrepublic.com
Google Rebrands Bard AI Chatbot As Gemini - Bard becomes Gemini, as Google rebrands chatbot and launches monthly subscription for access to more powerful AI system. Alphabet's Google has shaken up its artificial intelligence chatbot offering, as it seeks to take the fight to rival Microsoft. ...
1 year ago Silicon.co.uk
Sam Altman's Return As OpenAI CEO Is A Relief-and Lesson-For Us All - The sudden ousting of OpenAI CEO Sam Altman on Friday initially seemed to suggest one thing: he must have done something really, really bad. Possibly illegal. So when OpenAI's board of directors publicly announced that Altman was fired after "Failing ...
1 year ago Forbes.com
Google Launches Gemini, the Most Capable and Largest AI Model - In a groundbreaking revelation, Google has ushered in a new era of artificial intelligence with the introduction of Gemini, its most formidable and sophisticated AI model to date. This paradigm-shifting technology promises to redefine human-machine ...
1 year ago Cybersecuritynews.com
Research Jailbreaked OpenAI o1/o3, DeepSeek-R1, & Gemini 2.0 Flash Thinking Models - A recent study from a team of cybersecurity researchers has revealed severe security flaws in commercial-grade Large Reasoning Models (LRMs), including OpenAI’s o1/o3 series, DeepSeek-R1, and Google’s Gemini 2.0 Flash Thinking. The research ...
5 months ago Cybersecuritynews.com
ChatGPT 4.1 early benchmarks compared against Google Gemini - For example, GPT‑4.1 scores 54.6% on SWE-bench Verified, which is better than GPT-4o by 21.4% and 26.6% over GPT‑4.5. We have similar results on other benchmarking tools shared by OpenAI, but how does it compete against Gemini ...
3 months ago Bleepingcomputer.com
Germany asks Google, Apple remove DeepSeek AI from app stores - The Berlin Commissioner for Data Protection has formally requested Google and Apple to remove the DeepSeek AI application from the application stores due to GDPR violations. The commissioner, Meike Kamp, alleges that DeepSeek’s owner, ...
3 weeks ago Bleepingcomputer.com
DeepSeek R1 Jailbreaked To Develop Malware, Such As A Keylogger And Ransomware - Cyber Security News - These findings suggest that while DeepSeek R1 doesn’t provide turnkey malware solutions, it significantly lowers the technical barrier for creating harmful software, potentially accelerating malicious actors’ capabilities in developing ...
4 months ago Cybersecuritynews.com
Threat Actors Exploiting DeepSeek's Popularity To Deploy Malware - To safely navigate AI models like DeepSeek while minimizing phishing and malware risks, users should utilize Criminal IP’s IP analysis service to verify server locations and network security. Cyber attackers have been creating phishing websites ...
5 months ago Cybersecuritynews.com
Google Gemini's Astra (screen sharing) rolls out on Android for some users - According to a video shared by a Reddit user who owns a Xiaomi phone with a Gemini Advanced subscription, you can now share your phone's screen with Gemini Live and ask questions about it. At MWC 2025, Google confirmed it was working on screen and ...
4 months ago Bleepingcomputer.com
ChatGPT 4.1 fails to beat Google Gemini 2.5 in early benchmarks - According to benchmarks shared by Stagehand, which is a production-ready browser automation framework, Gemini 2.0 Flash has the lowest error rate (6.67%) along with the highest exact‑match score (90%), and it’s also cheap and fast. ...
3 months ago Bleepingcomputer.com
Microsoft Invests Billions in OpenAI – Innovator in Chatbot and GPT Technology - Microsoft has announced a $1 billion investment in OpenAI, the San Francisco-based artificial intelligence (AI) research and development firm. Founded by tech moguls Elon Musk and Sam Altman, OpenAI is a leader in AI technology, and the investment ...
2 years ago Securityweek.com
CVE-2021-36845 - Multiple Authenticated Stored Cross-Site Scripting (XSS) vulnerabilities in YITH Maintenance Mode (WordPress plugin) versions < 1.3.8, there are 46 vulnerable parameters that were missed by the vendor while patching the 1.3.7 version to 1.3.8. ...
3 years ago
Germany Urges Apple, Google to Block Chinese AI App DeepSeek Over Privacy Rules - Germany’s data protection authorities have escalated their scrutiny of Chinese artificial intelligence applications, with Berlin’s data protection commissioner Meike Kamp formally requesting Apple and Google to review and potentially ...
3 weeks ago Cybersecuritynews.com
Addressing Deceptive AI: OpenAI Rival Anthropic Uncovers Difficulties in Correction - There is a possibility that artificial intelligence models can be trained to deceive. According to a new research led by Google-backed AI startup Anthropic, if a model exhibits deceptive behaviour, standard techniques cannot remove the deception and ...
1 year ago Cysecurity.news
OpenAI's board might have been dysfunctional-but they made the right choice. Their defeat shows that in the battle between AI profits and ethics, it's no contest - The drama around OpenAI, its board, and Sam Altman has been a fascinating story that raises a number of ethical leadership issues. What are the responsibilities that OpenAI's board, Sam Altman, and Microsoft held during these quickly moving events? ...
1 year ago Fortune.com Equation
OpenAI is to Launch a AI Web Browser in Coming Weeks - The new browser will feature integrated AI agent capabilities designed to autonomously handle various online tasks, positioning OpenAI as a direct competitor to traditional browser giants like Google Chrome while advancing the company’s vision ...
2 weeks ago Cybersecuritynews.com
UK Scrutiny Of Microsoft Partnership With OpenAI - CMA seeks feedback about the relationship between Microsoft and OpenAI, and whether it has antitrust implications. Microsoft, it should be remembered, was firmly rebuked for its conduct by the CMA in October after the UK regulator reversed its ...
1 year ago Silicon.co.uk
Gemini: Google Launches its Most Powerful AI Software Model - Google has recently launched Gemini, its most powerful generative AI software model to date. Since the model is designed in three different sizes, Gemini may be utilized in a variety of settings, including mobile devices and data centres. Google has ...
1 year ago Cysecurity.news
OpenAI's Sora Generates Photorealistic Videos - OpenAI released on Feb. 15 an impressive new text-to-video model called Sora that can create photorealistic or cartoony moving images from natural language text prompts. Sora isn't available to the public yet; instead, OpenAI released Sora to red ...
1 year ago Techrepublic.com
Sec-Gemini v1 - Google Released a New AI Model for Cybersecurity - The model draws on extensive data sources, including Google Threat Intelligence (GTI), the Open Source Vulnerabilities (OSV) database, and Mandiant Threat Intelligence, to deliver unparalleled performance in critical areas such as incident root cause ...
3 months ago Cybersecuritynews.com
South Korea Confirm DeepSeek Sending Data Chinese ByteDance Servers - The findings follow a technical audit revealing critical security flaws, including unencrypted data transfers, deprecated encryption protocols, and deliberate bypassing of Apple’s App Transport Security (ATS) safeguards. Data Sovereignty Concerns: ...
5 months ago Cybersecuritynews.com
Android Malware Mimic As DeepSeek To Steal Users Login Credentials - The malware campaign uses a deceptive phishing website that closely mimics the official DeepSeek platform, tricking users into downloading a malicious application that steals login credentials and sensitive information. Once installed, the malicious ...
4 months ago Cybersecuritynews.com

Latest Cyber News


Cyber Trends (last 7 days)