While the security analysts noted that unlike conventional backdoor attacks that rely on poisoned training data or overt triggers in user prompts, DarkMind embeds latent triggers directly into the model’s reasoning chain. Dubbed DarkMind, this backdoor attack exploits the reasoning capabilities of LLMs to covertly manipulate outputs without requiring direct user query manipulation. Cyber Security News is a Dedicated News Platform For Cyber News, Cyber Attack News, Hacking News & Vulnerability Analysis. Instant Triggers (τIns): Modify subsequent reasoning steps immediately upon activation (replacing correct arithmetic operators with incorrect ones). The attack raises critical concerns about the security of AI agents deployed across platforms like OpenAI’s GPT Store, which hosts over 3 million customized models. These triggers activate during intermediate processing steps, dynamically altering the final output while leaving the model’s surface-level behavior intact. Retrospective Triggers (τRet): Append malicious reasoning steps after initial processing to reverse or distort conclusions. With years of experience under his belt in Cyber Security, he is covering Cyber Security News, technology and other news. A groundbreaking study by researchers Zhen Guo and Reza Tourani at Saint Louis University has exposed a novel vulnerability in customized large language models (LLMs) like GPT-4o and LLaMA-3. The attack modifies intermediate CoT steps while maintaining plausible final outputs, rendering detection through output monitoring nearly impossible. DarkMind targets the Chain-of-Thought (CoT) reasoning process—the step-by-step logic LLMs use to solve complex tasks. The researchers tested DarkMind across eight datasets spanning arithmetic, commonsense, and symbolic reasoning tasks.
This Cyber News was published on cybersecuritynews.com. Publication date: Tue, 18 Feb 2025 11:05:13 +0000