Cybersecurity researchers have unveiled a new autonomous penetration testing agent that uses large language models (LLMs) to execute commands on real Linux shell systems. ARACNE, as the agent is called, represents a significant advancement in automated security testing and demonstrates the potential for AI to both strengthen and compromise digital infrastructure.

Unlike traditional penetration testing tools that require manual operation, ARACNE plans attacks, generates shell commands, and evaluates their output entirely on its own. The agent connects to remote SSH services and executes commands to achieve specified penetration goals without human intervention. This technique, while essential for legitimate penetration testing, demonstrates how easily existing safeguards in AI systems can be bypassed.

Initial testing has shown ARACNE achieving a 60% success rate against autonomous defenders and nearly 58% against capture-the-flag challenges, outperforming previous state-of-the-art automated penetration testing systems.

ARACNE’s architecture consists of four key components working in tandem: a planner module that creates attack strategies, an interpreter that translates plans into executable Linux commands, an optional summarizer that condenses context, and a core agent that orchestrates the process and interacts with target systems.
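To make the four-component design more concrete, here is a minimal Python sketch of how a planner, an interpreter, an optional summarizer, and a core agent might be wired together. It is not ARACNE's code, which the article does not show: every name here (`CoreAgent`, `Planner`, `Interpreter`, `Summarizer`, `Executor`, `max_context`, `demo`) is a hypothetical placeholder, the LLM calls are replaced by plain functions, and the "shell" is a canned lookup table rather than an SSH session.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Hypothetical interfaces; in a real agent the first three would be LLM calls.
Planner = Callable[[str, List[str]], str]      # (goal, context) -> next plan step
Interpreter = Callable[[str], str]             # plan step -> shell command
Summarizer = Callable[[List[str]], List[str]]  # context -> condensed context
Executor = Callable[[str], str]                # command -> command output


@dataclass
class CoreAgent:
    """Orchestrates plan -> interpret -> execute -> (optionally) summarize."""
    planner: Planner
    interpreter: Interpreter
    executor: Executor
    summarizer: Optional[Summarizer] = None  # optional, as in the article
    max_context: int = 4                     # assumed threshold for condensing
    context: List[str] = field(default_factory=list)

    def run(self, goal: str, goal_reached: Callable[[str], bool],
            max_steps: int = 10) -> bool:
        for _ in range(max_steps):
            step = self.planner(goal, self.context)
            command = self.interpreter(step)
            output = self.executor(command)
            self.context.append(f"$ {command}\n{output}")
            if goal_reached(output):
                return True
            # Condense history only when a summarizer is configured.
            if self.summarizer and len(self.context) > self.max_context:
                self.context = self.summarizer(self.context)
        return False


def demo() -> bool:
    """Runs the loop against a fake shell with scripted plan steps."""
    steps = iter(["list files", "read notes"])
    to_command = {"list files": "ls", "read notes": "cat notes.txt"}
    fake_shell = {"ls": "notes.txt", "cat notes.txt": "FLAG{demo}"}
    agent = CoreAgent(
        planner=lambda goal, ctx: next(steps),
        interpreter=lambda step: to_command[step],
        executor=lambda cmd: fake_shell[cmd],
        summarizer=lambda ctx: ctx[-2:],
    )
    return agent.run("find the flag", lambda out: "FLAG" in out)
```

Keeping the executor as an injected function is a design choice of this sketch: it lets the orchestration loop be exercised against a stub, and it keeps the boundary between the LLM-driven components and the target system explicit.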
This article was published on cybersecuritynews.com. Publication date: Tue, 25 Mar 2025 13:30:07 +0000