AI's scary new trick - Conducting cyberattacks instead of just helping out

Anthropic, the company behind the AI assistant Claude, has documented a large-scale cyberattack campaign in which AI was used for far more than simple assistance. It may be the first recorded instance of a wide-scale cyberattack in which AI drove almost every phase of the operation. The company detected the sophisticated cyber espionage campaign in mid-September; it spanned the full attack cycle and targeted approximately 30 organizations.

The attackers used Claude Code, Anthropic's agentic coding tool, to build an automated attack framework capable of reconnaissance, vulnerability discovery, exploitation, lateral movement, credential harvesting, data analysis, and exfiltration. According to Anthropic, 80% to 90% of the tactical operations were performed independently by the AI after it was prompted to act as a penetration-testing orchestrator and agent. The AI identified and exploited vulnerabilities, stole data, and carried out other malicious post-exploitation activity. The attackers framed each task as a routine technical request through carefully crafted prompts, inducing Claude to execute individual components of the attack chain without awareness of the broader malicious context.

Anthropic attributes the attack to GTG-1002, a Chinese state-sponsored group. After discovering the abuse, Anthropic banned the associated accounts and enhanced its detection systems to identify novel threat patterns. It also warned the cybersecurity community, advising security teams to experiment with AI for defense in areas such as SOC automation, threat detection, vulnerability assessment, and incident response.
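
To make the defensive suggestion concrete, here is a minimal sketch of LLM-assisted alert triage, one small piece of SOC automation. It assumes the official `anthropic` Python SDK and an `ANTHROPIC_API_KEY` in the environment; the model name, prompt wording, and alert fields are illustrative assumptions, not anything prescribed in Anthropic's report.

```python
# Minimal sketch of LLM-assisted alert triage for a SOC pipeline.
# Assumes the official `anthropic` Python SDK (pip install anthropic) and an
# ANTHROPIC_API_KEY set in the environment. The model ID, prompt, and alert
# fields below are illustrative placeholders.
import json

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment


def triage_alert(alert: dict) -> str:
    """Ask the model to classify one SIEM alert and suggest a next step."""
    prompt = (
        "You are assisting a security operations center. Classify the alert "
        "below as benign, suspicious, or malicious. Briefly justify the "
        "classification and recommend one next investigative step.\n\n"
        f"Alert (JSON):\n{json.dumps(alert, indent=2)}"
    )
    response = client.messages.create(
        model="claude-sonnet-4-20250514",  # illustrative choice of model
        max_tokens=512,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text


if __name__ == "__main__":
    example = {
        "source_ip": "203.0.113.42",  # documentation range, placeholder value
        "event": "multiple failed SSH logins followed by a success",
        "count": 57,
        "window_minutes": 10,
    }
    print(triage_alert(example))
```

In practice, output like this would feed a human analyst's queue rather than trigger automated action; the goal is to compress triage time, not to remove the analyst.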

While only a handful of the attacks succeeded, Anthropic emphasizes the significance of the event, stating that it represents a fundamental shift in how advanced threat actors use AI. The company urges the cybersecurity community to acknowledge that this change has occurred and to continue investing in safeguards across AI platforms to prevent adversarial misuse. Anthropic is prototyping early-detection measures to stop autonomous cyberattacks and has notified authorities and industry partners of the incident.

To read the complete article, see: AI’s scary new trick - Conducting cyberattacks instead of just helping out

This post is licensed under CC BY 4.0 by the author.