AI RISKS: UNSETTLING DEMONSTRATION
AI-enabled Intrusions: What Anthropic’s Disclosure Really Means
Last week, AI company Anthropic reported with ‘high confidence’ that a Chinese state-sponsored hacking group had weaponized Anthropic’s own AI tools to run a largely automated cyberattack on several technology firms and government agencies. According to the company, the September operation is the first publicly known case of an AI system conducting target reconnaissance with only minimal human direction.
In a technical report, Anthropic detailed how the attackers used its tools to generate code that instructed its agent, Claude Code, to execute the campaign, with human operators responsible for as little as 10 to 20 percent of the workload. The company did not reveal how it detected the intrusion or attributed it to China.
Across ASPI’s analyses a consistent picture emerges:
—AI is compressing the human effort required for sophisticated operations.
—Guardrails built for single models will struggle against actors who fragment activity across thousands of instances.
—China’s large Advanced Persistent Threat (APT) ecosystem, built around maintaining undetected, unauthorized access to systems for long periods, is well positioned to industrialize these gains.
—Cognitive manipulation of AI systems creates entirely new classes of attack.
—The long-term risk is not misuse of Western models, but the maturation of foreign AI ecosystems capable of conducting automated campaigns at scale.
Our challenge is no longer just hardening systems against technical breaches; it is adapting to a threat landscape where human-directed, machine-driven operations compress attack timelines beyond what any single defender can manage. In that environment, effective cybersecurity depends on shared visibility, resilience and coordinated action across like-minded countries. Once reconnaissance and intrusion can be automated at machine pace, no public or private actor can keep up alone. Collective defense becomes how defenders regain time, context and capacity in a contest increasingly shaped by speed.
The following is a collection of analyses from cyber experts, outlining what this development means and why it matters.
Slipping the guardrails: What the attackers actually did
Jason Van der Schyff, ASPI fellow
This incident wasn’t a result of an unusually powerful model, but of operators who knew how to slip past established guardrails. Today’s AI models lack persistent memory across sessions, limiting their ability to join the dots. By slicing their activity into small, harmless-looking queries, it appears the threat actors denied the model the broader context needed to detect malicious intent.
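To see why fragmentation defeats per-query screening, consider a toy sketch (entirely hypothetical, and not a description of Anthropic’s actual safeguards): a simple filter that flags suspicious word combinations will trip on a combined request, yet pass each fragment when sessions share no memory.

```python
# Toy illustration (hypothetical filter): per-query screening with no
# cross-session memory. The combined request looks malicious, but each
# fragment, seen in isolation, appears benign.
SUSPICIOUS_COMBINATIONS = {("scan", "exfiltrate"), ("credentials", "upload")}

def flags_query(query: str) -> bool:
    """Flag a query only if it contains a suspicious word pairing."""
    words = set(query.lower().split())
    return any(a in words and b in words for a, b in SUSPICIOUS_COMBINATIONS)

full_task = "scan the network and exfiltrate credentials"
fragments = [
    "scan the network for open ports",       # routine admin task on its own
    "summarise these credentials files",     # routine request on its own
]

print(flags_query(full_task))                # the combined request is flagged
print([flags_query(f) for f in fragments])   # each fragment passes in isolation
```

The point of the sketch is that without persistent memory linking the sessions, no single query carries the broader context needed to infer malicious intent, which is the gap the operators reportedly exploited.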
