Anthropic's Mythos AI Model Raises Alarm Over Surge in AI-Driven Hacking

Ars Technica

Key Points

  • AI‑enabled cyber attacks rose 89% in 2025, per CrowdStrike data.
  • Average attacker dwell time fell to 29 minutes, a 65% drop from 2024.
  • Anthropic's Mythos model could uncover more vulnerabilities than firms can patch.
  • A Chinese state‑sponsored group used Anthropic's Claude Code to breach about 30 targets with minimal human input.
  • Security experts warn AI agents create a "lethal trifecta" of data access, internet exposure, and external communication.
  • Experts say there is currently no good way to fully secure AI agents, though agents are not yet deployed in mission‑critical settings.
  • Former Anthropic researcher Stanislav Fort sees AI eventually reducing zero‑day vulnerabilities.

Anthropic's new Mythos AI model has sparked concern among security experts after data from CrowdStrike showed AI‑enabled cyber attacks jump 89 percent in 2025. The model's ability to automate vulnerability hunting could overwhelm defenders, with internal warnings that companies may discover more flaws than they can patch. Recent incidents, including a Chinese‑linked AI espionage campaign that used Anthropic's Claude Code to breach dozens of high‑profile targets, underscore the growing threat. Analysts argue that granting AI agents unrestricted access to data, the internet, and external communication creates a “lethal trifecta” for hackers.

AI‑enabled cyber attacks surged 89 percent in 2025, according to CrowdStrike, and the average dwell time for attackers shrank to just 29 minutes—down 65 percent from the previous year. The rapid acceleration coincides with the rollout of Anthropic's Mythos model, a powerful AI system designed to automate vulnerability discovery and exploit generation.

Industry insiders say Mythos could tip the balance in favor of attackers. "The game is asymmetric; it is easier to identify and exploit than to patch everything in time," a source close to a frontier AI lab told the Financial Times. Anthropic's Graham echoed those worries, noting that companies might uncover "more vulnerabilities than they could hope to deal with in the near future" if they deploy Mythos without strict safeguards.

Last September, Anthropic detected the first reported AI‑driven cyber‑espionage campaign attributed to a Chinese state‑sponsored group. The actors weaponized Anthropic's Claude Code, a coding assistant, to infiltrate roughly 30 global targets, ranging from major tech firms and financial institutions to chemical manufacturers and government agencies. While the campaign achieved limited success, it required minimal human oversight, highlighting the potential for AI agents to operate autonomously in hostile environments.

Software researcher Simon Willison warned that AI agents create a "lethal trifecta" of risk: access to private data, exposure to untrusted internet content, and the ability to communicate externally. Security professionals recommend restricting AI agents to only two of these three domains to mitigate danger. Yet many AI experts argue that the full value of agents comes from unrestricted access, creating a tension between utility and safety.
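The "two of three" guideline above can be expressed as a simple policy check. The sketch below is illustrative only, under the assumption that an agent's capabilities are tracked as three flags; the names (`AgentConfig`, `validate_capabilities`) are hypothetical, not part of any real agent framework.

```python
from dataclasses import dataclass

@dataclass
class AgentConfig:
    # The three "lethal trifecta" capabilities, per Simon Willison's framing.
    private_data_access: bool   # can read sensitive internal data
    untrusted_content: bool     # ingests content from the open internet
    external_comms: bool        # can send data outside the organization

def validate_capabilities(cfg: AgentConfig) -> bool:
    """Return True only if at most two of the three risky capabilities are enabled."""
    enabled = sum([cfg.private_data_access, cfg.untrusted_content, cfg.external_comms])
    return enabled <= 2

# An agent granted all three capabilities fails the check;
# disabling any one of them brings it back within the guideline.
print(validate_capabilities(AgentConfig(True, True, True)))   # False
print(validate_capabilities(AgentConfig(True, True, False)))  # True
```

The tension the article describes shows up directly here: the configuration that fails the check is exactly the one that makes an agent most useful.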

"The bad news is that there is no good solution as of today," said another source close to an AI lab. "The good news is [AI agents aren’t] yet in mission‑critical settings like the stock exchange, bank ledger, or the airport." This caveat underscores the current limits of AI deployment in high‑stakes infrastructure, but it does not diminish the urgency of the threat.

Potential for Defensive Use

Former Anthropic and Google DeepMind researcher Stanislav Fort, now founder of AI security platform AISLE, offered a more optimistic view. He believes AI could eventually catalog and remediate a "finite repository" of historical security flaws. To date, AI models have uncovered thousands of zero‑day vulnerabilities—unknown weaknesses that have lingered in software for years. Fort noted, "We are gradually finding fewer and fewer zero days, of the worst kinds we can imagine." If these gaps are closed, the technology could shift from a weapon to a shield, proactively blocking threats and raising the overall security baseline.

For now, the balance remains precarious. The combination of faster attack cycles, AI‑driven tools like Mythos, and the ease of automating complex exploits forces defenders to reassess traditional security practices. Organizations may need to adopt stricter AI governance, limit agent permissions, and invest in AI‑augmented defense tools to keep pace.

Additional reporting by Kieran Smith in London.

#AI #cybersecurity #Anthropic #Mythos #AI-agents #cyber-espionage #zero-day-vulnerabilities #AI-safety #hacking #AI-ethics
Generated with News Factory - Source: Ars Technica
