Unauthorized Access to Anthropic’s Claude Mythos Model Exposes Vendor Security Gaps

The Next Web

Key Points

  • A private Discord group accessed Anthropic's Claude Mythos Preview by guessing its URL.
  • The breach occurred through a third‑party vendor environment, not Anthropic's own infrastructure.
  • Anthropic reports no evidence of impact to its core systems or broader exposure.
  • Mythos can autonomously discover and exploit zero‑day vulnerabilities across major platforms.
  • Project Glasswing limited model access to select partners and provided $100 million in usage credits.
  • The incident raises concerns about vendor security and AI model governance.
  • Political commentary highlighted the breach amid discussions of a Pentagon deal with Anthropic.

A small group of users gained entry to Anthropic’s restricted Claude Mythos Preview AI model on the day the company announced its launch, exploiting a third‑party vendor environment by guessing the model’s URL. Anthropic confirmed it is investigating the incident and said there is no evidence the breach affected its core systems. The episode highlights vulnerabilities in the way frontier AI tools are shielded behind external partners, raising concerns about the security of powerful cybersecurity AI models that can autonomously discover and exploit zero‑day vulnerabilities.

Anthropic disclosed that a private Discord community accessed its Claude Mythos Preview model on the same day the company publicly unveiled the AI system. The group, which focuses on gathering intelligence about unreleased models, reportedly guessed the model’s endpoint by leveraging knowledge of Anthropic’s URL conventions. By doing so, they entered a third‑party vendor environment that hosted the model, bypassing Anthropic’s own infrastructure.

"We’re investigating a report claiming unauthorized access to Claude Mythos Preview through one of our third‑party vendor environments," the company said in a statement. Anthropic added that, so far, it has found no evidence that the incident impacted its core systems or spread beyond the vendor’s environment.

Claude Mythos is a highly specialized cybersecurity AI capable of autonomously identifying thousands of previously unknown zero‑day vulnerabilities across major operating systems and browsers, then generating functional exploits. In internal testing, the model chained multiple vulnerabilities together to escape both renderer and operating system sandboxes—a task that would normally require months of expert effort.

To mitigate the risk of weaponizing such capabilities, Anthropic introduced Project Glasswing, limiting access to twelve named launch partners—including Amazon Web Services, Apple, Microsoft, Google, Nvidia, and Palo Alto Networks—plus Anthropic itself. Around 40 additional organizations received temporary access, and the rollout was paired with $100 million in usage credits and $4 million in donations to open‑source security projects.

The unauthorized entry undermines the premise of the restricted rollout. While the infiltrating group described its motives as curiosity‑driven, the ability of Mythos to produce weaponizable code means intent offers little protection against misuse.

The breach also carries political weight. It came a day after former President Donald Trump told CNBC that a Pentagon deal with Anthropic was “possible,” and it landed while Anthropic is suing the Department of Defense over a blacklist designation. For critics, the incident supplies a tangible example of the challenges of governing access to advanced AI tools.

Anthropic’s investigation points to a specific failure mode: the gap between the company’s internal security controls and those of a third‑party contractor with access credentials. Rather than a classic intrusion, the actors exploited the vendor’s environment, highlighting the need for tighter oversight of external partners that host high‑risk AI models.

As the AI industry grapples with the balance between defensive capabilities and potential misuse, the Mythos incident may prompt tighter contractual safeguards and more rigorous vetting of vendor environments. For now, Anthropic remains focused on confirming that its core systems are intact while assessing how the breach could influence ongoing legal and regulatory battles.
