Anthropic probes unauthorized access to Claude Mythos AI security model

Key Points
- Anthropic is investigating unauthorized access to its Claude Mythos model via a third‑party vendor portal.
- The intruders reportedly reached the model using internet‑sleuthing tools and a guessed developer‑portal location.
- Intruders appear to have only tested the model, with no evidence of malicious use.
- Claude Mythos, part of Project Glasswing, was previewed to Amazon, Microsoft, Apple, Cisco, and Mozilla.
- Mozilla used the model to find and fix 271 Firefox vulnerabilities.
- Banks and government agencies have shown interest in the model for security hardening.
- A private Discord channel linked the unauthorized users, who may also have accessed other unreleased Anthropic models.
- Security experts warn AI tools like Mythos could enable new cyber‑attack vectors.
- Anthropic was recently labeled a “supply chain risk” by the U.S. Department of Defense and is seeking removal of the label.
- The company is tightening access controls while the investigation remains ongoing.

Anthropic confirmed it is investigating a report that a group gained unauthorized access to its Claude Mythos model through a third‑party vendor portal. The intruders reportedly located the model with internet‑sleuthing tools and a guessed developer‑portal address, and their activity appears limited to exploratory testing rather than malicious exploitation. Claude Mythos, previewed under Project Glasswing, had been restricted to a handful of trusted firms such as Amazon, Microsoft, Apple, Cisco and Mozilla, which used the model to identify hundreds of software flaws. The incident has revived concerns about AI‑driven cyber threats and compounded scrutiny of the company following its recent designation as a supply‑chain risk by the U.S. Department of Defense.
Anthropic said on Thursday it is probing a claim that an external group accessed its Claude Mythos model without permission. The company’s statement referenced a report that the intrusion occurred through a third‑party contractor environment and was facilitated by internet‑sleuthing tools. While the intruders allegedly managed to reach the model, sources close to the matter said they were only interested in testing its capabilities, not in deploying it for malicious attacks.
Claude Mythos debuted earlier this month as part of the company’s Project Glasswing initiative. Anthropic limited the preview to a select roster of trusted test partners, including Amazon, Microsoft, Apple, Cisco and the Mozilla Foundation. Mozilla disclosed that the model helped its engineers uncover and patch 271 vulnerabilities in the Firefox browser, a success that spurred interest from banks and government agencies seeking to harden their own systems.
According to the report, the unauthorized users operated a private Discord channel where they exchanged details about the breach. Investigators believe the group guessed the location of the model within Anthropic’s developer portal and used that foothold to explore other unreleased AI models. No evidence suggests the intruders extracted data or launched attacks using the model.
The episode has reignited debate over the security implications of AI tools that can automatically sniff out software flaws. Alex Zenla, chief technology officer of cloud‑security firm Edera, told Wired that the potential for AI‑generated cyber attacks remains a “real threat.” Some security researchers, however, remain skeptical about the model’s capabilities, noting that early demonstrations have sometimes overpromised.
Anthropic’s challenges extend beyond the technical breach. Last month the U.S. Department of Defense labeled the company a “supply chain risk,” a designation that could restrict government contracts. Anthropic officials have been in talks with the Trump administration to have the label removed, arguing that the company’s security practices and partnership vetting processes are robust.
For now, Anthropic says it is conducting a thorough internal review and working with the affected third‑party vendor to tighten access controls. The firm has not disclosed whether any data was exfiltrated or whether other models were compromised. As the investigation continues, industry observers will watch closely to see how the company balances rapid AI innovation with the growing demand for cybersecurity safeguards.