Anthropic’s Claude Mythos Model Accessed by Unauthorized Users, Company Confirms

Key Points
- Anthropic confirms unauthorized access to Claude Mythos since its launch day.
- Attackers used an educated guess based on Mercur breach data and insider knowledge.
- No evidence the group used Mythos for malicious cybersecurity activities.
- Anthropic is investigating and plans to strengthen its monitoring and logging.
- Security experts call the breach a predictable, low‑tech attack rather than a sophisticated exploit.
- Earlier leaks exposed Mythos' existence, and U.S. agencies have already obtained access.
- The incident highlights the human element as a persistent security vulnerability.

Anthropic disclosed that a small group of unauthorized users gained access to its newly released Claude Mythos model on the day the company announced a limited rollout. According to Bloomberg, the intruders guessed the model’s online location using details leaked from a prior breach at data‑training firm Mercur and insider knowledge from a contractor who had evaluated Anthropic’s models. Anthropic said it is investigating the incident and reviewing its monitoring systems, which were designed to log and track model usage. The breach, described by security researchers as a standard “educated guess” attack rather than a sophisticated exploit, did not appear to target the model’s advertised cybersecurity capabilities. The episode raises questions about the robustness of Anthropic’s security controls for a product it has marketed as a “watershed moment” for defending digital infrastructure.
Anthropic announced on Monday that a handful of unauthorized users had been accessing its Claude Mythos model since the day the company opened a tightly controlled testing program. The model, touted as a breakthrough in cybersecurity analysis, was meant to be available only to a select group of partner firms.
Bloomberg reported that the intruders arrived at the model’s endpoint by piecing together information from two sources. First, a separate breach at Mercur—a firm that supplies AI training data—exposed details about Anthropic’s infrastructure. Second, a contractor who previously evaluated Anthropic’s models inadvertently provided insider knowledge that helped the attackers make an educated guess about where Mythos was hosted. The group did not employ a sophisticated zero‑day exploit; instead, they leveraged publicly available clues and a lucky guess, a technique security experts say is routine in the industry.
Anthropic confirmed the breach and said an internal investigation is underway. The company's security team, whose systems are designed to log and track model usage, acknowledged that its monitoring failed to flag the unauthorized access promptly. "We are reviewing our detection and response procedures to ensure that any future attempts are identified in real time," a spokesperson said.
According to security researcher Lukasz Olejnik, the incident illustrates a predictable failure mode that firms have been grappling with for two decades. "It's an entirely imaginable scenario," he noted, stressing that any organization relying on human-controlled access points should anticipate such guess-based attacks.
While Anthropic markets Mythos as a tool capable of finding vulnerabilities in every major operating system and web browser, Bloomberg indicated the unauthorized users were not exploiting the model for cybersecurity work. Their motive appeared to be curiosity and a desire to "mess around" with a high-profile AI system, which may have kept the breach from escalating further.
The episode comes after earlier missteps surrounding Mythos. The model’s existence was unintentionally revealed in an unsecured data trove on Anthropic’s website, and U.S. agencies such as the NSA have reportedly obtained access despite the model being labeled a supply‑chain risk. The rollout also bypassed the Cybersecurity and Infrastructure Security Agency (CISA), raising concerns about coordination with federal cybersecurity oversight.
Industry observers see the breach as a sobering reminder that even firms that champion AI safety can fall victim to basic security oversights. Royal United Services Institute fellow Pia Hüsch warned that human error often remains the weakest link in any security chain. She added that the breach's simplicity does not diminish its significance; it shows how a modest set of clues can open a doorway to advanced AI resources.
Anthropic’s next steps will likely involve tightening access controls, improving real‑time monitoring, and possibly revisiting the model’s release strategy. The company’s brand, built on a reputation for rigorous safety standards, now faces scrutiny over whether its internal safeguards matched the public narrative of a highly secure, responsibly deployed AI system.