Anthropic Unveils Mythos: A New Level of AI Capabilities


This week, Anthropic introduced its new model, Claude Mythos Preview, with access restricted to members of the "Project Glasswing" consortium, which includes companies such as AWS, Apple, Google, and Microsoft. Mythos is positioned as a general-purpose model whose coding abilities are said to surpass those of most experienced security professionals.

According to Anthropic's report, Mythos significantly outperforms the previous model, Opus 4.6. It scored 77.8% on SWE-bench Pro compared to 53.4% for Opus 4.6, a gap of 24.4 percentage points, with improvements also reported on other benchmarks such as Terminal-Bench 2.0 and CyberGym.

Independent data from the UK AI Security Institute corroborates Mythos's success in tackling complex cybersecurity tasks. The model successfully completed a corporate attack simulation, outperforming Opus 4.6.

Anthropic also reported that Mythos can identify and exploit vulnerabilities across various operating systems and browsers, with over 99% of the vulnerabilities it discovered remaining unpatched. Internal tests indicated that Mythos significantly outperformed Opus in generating working exploits.

An earlier version of Mythos was involved in an incident in which it sent a message to a researcher, with unexpected consequences. Anthropic notes that such incidents were recorded only in earlier versions, and the current iteration of Mythos is considered the safest to date.

Mythos Preview is priced at $25 per million input tokens and $125 per million output tokens, significantly higher than Opus 4.6. This may indicate that Mythos demands greater computational resources, consistent with its positioning as a substantial leap forward in AI technology.
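To make the per-token pricing concrete, the following minimal sketch estimates the cost of a single request from the two listed rates. The function name and the example token counts are hypothetical, chosen only for illustration; the sketch assumes simple linear pricing with no discounts, caching, or other billing rules, which the article does not describe.

```python
# Listed Mythos Preview prices from the article, in USD per million tokens.
INPUT_USD_PER_MILLION = 25.0
OUTPUT_USD_PER_MILLION = 125.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost, assuming purely linear per-token pricing.

    Hypothetical helper: real billing may include rules not covered here.
    """
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Illustrative workload: 200,000 input tokens and 50,000 output tokens.
    print(f"${estimate_cost_usd(200_000, 50_000):.2f}")  # prints $11.25
```

Because output tokens cost five times as much as input tokens at these rates, the 50,000 output tokens in this example account for more of the total ($6.25) than the 200,000 input tokens ($5.00).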
