Anthropic just dropped the biggest flex in the AI arms race. Claude Fable 5, the company’s first publicly available “Mythos-class” model, scored 161 on the Epoch Capabilities Index, edging past OpenAI’s GPT-5.5 Pro, which landed at 159.
The numbers that matter
On FrontierMath tier 4, widely considered one of the hardest mathematical reasoning evaluations in AI, Fable 5 scored 88 percent. GPT-5.5 Pro managed roughly 75 percent. That’s a 13-point gap on a test designed to push frontier models to their limits.
Fable 5 posted an 80.3 percent on SWE-Bench Pro, a benchmark that tests AI models on real-world software engineering tasks pulled from actual GitHub repositories. Anthropic’s own previous flagship, Opus 4.8, scored 69.2 percent on the same benchmark. OpenAI’s GPT-5.5 trailed further at 58.6 percent.
In English: Fable 5 can solve roughly four out of five real coding problems thrown at it, compared with about seven out of ten for its predecessor and just under three out of five for GPT-5.5. For companies evaluating which AI to plug into their development workflows, that difference translates directly into productivity.
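For readers who want to check the arithmetic, here is a minimal Python sketch that turns the SWE-Bench Pro percentages quoted above into "out of ten" figures and percentage-point gaps. The scores come straight from this article; the helper names `out_of_ten` and `gap` are illustrative assumptions, not part of any benchmark tooling.

```python
# SWE-Bench Pro scores as quoted in this article (percent of tasks solved).
scores = {"Fable 5": 80.3, "Opus 4.8": 69.2, "GPT-5.5": 58.6}


def out_of_ten(percent):
    """Round a percentage to the nearest whole 'N out of 10' figure."""
    return round(percent / 10)


def gap(leader, trailer):
    """Percentage-point difference between two models, to one decimal place."""
    return round(scores[leader] - scores[trailer], 1)


for model, pct in scores.items():
    print(f"{model}: about {out_of_ten(pct)} out of 10 tasks")

print("Fable 5 vs Opus 4.8:", gap("Fable 5", "Opus 4.8"), "points")
print("Fable 5 vs GPT-5.5:", gap("Fable 5", "GPT-5.5"), "points")
```

Rounded this way, the three models land at about 8, 7, and 6 out of 10, with gaps of 11.1 and 21.7 percentage points. These are differences in benchmark pass rates, not measured productivity gains.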
What makes Fable 5 different
Anthropic is calling Fable 5 the first model in its new “Mythos” tier, positioned above the Opus line that previously represented the company’s most capable offerings.
One of the more interesting design choices involves safety architecture. When Fable 5 encounters restricted queries, it falls back to the less advanced Opus 4.8 rather than attempting to generate a response at its full capability level.
The model also emphasizes long-term autonomous operations, suggesting Anthropic is targeting use cases where AI agents work independently over extended periods rather than just responding to one-off prompts.
For pricing, Fable 5 is available through paid Claude plans at $10 to $50 per million tokens until June 22, 2026.
What this means for the AI investment landscape
Fable 5’s SWE-Bench Pro dominance is the most commercially relevant data point in this entire release. An 80.3 percent success rate on real software engineering tasks means the model is approaching the threshold where it becomes genuinely cost-effective as a coding assistant for professional development teams. The gap between Fable 5 and GPT-5.5 on this specific benchmark, over 21 percentage points, is the kind of margin that drives enterprise purchasing decisions.
Disclosure: This article was edited by Editorial Team. For more information on how we create and review content, see our Editorial Policy.
