What Launched
The official Claude account announced Claude Fable 5 on June 9, 2026 as a Mythos-class model made safe for general use. Anthropic's launch article says Fable 5 exceeds every model the company had previously made generally available, with the lead widening as tasks get longer and more complex.
There are two names to keep straight. Claude Fable 5 is the broadly available model, with safety classifiers. Claude Mythos 5 shares the same underlying capabilities but has safeguards lifted in some areas and is limited to approved Project Glasswing and trusted-access customers. Where the system card reports both, this article keeps the columns separate.
"Introducing Claude Fable 5: a Mythos-class model made safe for general use."
Claude (@claudeai), June 9, 2026
- Official model IDs: Claude Fable 5 -> claude-fable-5; Claude Mythos 5 -> claude-mythos-5
- Context window: 1M tokens
- Max output: 128k tokens per request
- Pricing: $10 / MTok input, $50 / MTok output
- Batch pricing: $5 / MTok input, $25 / MTok output
- Launch date: June 9, 2026
Benchmark Snapshot
The official system card is the most useful benchmark source because it separates Fable 5, Mythos 5, Mythos Preview, Opus 4.8, and external model results. Fable's scores reflect production safeguards, including fallback behavior, so small differences between Fable and Mythos do not always mean a capability gap in the underlying model.
| Evaluation | Fable 5 | Mythos 5 | Opus 4.8 | What it measures |
|---|---|---|---|---|
| SWE-bench Verified | 95.0% | 95.5% | 88.6% | 500 human-verified software issues, averaged over five trials. |
| SWE-bench Pro | 80.0% | 80.3% | 69.2% | Harder active-repository tasks with larger diffs and less public ground truth. |
| Terminal-Bench 2.1 | 84.3% | 88.0% | 82.7% | Terminal tasks in a mini-SWE-agent harness; Fable had safety fallback in 20.9% of trials. |
| OSWorld-Verified | 85.0% | 85.0% | 83.4% | Live Ubuntu computer-use tasks, pass@1 averaged over five runs. |
| GDP.pdf | 29.8% | not listed | 22.5% | Dense professional PDF reasoning; Fable also led GPT-5.5 and Gemini 3.1 Pro in the system card table. |
| OfficeQA Pro | 57.9% | not listed | 48.1% | Databricks vision-based evaluation over U.S. Treasury Bulletin documents. |
| Toolathlon | 61.7% Pass@1 | 61.7% Pass@1 | 59.9% Pass@1 | 108 real-world tool-use tasks across 32 applications. |
| MCP Atlas | 83.3% | not listed | 82.2% | Multi-step MCP tool-use workflows over production-like server environments. |
The benchmark story is not one giant number. It is a pattern: Fable 5 is strongest where the task is long, tool-heavy, multimodal, ambiguous, or closer to real work than a single prompt-answer exchange. That is why simple smoke tests can undersell it.
Coding Benchmarks
Software engineering is the loudest launch signal. Anthropic reports that Fable 5 reaches 95.0% on SWE-bench Verified and 80.0% on SWE-bench Pro, while the system card places Opus 4.8 at 88.6% and 69.2% respectively. The bigger jump shows up on long-horizon agentic coding benchmarks where a model must investigate, patch, test, and recover over many steps.
| Benchmark | Fable 5 result | Official comparison |
|---|---|---|
| FrontierCode Diamond | Fable 5: 29.3 score / 30.2 pass rate | Opus 4.8: 13.4 / 14.5; GPT-5.5: 5.7 / 6.4 |
| FrontierCode Main | Fable 5: 46.3 score / 48.8 pass rate | Opus 4.8: 34.3 / 37.3; GPT-5.5: 25.5 / 28.2 |
| FrontierSWE | Fable 5 ranked #1 at 2.12 mean@5 | Opus 4.8 ranked #2 at 3.26; GPT-5.5 ranked #3 at 3.94 |
| CursorBench | Fable 5 scored 72.9% at max effort | The system card says it led GPT-5.5 by 8.6 points at that model's highest published effort. |
The practical read: do not evaluate Fable 5 only on short snippets, code formatting, or a handful of easy GitHub issues. The official docs say the teams seeing the best outcomes are giving Fable 5 harder, previously unsolved problems. That matches the benchmark pattern: Fable separates most clearly when the work requires persistence.
Long Context and Agentic Search
Fable 5 and Mythos 5 support a 1M token context window by default. The long-context results in the system card are mostly reported for Mythos 5, but they are still useful for understanding what the underlying model class is good at. On GraphWalks, Mythos 5 scored 91.1 F1 on the BFS 256K subset and 79.4 F1 on the BFS 1M subset, ahead of Opus 4.8 at 85.9 and 68.1. On the Parents 1M subset, Mythos 5 scored 97.5 F1 versus Opus 4.8 at 83.3.
On BrowseComp, Anthropic reports that multi-agent Mythos 5 reached 93.3% and that async subagents set the highest score among the tested harnesses. The important developer lesson is not just "use more agents." It is that multi-agent structure helped most on the hard tail: the system card says the largest latency gains came from problems that were already difficult for prior Claude runs.
Vision and Documents
Anthropic calls Fable 5 the new state-of-the-art model for vision tasks. The benchmark details are more grounded than that headline: Fable 5 scored 29.8% on GDP.pdf, a dense professional document benchmark, compared with Opus 4.8 at 22.5%, GPT-5.5 at 24.9%, and Gemini 3.1 Pro at 16.7%. On OfficeQA Pro, the Databricks vision-based evaluation put Fable 5 at 57.9%, ahead of Opus 4.8 at 48.1%.
The system card also reports strong Mythos 5 results on ChartMuseum, LAB-Bench FigQA, and CharXiv Reasoning. For Fable 5 specifically, biology-heavy image tasks can trigger safeguards, so the right conclusion is narrower: Fable 5 is excellent at practical vision/document workflows, but some scientific visual workflows may route through the safeguard path.
Professional Work
The most interesting benchmark category is professional work, because it looks less like a leaderboard and more like what paying users actually do. Anthropic reports Fable/Mythos 5 was preferred over Opus 4.8 in 74% of Real-World Finance v2 pairwise comparisons, with an Elo of 1,374 versus 1,222 for Opus 4.8. Vals AI's Finance Agent v2 evaluation put Fable at 56.31%, above Opus 4.8 at 53.92% and GPT-5.5 at 51.76%.
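To relate the two finance numbers above, here is a small worked calculation. It assumes the textbook logistic Elo model with a 400-point scale, which the article does not confirm is the method Anthropic used; `elo_expected_score` is a hypothetical helper name.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected win rate of A over B under the standard logistic Elo model.

    Assumption: 400-point scale, base 10. The system card's Elo fit may differ.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Ratings quoted in the paragraph above: 1,374 vs 1,222 (a 152-point gap).
implied = elo_expected_score(1374, 1222)
print(f"Implied preference rate: {implied:.1%}")
```

Under that assumption the 152-point gap implies a preference rate of roughly 71%, in the same range as the reported 74%. The two need not match exactly, since an Elo rating is typically fitted across more comparisons than a single head-to-head pairing.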
The legal and tool-use numbers are also useful. On Harvey's Legal Agent Benchmark, the system card reports 16.91% all-pass and 92.0% mean criterion-pass on the full public set in Anthropic's internal harness, plus 13.3% all-pass on Harvey's held-out set. On Toolathlon, Fable 5 scored 61.7% Pass@1 and used 19.8 average turns, while Opus 4.8 scored 59.9% Pass@1 and used 24.5 turns.
There is at least one official counterexample worth keeping: on Vending-Bench, Fable 5's best final balance was $5,680.26, slightly below Opus 4.8's $5,787.43. That is exactly why the system card matters. Fable 5 is not "strictly better on every possible task." It is a much stronger default for hard, long, agentic work, with workload-specific exceptions.
Science Caveat
The launch post and system card describe very strong Mythos 5 life-sciences results: drug-design acceleration, novel molecular-biology hypotheses, genomics research, and benchmark gains on BioMysteryBench, LatchBio Bioinformatics, structural biology, ProteinGym Hard, organic chemistry, protocol troubleshooting, and LABBench2.
For public Fable 5 users, the caveat is central. Fable 5's safeguards are deliberately broad around biology and chemistry, and Anthropic says some beneficial life-sciences tasks may trigger classifiers. If your product is biomedical, computational biology, chemistry, or cyber-adjacent, build the fallback path first and treat raw Fable 5 benchmark expectations carefully.
Official Images and Chart Data
Anthropic shipped several visuals with the launch article. The images below are the official hosted assets that matter most for a benchmark-based article. I am not re-hosting them here; the page references Anthropic's original URLs and links the source section at the end.




API, Availability, and Pricing
Claude Fable 5 is generally available on the Claude API, Claude Platform on AWS, Amazon Bedrock, Vertex AI, and Microsoft Foundry. Claude Mythos 5 is not generally available; access is limited to approved customers through Project Glasswing and related trusted-access channels.
The official pricing table lists Fable 5 and Mythos 5 at $10 per million input tokens and $50 per million output tokens. Prompt-cache writes are $12.50 per MTok for a 5-minute cache and $20 per MTok for a 1-hour cache, while cache hits and refreshes are $1 per MTok. Batch usage is discounted to $5 input and $25 output per MTok.
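The per-token rates above can be turned into a rough budget check. This is a minimal sketch, not an official calculator: the function name, argument names, and the choice to apply cache rates unchanged to batch requests are all assumptions; only the dollar figures come from the pricing paragraph.

```python
# USD per million tokens (MTok), copied from the pricing paragraph above.
PRICES = {
    "input": 10.00,
    "output": 50.00,
    "cache_write_5m": 12.50,
    "cache_write_1h": 20.00,
    "cache_read": 1.00,
    "batch_input": 5.00,
    "batch_output": 25.00,
}


def estimate_cost(
    input_tokens: int = 0,
    output_tokens: int = 0,
    cache_write_5m_tokens: int = 0,
    cache_write_1h_tokens: int = 0,
    cache_read_tokens: int = 0,
    batch: bool = False,
) -> float:
    """Estimate the USD cost of one request from token counts.

    Assumption: cache rates are applied as listed even when batch=True;
    check the official pricing page for how the two discounts combine.
    """
    in_rate = PRICES["batch_input"] if batch else PRICES["input"]
    out_rate = PRICES["batch_output"] if batch else PRICES["output"]
    total = (
        input_tokens * in_rate
        + output_tokens * out_rate
        + cache_write_5m_tokens * PRICES["cache_write_5m"]
        + cache_write_1h_tokens * PRICES["cache_write_1h"]
        + cache_read_tokens * PRICES["cache_read"]
    )
    return total / 1_000_000


# A 200k-token prompt with a 20k-token answer, standard vs batch.
print(estimate_cost(input_tokens=200_000, output_tokens=20_000))
print(estimate_cost(input_tokens=200_000, output_tokens=20_000, batch=True))
```

Because output tokens cost five times as much as input tokens, long autonomous runs tend to be dominated by output cost, which is worth tracking separately in any budget alert.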
Prompting Fable 5
The Fable-specific prompting guide says the model is strongest on problems that were previously too complex, too long-running, or too ambiguous for earlier models. It also warns that prompts and skills written for prior Claude models can be too prescriptive. The migration work is therefore not "add more instructions." It is often "remove old scaffolding and let the stronger model work."

    import anthropic

    # The client reads ANTHROPIC_API_KEY from the environment.
    client = anthropic.Anthropic()

    message = client.messages.create(
        model="claude-fable-5",
        max_tokens=64000,
        # Effort is the main quality/latency/cost control.
        output_config={"effort": "high"},
        messages=[
            {
                "role": "user",
                "content": "Analyze this migration plan, implement the safe parts, and verify with tests.",
            }
        ],
    )

Effort is now the main steering knob. Use high as a default for most hard work, xhigh for capability-sensitive jobs, and medium or low for routine work where latency and cost matter more. On hard tasks, individual turns can run for minutes, and autonomous runs can continue for hours. That means your product needs streaming, async job handling, progress indicators, and timeout settings that match the model you are actually using.
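One way to keep effort selection explicit is to derive it from a task label rather than hard-coding it at each call site. The sketch below is illustrative only: the task labels, the `EFFORT_BY_TASK` mapping, and `build_request` are hypothetical names, and the split between low and medium for routine work is an assumption. The effort values and the `output_config` field come from the article.

```python
# Hypothetical task labels mapped to the effort levels named in the article.
EFFORT_BY_TASK = {
    "routine": "low",                  # latency and cost matter most
    "standard": "medium",
    "hard": "high",                    # suggested default for most hard work
    "capability_sensitive": "xhigh",
}


def build_request(prompt: str, task_kind: str = "hard", max_tokens: int = 64000) -> dict:
    """Build keyword arguments for a Messages API call with an explicit effort."""
    if task_kind not in EFFORT_BY_TASK:
        raise ValueError(f"unknown task kind: {task_kind!r}")
    return {
        "model": "claude-fable-5",
        "max_tokens": max_tokens,
        "output_config": {"effort": EFFORT_BY_TASK[task_kind]},
        "messages": [{"role": "user", "content": prompt}],
    }


request = build_request("Rename this variable across the module.", task_kind="routine")
print(request["output_config"])
```

The resulting dictionary could be unpacked into the same `client.messages.create(...)` call shown earlier. Keeping the mapping in one place makes it easier to re-tune effort per workload after re-running evals.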
Three prompt changes matter most. First, ground progress claims in actual tool results so long runs do not drift into optimistic status updates. Second, state boundaries: what the model may edit, when it should ask, and what actions are out of scope. Third, stop asking it to reproduce internal reasoning. The docs warn that prompts asking for hidden reasoning can trigger a refusal category; if you need reasoning visibility, use summarized adaptive thinking and a send-to-user tool for progress updates.
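The first two changes can be captured in a short, reusable system prompt. The following is a sketch under stated assumptions: `build_system_prompt` and its wording are hypothetical, not text from Anthropic's prompting guide, and should be adapted to the product's real permissions.

```python
def build_system_prompt(editable: list[str], ask_first: list[str], out_of_scope: list[str]) -> str:
    """Assemble a system prompt that states scope and permission boundaries.

    The phrasing is illustrative; it deliberately does not ask the model
    to reproduce its internal reasoning.
    """
    def bullets(items: list[str]) -> str:
        return "\n".join(f"- {item}" for item in items)

    sections = [
        "You may edit only the following:\n" + bullets(editable),
        "Ask before doing any of the following:\n" + bullets(ask_first),
        "The following are out of scope:\n" + bullets(out_of_scope),
        "Report progress only when a tool result confirms it, and cite that result.",
    ]
    return "\n\n".join(sections)


prompt = build_system_prompt(
    editable=["files under src/", "files under tests/"],
    ask_first=["changing database migrations", "adding dependencies"],
    out_of_scope=["deploying to production", "editing CI credentials"],
)
print(prompt)
```

Keeping the boundaries as data rather than free text makes them easy to review and to vary per environment.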
Safeguards and Fallback
Fable 5 includes classifiers around cyber, biology and chemistry, distillation, and reasoning extraction. The API-level refusal docs say a refusal is a successful HTTP 200 response with stop_reason: "refusal", not a thrown error. The documented stop_details.category values include cyber, bio, and reasoning_extraction.
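Since a refusal arrives as a normal response rather than an exception, it has to be detected by inspecting the body. The sketch below assumes the response has already been converted to a plain dictionary with the `stop_reason` and `stop_details.category` fields described above; the helper names and the `"unknown"` placeholder are hypothetical.

```python
from collections import Counter
from typing import Optional

# Running tally of refusal categories, for monitoring.
refusal_counts: Counter = Counter()


def refusal_category(response: dict) -> Optional[str]:
    """Return the refusal category, or None if the response is not a refusal.

    Assumption: `response` is a dict shaped like the JSON body, with
    stop_reason and an optional stop_details object.
    """
    if response.get("stop_reason") != "refusal":
        return None
    details = response.get("stop_details") or {}
    return details.get("category", "unknown")


def record(response: dict) -> Optional[str]:
    """Update the tally and return the category (None for normal responses)."""
    category = refusal_category(response)
    if category is not None:
        refusal_counts[category] += 1
    return category


record({"stop_reason": "end_turn"})
record({"stop_reason": "refusal", "stop_details": {"category": "bio"}})
print(dict(refusal_counts))
```

Error-handling code that only catches exceptions or checks HTTP status will miss these events, so the check belongs on the success path.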
The safest production pattern is to configure fallback to Claude Opus 4.8. Server-side fallback is available in beta on the Claude API and Claude Platform on AWS using the server-side-fallback-2026-06-01 beta header; SDK middleware can handle client-side fallback for TypeScript, Python, Go, Java, and C#.
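For teams not using the server-side beta or the SDK middleware, the client-side pattern can be sketched by hand. This is not the SDK's middleware API: `send_with_fallback` is a hypothetical helper, `send` stands in for whatever function performs the API call and returns a dict, and the fallback model ID is passed in by the caller because the article does not state the Opus 4.8 identifier.

```python
from typing import Callable, Tuple


def send_with_fallback(
    send: Callable[[dict], dict],
    request: dict,
    fallback_model: str,
) -> Tuple[dict, str]:
    """Send a request; on a refusal, retry once with the fallback model.

    Returns the final response and the model ID that produced it.
    """
    response = send(request)
    if response.get("stop_reason") != "refusal":
        return response, request["model"]
    # Copy the request so the caller's dict is not mutated.
    retry = {**request, "model": fallback_model}
    return send(retry), fallback_model


# Demonstration with a fake sender that refuses only on the primary model.
def fake_send(req: dict) -> dict:
    if req["model"] == "claude-fable-5":
        return {"stop_reason": "refusal", "stop_details": {"category": "cyber"}}
    return {"stop_reason": "end_turn", "content": "ok"}


response, used = send_with_fallback(
    fake_send,
    {"model": "claude-fable-5", "messages": []},
    fallback_model="your-opus-4-8-model-id",  # placeholder, not a real ID
)
print(used, response["stop_reason"])
```

Returning the model ID alongside the response makes it straightforward to log how often the fallback path is taken.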
Migration Checklist
1. Change the model ID to claude-fable-5.
2. Set output_config.effort explicitly.
3. Remove old show-your-chain-of-thought instructions.
4. Increase client timeouts and support streaming/async runs.
5. Add progress reporting grounded in tool results.
6. Add explicit scope and permission boundaries.
7. Add memory or notes for long-running tasks.
8. Configure Opus 4.8 fallback and monitor refusal events.
9. Re-run your evals on hard tasks, not only smoke tests.
10. Check the 30-day data-retention requirement before production use.
Fable 5 is a model to evaluate on your hardest workflow, not just your cheapest benchmark. The official benchmark pattern says the advantage grows with long-horizon autonomy, professional deliverables, visual reasoning, tool use, and task ambiguity. That is also where the operational surface grows: cost controls, fallback handling, memory, and observability matter more than they did for short-turn chat.
FAQ
What is Claude Fable 5?
Claude Fable 5 is Anthropic's most capable widely released model, announced on June 9, 2026. It is a Mythos-class model with production safeguards for general use.
What is the Claude Fable 5 API model ID?
The Claude API model ID is claude-fable-5. The restricted sibling model is claude-mythos-5.
Is Claude Fable 5 the same as Claude Mythos 5?
They share the same underlying capabilities, but Claude Fable 5 includes safety classifiers. Claude Mythos 5 has safeguards lifted in some areas and is limited to approved Project Glasswing and trusted-access users.
How much does Claude Fable 5 cost?
Official pricing is $10 per million input tokens and $50 per million output tokens. Batch pricing is $5 per million input tokens and $25 per million output tokens.
What are the biggest Fable 5 benchmark wins?
The strongest official signals are in long-horizon coding, agentic terminal work, document reasoning, computer use, long-context reasoning, and professional workflows. Fable 5 scored 95.0% on SWE-bench Verified, 80.0% on SWE-bench Pro, 72.9% on CursorBench at max effort, and led FrontierCode in both Diamond and Main subsets.
What changes should developers make when prompting Fable 5?
Use effort as the main quality-latency-cost control, expect longer turns on hard tasks, remove old show-your-reasoning instructions, add explicit boundaries, use memory for long-running work, and configure fallback to Opus 4.8 for refused requests.
Official Sources
This article intentionally excludes community posts, press coverage, and unofficial benchmark commentary. All claims above are grounded in these official sources:
- Official Claude launch thread on X
- Anthropic launch post: Claude Fable 5 and Claude Mythos 5
- Claude Fable 5 and Claude Mythos 5 system card
- Claude docs: Introducing Claude Fable 5 and Claude Mythos 5
- Claude docs: Prompting Claude Fable 5
- Claude docs: Prompting best practices
- Claude docs: Models overview
- Claude docs: Pricing
- Claude docs: Refusals and fallback