Get the latest on AI, LLMs & developer tools
New MCP servers, model updates, and guides like this one — delivered weekly.
What It Is
Claude Opus 4.8 is Anthropic's latest generally available Opus model. The official docs describe it as the company's most capable model for complex reasoning, long-horizon agentic coding, and high-autonomy work. In plain terms: this release is aimed at jobs where the model must keep state, use tools, inspect its own work, and continue for longer than a normal chat turn.
The model ID is claude-opus-4-8. The docs list a 1M token context window on the Claude API, Amazon Bedrock, and Vertex AI, with a 200k token context window on Microsoft Foundry at launch. Max synchronous Messages API output is 128k tokens.
Model: Claude Opus 4.8 API ID: claude-opus-4-8 Context: 1M tokens on Claude API, Bedrock, Vertex AI Output: 128k tokens on synchronous Messages API Default effort: high Regular API: $5 input / $25 output per MTok Fast mode: $10 input / $50 output per MTok
Official Announcement
The launch started with the official Claude account on X and the Anthropic launch post. The claim is specific: Opus 4.8 builds on Opus 4.7 with sharper judgment, more honesty about progress, and the ability to work independently for longer.
Introducing Claude Opus 4.8: it builds on Opus 4.7 with sharper judgment, more honesty about its own progress, and the ability to work independently for longer than its predecessors.
— Claude (@claudeai)May 28, 2026
Available today at the same price. pic.twitter.com/EufxL7T1kb
Anthropic's launch post adds three adjacent product changes: effort control in claude.ai and Cowork, dynamic workflows in Claude Code, and fast mode for Opus 4.8. That matters because the model release is not just a checkpoint swap. It changes how developers should set effort, cache prompts, and structure longer agent runs.
Benchmark Table
Anthropic's system card gives the most useful verified table. The numbers below are copied as facts from the official system card and should be read as pre-deployment evaluations, not a promise that every production workload improves by the same margin.
| Evaluation | Opus 4.8 | Opus 4.7 | What to notice |
|---|---|---|---|
| SWE-bench Verified | 88.6 | 87.6 | Small lift on verified real GitHub issues. |
| SWE-bench Pro | 69.2 | 64.3 | The headline coding jump from the ClaudeDevs thread. |
| Terminal-Bench 2.1 | 74.6 | 66.1 | Meaningful gain on command-line agent tasks. |
| OSWorld-Verified | 83.4 | 82.8 | Small gain on real Ubuntu computer-use tasks. |
| GraphWalks BFS 256K | 85.9 | 76.9 | Long-context graph traversal improves sharply. |
| GraphWalks Parents 256K | 99.3 | 93.6 | Strong long-context state-tracking result. |
The system card also reports that Opus 4.8 ranks first on FrontierSWE by both mean@5 and best@5. That benchmark gives agents 20 hours per task, so it lines up with Anthropic's positioning: this release is strongest when the work is long, messy, and tool-heavy.
What Changed
For developers, the most important changes are not cosmetic. Opus 4.8 defaults to high effort on all surfaces, supports mid-conversation system messages, lowers the minimum prompt-cache length to 1,024 tokens, and keeps the Opus 4.7 constraints around sampling and thinking. If your request still sends temperature, top_p, or top_k with non-default values, the Messages API rejects it.
Manual extended thinking budgets are also not supported on Opus 4.8. Use adaptive thinking with thinking: {type: "adaptive"}, then control depth through output_config.effort. Anthropic recommends xhigh for coding and agentic work, with high as the default for most intelligence-sensitive tasks.
client.messages.create({
model: "claude-opus-4-8",
max_tokens: 64000,
thinking: { type: "adaptive" },
output_config: { effort: "xhigh" },
messages: [{ role: "user", content: "Audit this migration plan." }]
})Pricing
Regular Opus 4.8 API pricing is unchanged from Opus 4.7: $5 per million input tokens and $25 per million output tokens. Fast mode for Opus 4.8 is cheaper than fast mode was for prior Opus models: the launch post lists $10 per million input tokens and $50 per million output tokens.
The practical reading is simple. Standard mode is the default for long autonomous work. Fast mode is for latency-sensitive interaction, live debugging, and rapid iteration where paying more for output speed is worth it.
Safety Notes
The official system card is not pure launch marketing. It says Opus 4.8 improves over Opus 4.7 on most alignment measures and is much less likely to leave flaws in its own code unreported. The launch post summarizes that as roughly four times less likely than Opus 4.7 to let flaws in written code pass unremarked.
The same system card also flags a caveat: Opus 4.8 was somewhat less robust than Opus 4.7 in several agentic prompt-injection contexts before additional safeguards. Anthropic says deployed safeguards close that gap in practice. If you are building an agent that reads untrusted webpages, emails, tickets, or repository content, this caveat belongs in your threat model.
Who Should Use It
Use Opus 4.8 first for long coding sessions, repo-scale refactors, deep code review, professional document analysis, research with many sources, and agent loops that need a stable plan over many tool calls. Use Sonnet or Haiku when latency, cost, or high-volume throughput matters more than maximum reasoning quality.
The verdict is narrow but clear: Opus 4.8 is not a dramatic new product category. It is a better Opus for the places where Opus already mattered: hard engineering work, long context, and agents that must admit uncertainty before they break things.
Migration Notes
- Update the model string to
claude-opus-4-8. - Remove non-default sampling parameters if you still pass them.
- Use adaptive thinking, not manual
budget_tokens. - Set effort explicitly for production workloads.
- Remove old 1M-context beta headers on Claude API, Bedrock, and Vertex AI.
- Consider mid-conversation system messages for long-running harnesses.
- Re-baseline cost and latency because effort levels changed.
FAQ
What is Claude Opus 4.8?
Claude Opus 4.8 is Anthropic's newest generally available Opus model, launched on May 28, 2026. Anthropic positions it for complex reasoning, long-horizon agentic coding, and high-autonomy work.
What is the API model ID for Claude Opus 4.8?
The Claude API model ID is claude-opus-4-8. Anthropic also lists matching first-party IDs for Vertex AI and Claude Platform on AWS, plus a Bedrock-specific ID format.
How much does Claude Opus 4.8 cost?
Regular API usage is unchanged from Opus 4.7: $5 per million input tokens and $25 per million output tokens. Fast mode for Opus 4.8 is $10 per million input tokens and $50 per million output tokens.
What is the biggest developer change in Opus 4.8?
The biggest platform changes are high effort by default, 1M context by default on the Claude API, adaptive thinking, mid-conversation system messages, lower prompt-cache minimums, and better long-context recovery.
Is Claude Opus 4.8 better than Opus 4.7?
Anthropic calls it a modest but tangible improvement. The system card reports higher scores than Opus 4.7 across most capability evaluations, including SWE-bench Pro, Terminal-Bench 2.1, and GraphWalks.
Official Sources
- Anthropic: Introducing Claude Opus 4.8
- Anthropic: Claude Opus 4.8 System Card
- Claude Docs: What's new in Claude Opus 4.8
- Claude Docs: Models overview
- Official Claude launch thread on X
- Official ClaudeDevs developer thread on X
Related: the Claude Code dynamic workflows guide and the Opus 4.8 API migration guide.
