Anthropic released Claude Opus 4.7 on April 16, 2026. Same price as 4.6. But three times more production bugs resolved on the benchmark that matters most to developers.

Here is everything that changed — and the one real catch.

The 3x Claim: What It Actually Means

The “3x” number comes from SWE-bench Verified, specifically the Rakuten production bug subset.

SWE-bench is the standard benchmark for AI coding. It gives a model real GitHub issues from real open-source projects. The model has to produce a working fix — not a suggestion, not a comment. A fix.

The Rakuten subset focuses on production-level bugs: the kind you actually encounter in a running system. Opus 4.7 resolved three times as many of them as Opus 4.6.

That is a meaningful number because it maps to real work. Production bugs are messier than toy problems: they span multiple files, hinge on subtle state, and come with unclear error messages.

Vision: 54.5% → 98.5%

This is the most dramatic upgrade in the release.

                   Opus 4.6   Opus 4.7
Input resolution   ~860px     2,576px
Visual acuity      54.5%      98.5%

At 860 pixels, fine detail gets compressed. Small axis labels on charts, dense text in UI screenshots, architecture diagram annotations — Opus 4.6 would guess or ask for clarification.

At 2,576 pixels (3x the capacity), the model reads the actual values.

What this unlocks in practice:

  • Sending an error screenshot and getting the stack trace read accurately
  • Sharing a Figma export and getting a component breakdown without cropping
  • Uploading a dense Grafana dashboard and getting specific metrics read back
  • Sharing an architecture diagram and getting the flow analyzed correctly

If you have been manually cropping or re-sending screenshots to work around Claude’s vision limits, that workaround is mostly gone now.
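If you are sending screenshots through the API rather than the chat UI, the higher-resolution input arrives through the same image content block as before. A minimal sketch of building that block, assuming the standard Anthropic Messages API base64 image format (the helper name `image_block` is mine, not part of any SDK):

```python
import base64

def image_block(path: str, media_type: str = "image/png") -> dict:
    """Build a base64 image content block for an Anthropic Messages API request."""
    with open(path, "rb") as f:
        data = base64.standard_b64encode(f.read()).decode("utf-8")
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": media_type,
            "data": data,
        },
    }
```

The block goes into a message's `content` list alongside a text block; with the new resolution ceiling there is no need to downscale or crop a dashboard screenshot before encoding it.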

Coding: +13% and /ultrareview

On Anthropic’s complex coding assessment suite — 93 tasks covering multi-file refactors, API integrations, and architecture decisions — Opus 4.7 scores 13% higher than 4.6.

The bigger addition for daily use is a new command in Claude Code:

> /ultrareview src/auth/middleware.ts

/ultrareview is a dedicated code review session. It is different from asking Claude to “review this file.” It performs a structured pass covering:

  • Security vulnerabilities (injection risks, missing auth checks, exposed secrets)
  • Performance bottlenecks (N+1 queries, unnecessary allocations, blocking I/O)
  • Architectural anti-patterns (tight coupling, missing abstractions, wrong layer responsibilities)
  • Edge case handling (null paths, empty inputs, concurrent access)

It reads like a senior engineer review rather than a linter. Useful before a PR or before touching a file you haven’t written yourself.

Pricing: Unchanged

Input:  $5.00 per 1M tokens
Output: $25.00 per 1M tokens

Identical to Opus 4.6. Prompt caching and batch API pricing (50% off for async) are also unchanged.
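At those rates, per-request cost is simple arithmetic. A quick sketch (the function name is mine; the rates are the listed list prices, before caching or batch discounts):

```python
INPUT_PER_MTOK = 5.00    # USD per 1M input tokens
OUTPUT_PER_MTOK = 25.00  # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one request at the listed Opus rates."""
    return (input_tokens / 1_000_000) * INPUT_PER_MTOK \
         + (output_tokens / 1_000_000) * OUTPUT_PER_MTOK

# e.g. a 20k-token prompt with a 2k-token reply:
# request_cost(20_000, 2_000) -> 0.10 + 0.05 = 0.15
```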

The Catch: New Tokenizer

Anthropic updated the tokenizer in Opus 4.7. The new tokenizer handles code and technical content differently — and it produces up to 35% more tokens for the same input text.

Price per token: unchanged.
Token count for the same prompt: higher.

Input type           Expected token increase
Plain English text   0–5%
Code files           15–35%
Markdown docs        10–25%
Structured JSON      15–30%

The practical impact: if your workload is mostly code-heavy prompts — sending full files, documentation, structured data — your bill may go up noticeably even though the per-token rate is identical.

How to check before switching:

  1. Go to the Anthropic console → Usage
  2. Find your average token count per request over the last 7 days
  3. Run a sample of your typical prompts through the Anthropic tokenizer to compare 4.6 vs 4.7 counts
  4. Multiply the difference by your monthly volume

Do this before switching production workloads. The upgrade is worth it for most use cases — but you want to know the actual cost delta first.

Opus 4.6 vs Opus 4.7: Side by Side

Feature                     Opus 4.6   Opus 4.7
SWE-bench production bugs   Baseline   3× higher
Visual acuity               54.5%      98.5%
Vision resolution           ~860px     2,576px
Coding benchmark            Baseline   +13%
Input price                 $5 / 1M    $5 / 1M
Output price                $25 / 1M   $25 / 1M
Tokenizer (code)            Baseline   +15–35% usage

Should You Upgrade?

Yes, upgrade if:

  • You analyze screenshots, UI designs, charts, or architecture diagrams
  • You care about production bug resolution quality more than marginal cost
  • You use Claude Code and want /ultrareview in your workflow
  • You work with multi-file refactors where the +13% coding improvement compounds

Test first if:

  • Your prompts are primarily large code files or technical documentation
  • You run high volumes through the API and cost-per-task is tightly tracked
  • You are on a fixed budget for AI tools

Stay on 4.6 if:

  • Cost-per-task is your primary constraint and your current workload is code-heavy
  • You have not yet measured the tokenizer impact on your specific inputs

The vision upgrade alone justifies switching for anyone doing image analysis. For pure coding workloads, measure the tokenizer impact first.


Source: Anthropic, April 16 2026 — anthropic.com/news/claude-opus-4-7