claude-opus-4-6
Capability: 200K context · tool use · vision · prompt caching · streaming · extended thinking
Pricing: per-token, Opus tier (live rate)
Opus 4.6 is the refinement step on 4.5 — tighter
long-form prose and a measurable drop in hallucinations when summarising
across many documents within its 200K context. It’s the right Opus for
multi-document workflows that 4.5 was nearly good enough at: legal review,
scientific literature synthesis, codebase audit reports.
Opus 4.8 is the current flagship; 4.6 is kept
available for teams who’ve validated against this specific version.
Request
Body parameters
| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
model | string | yes | — | claude-opus-4-6 |
messages | array | yes | — | Conversation history. Up to 200K tokens of input. |
max_tokens | integer | yes | — | Hard cap. Max for this model: 32768. |
system | string | array | no | — | Array form supports cache_control. |
temperature | number | no | 1.0 | Range 0.0–1.0. |
top_p | number | no | 1.0 | Nucleus sampling. |
tools | array | no | — | Supported. |
tool_choice | object | no | {"type":"auto"} | auto / any / tool (named). |
thinking | object | no | — | Extended-thinking budget. |
stream | boolean | no | false | SSE streaming. |
Response
Code examples
Cache control
Errors
| Code | Trigger | Billed? |
|---|---|---|
| 400 | Body validation failed | No |
| 401 | Missing / revoked key | No |
| 402 | Wallet exhausted | No |
| 413 | Input exceeds 200K tokens | No |
| 429 | Rate-limited | No |
| 5xx | Upstream provider issue | No (auto-retry envelope) |
When to use
- Multi-document synthesis where 4.5 was almost-but-not-quite reliable.
- Long-form summarisation that needs tight, non-repetitive prose.
- For the current flagship Opus, see Opus 4.8.
- For the prior Opus, see Opus 4.5.
- For mid-tier cost / latency, see Sonnet 4.6.
Limits
| Limit | Value |
|---|---|
| Context window | 200K tokens |
| Max output | 32768 tokens |
| Supports tool use | Yes |
| Supports vision | Yes |
| Supports streaming | Yes |
| Supports prompt caching | Yes |
| Supports extended thinking | Yes |