claude-opus-4-5
Capability: 200K context · tool use · vision · prompt caching · streaming · extended thinking
Pricing: per-token, Opus tier (live rate)
Opus 4.5 was the first model in the 4-series Opus family, delivering
deep long-form reasoning at Opus quality across a 200K-token context.
For new work, prefer Opus 4.8 — the current
flagship with measurably tighter output and better tool-use precision.
4.5 is kept available for teams who’ve already validated against it.
Request
Body parameters
| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
model | string | yes | — | claude-opus-4-5 |
messages | array | yes | — | Conversation history. Up to 200K tokens of input. |
max_tokens | integer | yes | — | Hard cap. Max for this model: 32768. |
system | string | array | no | — | Array form supports cache_control. |
temperature | number | no | 1.0 | Range 0.0–1.0. |
top_p | number | no | 1.0 | Nucleus sampling. |
tools | array | no | — | Supported. |
tool_choice | object | no | {"type":"auto"} | auto / any / tool (named). |
thinking | object | no | — | Extended-thinking budget. See Anthropic thinking docs. |
stream | boolean | no | false | SSE streaming. |
Response
Code examples
Cache control
Cache control on Opus is the most cost-significant setting in the 4-series. With large 200K-token contexts, cache reads can mean a 10× cost reduction across repeated agent turns. Cache rate visible in the pricing table.Errors
| Code | Trigger | Billed? |
|---|---|---|
| 400 | Body validation failed (max_tokens too high, etc.) | No |
| 401 | Missing / revoked key | No |
| 402 | Wallet exhausted (Opus calls trip this faster than Sonnet) | No |
| 413 | Input exceeds 200K tokens | No |
| 429 | Rate-limited | No |
| 5xx | Upstream provider issue | No (auto-retry envelope) |
When to use
- Long-form reasoning within a 200K window (legal review, code-base audit, multi-document synthesis).
- Multi-step prompts where Sonnet starts skipping steps.
- For new work, prefer Opus 4.8 — current flagship, tighter output.
- For mid-tier cost / latency, see Sonnet 4.6.
Limits
| Limit | Value |
|---|---|
| Context window | 200K tokens |
| Max output | 32768 tokens |
| Supports tool use | Yes |
| Supports vision | Yes |
| Supports streaming | Yes |
| Supports prompt caching | Yes |
| Supports extended thinking | Yes |