Vendor: Anthropic
Model ID: claude-opus-4-5
Capability: 200K context · tool use · vision · prompt caching · streaming · extended thinking
Pricing: per-token, Opus tier (live rate)

Opus 4.5 was the first model in the 4-series Opus family, delivering deep long-form reasoning at Opus quality across a 200K-token context. For new work, prefer Opus 4.8, the current flagship, which has measurably tighter output and better tool-use precision. 4.5 is kept available for teams who’ve already validated against it.

Request

curl https://llm.bytespike.ai/v1/messages \
  -H "x-api-key: $BYTESPIKE_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-opus-4-5",
    "max_tokens": 16384,
    "messages": [
      {"role": "user", "content": "Review this 150-page deposition for inconsistencies in the dates."}
    ]
  }'

Body parameters

| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| model | string | yes | | claude-opus-4-5 |
| messages | array | yes | | Conversation history. Up to 200K tokens of input. |
| max_tokens | integer | yes | | Hard cap. Max for this model: 32768. |
| system | string or array | no | | Array form supports cache_control. |
| temperature | number | no | 1.0 | Range 0.0–1.0. |
| top_p | number | no | 1.0 | Nucleus sampling. |
| tools | array | no | | Supported. |
| tool_choice | object | no | {"type":"auto"} | auto / any / tool (named). |
| thinking | object | no | | Extended-thinking budget. See Anthropic thinking docs. |
| stream | boolean | no | false | SSE streaming. |
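
The constraints in the table above can be checked client-side before a request is sent, which avoids a round trip that would end in a 400. The following is a minimal Python sketch under that assumption; `validate_body` and `MAX_OUTPUT_TOKENS` are hypothetical names introduced here, and the gateway remains the authority on what it accepts.

```python
MAX_OUTPUT_TOKENS = 32768  # "Max for this model" from the parameter table


def validate_body(body: dict) -> list[str]:
    """Return a list of problems found in a request body (empty if none)."""
    problems = []

    # Required fields per the table: model, messages, max_tokens.
    for field in ("model", "messages", "max_tokens"):
        if field not in body:
            problems.append(f"missing required field: {field}")

    max_tokens = body.get("max_tokens")
    if max_tokens is not None:
        is_int = isinstance(max_tokens, int) and not isinstance(max_tokens, bool)
        if not is_int or max_tokens < 1:
            problems.append("max_tokens must be a positive integer")
        elif max_tokens > MAX_OUTPUT_TOKENS:
            problems.append(f"max_tokens exceeds {MAX_OUTPUT_TOKENS}")

    temperature = body.get("temperature")
    if temperature is not None and not 0.0 <= temperature <= 1.0:
        problems.append("temperature must be within 0.0-1.0")

    tool_choice = body.get("tool_choice")
    if tool_choice is not None and tool_choice.get("type") not in ("auto", "any", "tool"):
        problems.append("tool_choice.type must be auto, any, or tool")

    return problems
```

An empty list means the body passed these local checks only; it does not guarantee the request will be accepted.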

Response

{
  "id": "msg_opus_…",
  "type": "message",
  "role": "assistant",
  "model": "claude-opus-4-5",
  "content": [
    {"type": "thinking", "thinking": "<extended reasoning trace>"},
    {"type": "text", "text": "Three date inconsistencies on pages 42, 87, and 131..."}
  ],
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 168250,
    "output_tokens": 1872
  }
}
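
When extended thinking is enabled, the `content` array mixes `thinking` and `text` blocks, so callers usually want to separate them. A small Python sketch of that, assuming the response shape shown above; `split_content` and `total_tokens` are hypothetical helper names.

```python
def split_content(response: dict) -> tuple[str, str]:
    """Return (thinking, text), each joined from the matching content blocks."""
    thinking_parts = []
    text_parts = []
    for block in response.get("content", []):
        if block.get("type") == "thinking":
            thinking_parts.append(block.get("thinking", ""))
        elif block.get("type") == "text":
            text_parts.append(block.get("text", ""))
    return "\n".join(thinking_parts), "\n".join(text_parts)


def total_tokens(response: dict) -> int:
    """Sum input and output tokens from the usage object."""
    usage = response.get("usage", {})
    return usage.get("input_tokens", 0) + usage.get("output_tokens", 0)
```

For the sample response above, `total_tokens` returns 170122 (168250 input plus 1872 output).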

Code examples

curl https://llm.bytespike.ai/v1/messages \
  -H "x-api-key: $BYTESPIKE_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-opus-4-5",
    "max_tokens": 16384,
    "messages": [{"role": "user", "content": "Review this deposition for date inconsistencies."}]
  }'
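
The same request can be issued from Python with only the standard library. This is a sketch that mirrors the curl call above (same URL, headers, and body); `build_request` and `send` are hypothetical helper names, not part of any SDK.

```python
import json
import urllib.request

API_URL = "https://llm.bytespike.ai/v1/messages"


def build_request(prompt: str, api_key: str, max_tokens: int = 16384) -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the curl example."""
    body = {
        "model": "claude-opus-4-5",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response (performs network I/O)."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

With `os` imported, a call would look like `send(build_request("Review this deposition for date inconsistencies.", os.environ["BYTESPIKE_API_KEY"]))`. Non-2xx responses raise `urllib.error.HTTPError`, whose `code` attribute carries the HTTP status.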

Cache control

Of the 4-series models, Opus is where cache control has the largest effect on cost. With contexts approaching 200K tokens, cache reads can cut cost by up to 10× across repeated agent turns. The cache-read rate is listed in the pricing table.
{
  "model": "claude-opus-4-5",
  "system": [
    {
      "type": "text",
      "text": "<the 100K-token corpus you keep referring to>",
      "cache_control": {"type": "ephemeral"}
    }
  ],
  "messages": [...]
}
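
To see where a figure like 10× can come from, the arithmetic can be sketched in Python. Everything numeric here is an assumption for illustration: the base rate is made up, the read multiplier of 0.1 is chosen to match the 10× figure, and the write multiplier of 1.25 follows Anthropic's published short-lived cache pricing, which this gateway may or may not mirror. The sketch also assumes the cache entry never expires between turns. `input_cost` is a hypothetical helper; the pricing table is the authority on real rates.

```python
def input_cost(turns: int, cached_tokens: int, fresh_tokens: int,
               rate_per_mtok: float, use_cache: bool,
               write_mult: float = 1.25, read_mult: float = 0.1) -> float:
    """Estimate total input cost over `turns` requests sharing one prefix.

    rate_per_mtok is the base input price per million tokens (hypothetical).
    write_mult and read_mult are assumed multipliers, not quoted rates.
    """
    per_token = rate_per_mtok / 1_000_000
    if not use_cache:
        # Every turn pays the full base rate for the whole prompt.
        return turns * (cached_tokens + fresh_tokens) * per_token
    # First turn writes the prefix to the cache; later turns read it.
    write = cached_tokens * write_mult * per_token
    reads = (turns - 1) * cached_tokens * read_mult * per_token
    fresh = turns * fresh_tokens * per_token
    return write + reads + fresh
```

With a 100K-token cached prefix, 1K fresh tokens per turn, 10 turns, and a made-up base rate of 10.0 per million tokens, the uncached estimate is 10.1 and the cached estimate is 2.25. The per-turn saving on the cached prefix approaches 10× only as the number of turns grows and the one-time write cost is amortized.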

Errors

| Code | Trigger | Billed? |
|---|---|---|
| 400 | Body validation failed (max_tokens too high, etc.) | No |
| 401 | Missing / revoked key | No |
| 402 | Wallet exhausted (Opus calls trip this faster than Sonnet) | No |
| 413 | Input exceeds 200K tokens | No |
| 429 | Rate-limited | No |
| 5xx | Upstream provider issue | No (auto-retry envelope) |
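
A client can use the status codes above to decide whether a retry makes sense. The Python sketch below treats 429 and 5xx as retryable and everything else as a caller-side problem to fix first. That split is an assumption, as is the backoff policy; in particular, the page does not say how far the gateway's own auto-retry for 5xx extends, so a client-side retry may be redundant. `should_retry` and `backoff_delay` are hypothetical names.

```python
import random


def should_retry(status: int) -> bool:
    """429 clears with time and 5xx is upstream; 400/401/402/413 need a fix."""
    return status == 429 or 500 <= status <= 599


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff with full jitter, in seconds; attempt starts at 0."""
    return random.uniform(0, min(cap, base * 2 ** attempt))
```

A retry loop would sleep for `backoff_delay(attempt)` between attempts and give up after a fixed number of tries.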

When to use

  • Long-form reasoning within a 200K window (legal review, code-base audit, multi-document synthesis).
  • Multi-step prompts where Sonnet starts skipping steps.
  • For new work, prefer Opus 4.8 — current flagship, tighter output.
  • For mid-tier cost / latency, see Sonnet 4.6.

Limits

| Limit | Value |
|---|---|
| Context window | 200K tokens |
| Max output | 32768 tokens |
| Supports tool use | Yes |
| Supports vision | Yes |
| Supports streaming | Yes |
| Supports prompt caching | Yes |
| Supports extended thinking | Yes |