Claude Opus 4.5 - ByteSpike

Vendor: Anthropic Model ID: claude-opus-4-5 Capability: 200K context · tool use · vision · prompt caching · streaming · extended thinking Pricing: per-token, Opus tier (live rate) Opus 4.5 was the first model in the 4-series Opus family, delivering deep long-form reasoning at Opus quality across a 200K-token context. For new work, prefer Opus 4.8 — the current flagship with measurably tighter output and better tool-use precision. 4.5 is kept available for teams who’ve already validated against it.

Request

curl https://llm.bytespike.ai/v1/messages \
  -H "x-api-key: $BYTESPIKE_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-opus-4-5",
    "max_tokens": 16384,
    "messages": [
      {"role": "user", "content": "Review this 150-page deposition for inconsistencies in the dates."}
    ]
  }'

Body parameters

Field	Type	Required	Default	Notes
`model`	string	yes	—	`claude-opus-4-5`
`messages`	array	yes	—	Conversation history. Up to 200K tokens of input.
`max_tokens`	integer	yes	—	Hard cap. Max for this model: 32768.
`system`	string \| array	no	—	Array form supports `cache_control`.
`temperature`	number	no	1.0	Range 0.0–1.0.
`top_p`	number	no	1.0	Nucleus sampling.
`tools`	array	no	—	Supported.
`tool_choice`	object	no	`{"type":"auto"}`	`auto` / `any` / `tool` (named).
`thinking`	object	no	—	Extended-thinking budget. See Anthropic thinking docs.
`stream`	boolean	no	false	SSE streaming.

Response

{
  "id": "msg_opus_…",
  "type": "message",
  "role": "assistant",
  "model": "claude-opus-4-5",
  "content": [
    {"type": "thinking", "thinking": "<extended reasoning trace>"},
    {"type": "text", "text": "Three date inconsistencies on pages 42, 87, and 131..."}
  ],
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 168250,
    "output_tokens": 1872
  }
}

Code examples

curl https://llm.bytespike.ai/v1/messages \
  -H "x-api-key: $BYTESPIKE_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-opus-4-5",
    "max_tokens": 16384,
    "messages": [{"role": "user", "content": "Review this deposition for date inconsistencies."}]
  }'

Cache control

Cache control on Opus is the most cost-significant setting in the 4-series. With large 200K-token contexts, cache reads can mean a 10× cost reduction across repeated agent turns. Cache rate visible in the pricing table.

{
  "model": "claude-opus-4-5",
  "system": [
    {
      "type": "text",
      "text": "<the 100K-token corpus you keep referring to>",
      "cache_control": {"type": "ephemeral"}
    }
  ],
  "messages": [...]
}

Errors

Code	Trigger	Billed?
400	Body validation failed (`max_tokens` too high, etc.)	No
401	Missing / revoked key	No
402	Wallet exhausted (Opus calls trip this faster than Sonnet)	No
413	Input exceeds 200K tokens	No
429	Rate-limited	No
5xx	Upstream provider issue	No (auto-retry envelope)

When to use

Long-form reasoning within a 200K window (legal review, code-base audit, multi-document synthesis).
Multi-step prompts where Sonnet starts skipping steps.
For new work, prefer Opus 4.8 — current flagship, tighter output.
For mid-tier cost / latency, see Sonnet 4.6.

Limits

Limit	Value
Context window	200K tokens
Max output	32768 tokens
Supports tool use	Yes
Supports vision	Yes
Supports streaming	Yes
Supports prompt caching	Yes
Supports extended thinking	Yes

​Request

​Body parameters

​Response

​Code examples

​Cache control

​Errors

​When to use

​Limits