Skip to main content
Vendor: Anthropic Model ID: claude-opus-4-6 Capability: 200K context · tool use · vision · prompt caching · streaming · extended thinking Pricing: per-token, Opus tier (live rate) Opus 4.6 is the refinement step on 4.5 — tighter long-form prose and a measurable drop in hallucinations when summarising across many documents within its 200K context. It’s the right Opus for multi-document workflows that 4.5 was nearly good enough at: legal review, scientific literature synthesis, codebase audit reports. Opus 4.8 is the current flagship; 4.6 is kept available for teams who’ve validated against this specific version.

Request

curl https://llm.bytespike.ai/v1/messages \
  -H "x-api-key: $BYTESPIKE_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-opus-4-6",
    "max_tokens": 16384,
    "messages": [
      {"role": "user", "content": "Synthesize the methodology sections of these 14 papers."}
    ]
  }'

Body parameters

FieldTypeRequiredDefaultNotes
modelstringyesclaude-opus-4-6
messagesarrayyesConversation history. Up to 200K tokens of input.
max_tokensintegeryesHard cap. Max for this model: 32768.
systemstring | arraynoArray form supports cache_control.
temperaturenumberno1.0Range 0.0–1.0.
top_pnumberno1.0Nucleus sampling.
toolsarraynoSupported.
tool_choiceobjectno{"type":"auto"}auto / any / tool (named).
thinkingobjectnoExtended-thinking budget.
streambooleannofalseSSE streaming.

Response

{
  "id": "msg_opus_…",
  "type": "message",
  "role": "assistant",
  "model": "claude-opus-4-6",
  "content": [
    {"type": "text", "text": "Across the 14 papers, three methodological themes emerge..."}
  ],
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 187430,
    "output_tokens": 4218
  }
}

Code examples

curl https://llm.bytespike.ai/v1/messages \
  -H "x-api-key: $BYTESPIKE_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-opus-4-6",
    "max_tokens": 16384,
    "messages": [{"role": "user", "content": "Synthesize the methodology sections of these papers."}]
  }'

Cache control

{
  "model": "claude-opus-4-6",
  "system": [
    {
      "type": "text",
      "text": "<the corpus you keep referring to>",
      "cache_control": {"type": "ephemeral"}
    }
  ],
  "messages": [...]
}
Cache rate at the discounted tier visible in the pricing table.

Errors

CodeTriggerBilled?
400Body validation failedNo
401Missing / revoked keyNo
402Wallet exhaustedNo
413Input exceeds 200K tokensNo
429Rate-limitedNo
5xxUpstream provider issueNo (auto-retry envelope)

When to use

  • Multi-document synthesis where 4.5 was almost-but-not-quite reliable.
  • Long-form summarisation that needs tight, non-repetitive prose.
  • For the current flagship Opus, see Opus 4.8.
  • For the prior Opus, see Opus 4.5.
  • For mid-tier cost / latency, see Sonnet 4.6.

Limits

LimitValue
Context window200K tokens
Max output32768 tokens
Supports tool useYes
Supports visionYes
Supports streamingYes
Supports prompt cachingYes
Supports extended thinkingYes