claude-sonnet-4-6 - ByteSpike

claude-sonnet-4-6 is the production workhorse of the Claude family — the model most ByteSpike customers default to when they need a model that does everything at a mid-tier price. Vision, tools, prompt caching, extended thinking, and web search are all available at the same 200K context as Opus; Opus reserves the top of the quality envelope for hard reasoning. Pricing:

3.00 / 1M input,

15.00 / 1M output, $0.30 / 1M cache read — see the rate card.

Protocols

Protocol	Path
Anthropic Messages	`POST https://llm.bytespike.ai/v1/messages`
OpenAI Chat Completions	`POST https://llm.bytespike.ai/v1/chat/completions`

The two protocols produce equivalent responses for the same input; pick whichever your client already speaks.

Quickstart

curl https://llm.bytespike.ai/v1/messages \
  -H "x-api-key: $BYTESPIKE_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "messages": [
      { "role": "user", "content": "Hello, ByteSpike." }
    ]
  }'

Capabilities

Capability	Supported
Chat completions	✅
Streaming (SSE)	✅
Vision (image input)	✅
Tool use (function calling)	✅ parallel
Prompt caching (cache_control)	✅
Extended thinking	✅
Web search (web_search tool)	✅
JSON / structured output	✅
Context window	200K tokens

When to use

Default mid-tier. When you need a model that does everything — vision + tools + thinking + web search — at Sonnet pricing.
Agents. Use the Anthropic Messages endpoint; the tool_use and thinking blocks pass through transparently.
RAG. cache_control on a long system prompt amortizes the cost across many user turns. The first request pays the cache-write tier; subsequent reads from the same prefix pay $0.30/1M (10% of input).
Fresh-fact tasks. Add web_search to the tools array.

When not to use:

Hardest reasoning — go to claude-opus-4-8, the Claude flagship.
Cheapest-possible classification or routing — use claude-haiku-4-5 at one-quarter the price.

claude-opus-4-8 — flagship, 200K context
claude-haiku-4-5 — small, fast, cheap

claude-opus-4-7 claude-haiku-4-5

​Protocols

​Quickstart

​Capabilities

​When to use

​Next

Protocols

Quickstart

Capabilities

When to use

Next