claude-haiku-4-5 - ByteSpike

claude-haiku-4-5 is the small / fast / cheap Claude. It’s the right default for anything you’d run at scale where the prompt is short, the response is short, and you’d rather call the API ten thousand times an hour than three hundred. Vision and tool use are still included; what you give up is the bigger models’ reasoning depth on hard problems. Pricing:

1.00 / 1M input,

5.00 / 1M output, $0.10 / 1M cache read — see the rate card.

Protocols

Protocol	Path
Anthropic Messages	`POST https://llm.bytespike.ai/v1/messages`
OpenAI Chat Completions	`POST https://llm.bytespike.ai/v1/chat/completions`

Quickstart

curl https://llm.bytespike.ai/v1/messages \
  -H "x-api-key: $BYTESPIKE_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-haiku-4-5",
    "max_tokens": 256,
    "messages": [
      { "role": "user", "content": "Classify: refund / billing / technical / other.\n\nMy invoice charged twice." }
    ]
  }'

Capabilities

Capability	Supported
Chat completions	✅
Streaming (SSE)	✅
Vision (image input)	✅
Tool use (function calling)	✅ parallel
Prompt caching (cache_control)	✅
Extended thinking	—
Web search	—
JSON / structured output	✅
Context window	200K tokens

When to use

Classification, routing, triage. Tickets → categories. Emails → priority. Calls → next-best action.
Structured extraction at scale. PII redaction, entity extraction, schema-driven parsing on a firehose.
Cache-amortized agents. Big system prompt + many short user turns; cache_control on the system makes each turn ~10× cheaper after the first.
Vision OCR on cheap models. Haiku’s vision is good enough for receipts, invoices, screenshots — at a quarter of Sonnet’s price.

When not to use:

Hard reasoning — no extended thinking on Haiku; go to Sonnet or Opus.
Long-form writing — Haiku’s prose tier is below Sonnet.

claude-sonnet-4-6 — production mid-tier
claude-opus-4-8 — 200K-context flagship

claude-sonnet-4-6 gpt-5-2

​Protocols

​Quickstart

​Capabilities

​When to use

​Next

Protocols

Quickstart

Capabilities

When to use

Next