The two env vars
Model id mapping
ByteSpike uses Anthropic’s own model ids verbatim, plus ids for non-Anthropic providers reachable via the Messages shape:| Anthropic id you were using | ByteSpike id (drop-in) |
|---|---|
claude-3-5-sonnet-20241022 | claude-sonnet-4-6 (current flagship) |
claude-3-7-sonnet-20250219 | claude-sonnet-4-6 |
claude-sonnet-4-20250514 | claude-sonnet-4-6 |
claude-opus-4-20250514 | claude-opus-4-8 |
claude-3-5-haiku-20241022 | claude-haiku-4-5 |
| Cross-vendor (Messages-API shape via translation) |
|---|
deepseek-v3-anthropic |
deepseek-v4-pro (translated) |
gemini-3-1-pro (translated — :translated suffix in protocols_aggregate) |
gpt-5-5 (translated — caveats on tool schema fidelity) |
What you gain
| Anthropic direct | ByteSpike |
|---|---|
| Only Claude family | Claude + cross-vendor models via Messages shape |
| Anthropic billing | One ByteSpike wallet covers everything |
| Tier rate limits | Per-key rate limits (5h / 1d / 7d, all configurable) |
| Failures occasionally bill | Failures never bill |
| Prompt caching native | Prompt caching preserved end-to-end |
What stays the same
- Anthropic SDK — Python, TypeScript, every official client
- Messages shape —
messages,system,tools,tool_choice,max_tokens,stream, identical - Tool use —
input_schemaJSON Schema format,tool_useblocks,tool_resultblocks — identical - Prompt caching —
cache_controlblocks onsystem/tools/ messages — preserved end-to-end - Extended thinking —
thinkingblocks on Opus / Sonnet 4.x — preserved - Streaming — SSE with Anthropic event names (
message_start/content_block_delta/ etc) — byte-for-byte compatible
Concrete examples
Messages
Tool use
claude-*, deepseek-v3-anthropic, and
the translated routes (gemini-3-1-pro, gpt-5-5 via Messages shape).
Prompt caching
Extended thinking (Opus / Sonnet 4.x)
Things to double-check
Model availability per key
Model availability per key
Pick a routing group on the key that includes the models you
want.
claude-default covers the Claude family. For
cross-vendor Messages-shape access (DeepSeek-Anthropic, Gemini
via translation), pick a group that includes those — usually
a multi-vendor group or the default group on org-tier
accounts.Token counts and prices
Token counts and prices
Anthropic’s tokenizer applies to Claude models — counts match
Anthropic direct. Translated routes (Gemini, GPT via Messages
shape) bill at the underlying model’s token rate but the count
method follows that model — there can be small differences.
Anthropic Workbench-only features
Anthropic Workbench-only features
ByteSpike doesn’t replicate Anthropic Workbench (Console UI for
prompts). If you rely on it for prompt development, develop
direct on Anthropic and deploy the prompt against ByteSpike.
`anthropic-beta` header
`anthropic-beta` header
Forwarded verbatim to the model. Beta features Anthropic
gates with this header work the same way through ByteSpike.
Message Batches API
Message Batches API
Anthropic’s
/v1/messages/batches is not currently exposed
on ByteSpike. Use the synchronous endpoint or our async
/v1/tasks/* flow for queueable work.Step-by-step
- Sign up at console.bytespike.ai — see Register.
- Top up $5+ — see Top up.
- Create a key in
claude-default(or a multi-vendor group). - Set env vars —
ANTHROPIC_BASE_URL+ANTHROPIC_API_KEY— system-wide or in.envrc. - Run your existing script unchanged to confirm Claude calls still work.
- Try cross-vendor ids (
deepseek-v3-anthropic,gemini-3-1-pro) one at a time. - Verify caching is preserved —
usage.cache_read_input_tokensshould populate just like Anthropic direct.
Reverse migration
Two env-var deletes, you’re back on Anthropic direct. Keep both configs in your secrets manager if you want A/B routing.Next
Migrate from OpenAI
Same idea, OpenAI side.
Claude Code CLI
CLI-specific config.
/messages reference
Full Messages-API protocol.
Endpoint types
How cross-protocol translation works under the hood.