Quickstart
Make your first request in under two minutes.
Authentication
How API keys, group bindings, and rate limits work.
API Reference
23 endpoints, one base URL, one auth header.
Pricing
Per-token / per-call rates, no markup tiers.
Why ByteSpike
- Anthropic-compatible by default — keep your
tool_use,cache_control, andthinkingblocks. Same SDK, same retry semantics, every model. - Multimodal under one key — text, image, video — no per-vendor billing surface to assemble.
- Failures don’t bill — every non-2xx is free. Estimated credits ship in the response header so you can preview cost before user confirmation.
- Per-key controls — every API key carries its own quota (USD), rate-limit buckets (5h / 1d / 7d), IP allowlist/denylist, and optional expiry. Org wallets roll up across keys.
What’s behind the gateway
Three protocol surfaces, the full multimodal catalog, and a handful of utility endpoints — all served fromllm.bytespike.ai:
| Family | Endpoints |
|---|---|
| Text | POST /v1/messages (Anthropic), POST /v1/chat/completions (OpenAI), POST /v1/responses (OpenAI Responses), POST /v1beta/models/{model}:generateContent (Gemini Native) |
| Image | Seedream v4 / v4.5 / v5lite, GPT-Image-2 (+ official + 4o-image), Nano-Banana / Pro / v2 |
| Video | Sora-2 / 2-Pro, Veo-3.1 / 3.1-Fast, Seedance 1.5-Pro / Pro / Pro-Fast / Seedance2 / 2-Fast |
| Utility | GET /v1/models (list catalog), GET /v1/usage (request usage), POST /v1/tasks/{submit,query,cancel} (async multimodal), GET /v1/balance (free) |
Base URL
baseURL to the value above.
OpenAI SDKs work the same way — see Authentication for
the per-protocol header layout.