/images/generations, video uses /tasks/submit, etc.) but
the auth and billing surfaces are unified.
Family overview
| Family | Sync? | Key endpoints |
|---|---|---|
| Text | Yes (with optional streaming) | /v1/messages, /v1/chat/completions, /v1/responses, /v1beta/models/{model}:generateContent |
| Image (sync) | Yes (≤30s) | /v1/images/generations, /v1/images/edits |
| Image / Video (async) | No, async | POST /v1/tasks/submit → poll /v1/tasks/query (or SSE /v1/tasks/stream/{task_id}) |
| Utility | Yes (free) | /v1/models, /v1/usage, /v1/balance, /v1/tasks/{query,cancel} |
Why async for video
Video generation typically takes 30–180 seconds of GPU time. Holding an HTTP connection that long is fragile (proxies time out, retries multiply cost). The async pattern is:
- POST /v1/tasks/submit → returns task_id + estimated_credits + estimated_seconds immediately
- Poll POST /v1/tasks/query with {"task_id": "..."} (free) on a cadence matched to the ETA, or stream GET /v1/tasks/stream/{task_id} over SSE for push delivery, or register a callback_url on submit and skip polling entirely
- When status == "completed", the output array carries the result URLs
Task status follows one of three paths:
pending → running → completed (happy path),
pending → running → failed (unbilled), or
pending|running → cancelled (manual cancel; billing depends on the model’s refund policy).
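The submit-then-poll flow above can be sketched in Python roughly as follows. This is a minimal illustration, not an official client: the host in BASE_URL is a placeholder, Bearer authentication is an assumption, and the helper names (post_json, poll_interval, wait_for_task) and the "about ten polls per ETA" cadence are invented for the example. Only the endpoint paths, the task_id / estimated_seconds / status / output fields, and the status names come from the text above.

```python
import json
import time
import urllib.request
from typing import Callable

BASE_URL = "https://api.example.invalid"  # placeholder host, not from the docs
TERMINAL = {"completed", "failed", "cancelled"}  # statuses that end polling


def post_json(path: str, payload: dict, api_key: str) -> dict:
    """POST a JSON body and decode the JSON reply (Bearer auth is assumed)."""
    req = urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def poll_interval(estimated_seconds: float) -> float:
    """Aim for about ten polls over the ETA, clamped to 2-15 s (arbitrary)."""
    return max(2.0, min(15.0, estimated_seconds / 10.0))


def wait_for_task(
    task_id: str,
    estimated_seconds: float,
    query: Callable[[str], dict],
    sleep: Callable[[float], None] = time.sleep,
    max_polls: int = 200,
) -> dict:
    """Call query(task_id) until the task reaches a terminal status."""
    delay = poll_interval(estimated_seconds)
    for _ in range(max_polls):
        task = query(task_id)
        if task.get("status") in TERMINAL:
            return task
        sleep(delay)
    raise TimeoutError(f"task {task_id} not finished after {max_polls} polls")
```

In use, the reply from post_json("/v1/tasks/submit", ...) would supply task_id and estimated_seconds, and query would be something like lambda tid: post_json("/v1/tasks/query", {"task_id": tid}, api_key). Passing query and sleep in as arguments keeps the loop testable without network access. The caller still has to check whether the returned status is completed, failed, or cancelled before reading output.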
Mixing modalities in one app
Each API key is bound to one routing group on ByteSpike, which determines which models it can reach. For an app spanning multiple model families, create one key per group: see GET /api/v1/groups/available or the Create key dialog in the console. To attribute spend per pipeline, call GET /api/v1/usage filtered by api_key_id.
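One way to organise the one-key-per-group setup in application code is a small lookup table, sketched below. Everything specific here is hypothetical: the pipeline names, the key ids, the environment variable names, and the assumption that the api_key_id filter is passed as a query-string parameter. Only the /api/v1/usage path and the api_key_id filter name come from the text above.

```python
import os
from dataclasses import dataclass
from urllib.parse import urlencode


@dataclass(frozen=True)
class GroupKey:
    api_key_id: str  # id used to filter usage (values below are made up)
    env_var: str     # environment variable that holds the secret key


# Hypothetical layout: one key per routing group the app needs.
KEYS = {
    "text": GroupKey(api_key_id="key_text_01", env_var="BYTESPIKE_TEXT_KEY"),
    "video": GroupKey(api_key_id="key_video_01", env_var="BYTESPIKE_VIDEO_KEY"),
}


def secret_for(pipeline: str) -> str:
    """Return the secret key for a pipeline, read from the environment."""
    return os.environ[KEYS[pipeline].env_var]


def usage_url(base_url: str, pipeline: str) -> str:
    """Build the usage URL for one pipeline's key (query filter assumed)."""
    query = urlencode({"api_key_id": KEYS[pipeline].api_key_id})
    return f"{base_url}/api/v1/usage?{query}"
```

Keeping the key id next to the secret's location means each pipeline's requests and its spend report are resolved from the same entry, so the two cannot drift apart.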