ByteSpike returns errors in the envelope shape of whichever protocol you called: Anthropic-shape on /v1/messages; OpenAI-shape on /v1/chat/completions, /v1/responses, /v1/tasks/*, and /v1/images/*; Google-shape on /v1beta/.... Failures never bill. This is a hard contract: on a non-2xx response, X-Quota-Remaining-Credits doesn’t move, no entry appears in /api/v1/me/usage, and no row is added to /api/v1/me/billing/transactions.

Envelope shapes

{
  "type": "error",
  "error": {
    "type": "authentication_error",
    "message": "invalid x-api-key"
  }
}
error.type values: invalid_request_error, authentication_error, permission_error, not_found_error, request_too_large, rate_limit_error, api_error, overloaded_error.

Status code matrix

400 Bad Request

Bad input. Never retry — the request will fail again with the same body.
| OpenAI code | Anthropic type | Meaning |
| --- | --- | --- |
| invalid_param | invalid_request_error | Missing required field, wrong type, malformed JSON |
| model_not_found | not_found_error | Model id doesn’t exist in the catalog |
| context_length_exceeded | invalid_request_error | Prompt + max_tokens > model’s context window |
| duplicate_out_task_id | n/a | Re-submitting same out_task_id with different params (tasks API only) |

401 Unauthorized

Auth problem. Never retry — fix the credentials first.
| Code | Cause |
| --- | --- |
| invalid_api_key | Missing / typo’d / revoked / expired |
| authentication_error (Anthropic) | Same |

402 Payment Required

You ran out of money or hit a quota. Conditional retry — only after topping up or raising the cap.
| Code | Cause |
| --- | --- |
| insufficient_balance | Org wallet empty |
| quota_exceeded | Key’s lifetime quota cap reached |

403 Forbidden

Authorized but not allowed. Never retry — adjust the key, group, or IP.
| Code | Cause |
| --- | --- |
| permission_denied | IP not in ip_whitelist / matches ip_blacklist |
| model_not_in_group | Model isn’t reachable from this key’s group_id; pick a different key or a different model |

404 Not Found

Endpoint or resource doesn’t exist. Never retry.
| Code | Cause |
| --- | --- |
| not_found | Wrong route; verify the URL |
| task_not_found | /v1/tasks/query or /cancel with a bad task_id / out_task_id |

413 Request Entity Too Large

Body exceeded the route’s cap. Never retry with the same body.
| Cap | Hit by |
| --- | --- |
| 1 MiB on /v1/tasks/* | Inline base64 in params; upload to a URL instead |
| Larger on /v1/messages etc. | Mostly only by deeply nested tool schemas; shrink the schema |

429 Too Many Requests

Rate limit. Retry — use X-RateLimit-Reset to pick the backoff.
| OpenAI code | Cause |
| --- | --- |
| rate_limit_exceeded | One of the 5h / 1d / 7d spend buckets is empty |
| concurrency_limit | Too many in-flight requests on your account |
See Rate limits for the backoff strategy.

500 Internal Server Error

Gateway-side fault. Retry with exponential backoff (1s, 2s, 4s, 8s, give up at 4 attempts).
| Code | Cause |
| --- | --- |
| internal_error | Something blew up inside ByteSpike (should be rare; surfaces in our own logs) |
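
The backoff schedule above can be sketched as a small helper. This is a minimal illustration, not ByteSpike client code: `backoff_delays`, `call_with_backoff`, and the zero-argument `send` callable are hypothetical names, and the sketch reads "give up at 4 attempts" as at most four retries with waits of 1s, 2s, 4s, and 8s.

```python
import random
import time


def backoff_delays(max_retries=4, base=1.0, jitter=0.0):
    """Yield the wait before each retry: base, 2*base, 4*base, ..."""
    for attempt in range(max_retries):
        delay = base * (2 ** attempt)
        # Optional jitter spreads out clients that failed at the same moment.
        yield delay + random.uniform(0, jitter)


def call_with_backoff(send, retry_statuses=(500, 502, 503, 504), sleep=time.sleep):
    """Call send() and retry while it returns a gateway-side status.

    `send` is a hypothetical zero-argument callable returning (status, body).
    """
    status, body = send()
    for delay in backoff_delays():
        if status not in retry_statuses:
            break
        sleep(delay)
        status, body = send()
    return status, body
```

Passing `sleep` as a parameter keeps the helper testable without real waiting. After the last retry the final status is returned as-is, so the caller still sees the 5xx and can raise.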

502 / 503 / 504 — Service errors

The request couldn’t be completed server-side. Retry with backoff; ByteSpike already retried internally before surfacing.
| Status | Cause |
| --- | --- |
| 502 api_error | A malformed response was returned to the gateway |
| 503 api_error | No capacity in the key’s group can serve this model right now (all retried, all failed) |
| 504 timeout | The request timed out (>30s for text, >5min for image) |
For 503 specifically, click Test next to the model in Console → Models — it runs a dial-test that confirms whether the key + group + model combination is actually viable.

Retry decision matrix

| HTTP | Retry? | Backoff |
| --- | --- | --- |
| 400 / 401 / 403 / 404 / 413 | No | n/a |
| 402 | No (top up first) | n/a |
| 429 (rate limit) | Yes | Use X-RateLimit-Reset |
| 429 (concurrency) | Yes | Short jittered, 1-3s |
| 500 | Yes | Exponential, give up at 4 attempts |
| 502 / 503 / 504 | Yes | Exponential, give up at 4 attempts |
| Mid-stream event: error | No (request already settled, no charge) | n/a |
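
The matrix can be encoded as one lookup function. The following is a sketch under assumptions: `retry_plan` is a hypothetical helper, `reset_at` is taken to be the X-RateLimit-Reset value already parsed as a Unix timestamp in seconds, and the 60-second fallback when the header is missing is an arbitrary choice.

```python
import random

NEVER_RETRY = {400, 401, 403, 404, 413}
BACKOFF_RETRY = {500, 502, 503, 504}


def retry_plan(status, code="", reset_at=None, now=0.0):
    """Return (should_retry, wait_seconds) for one failed response.

    wait_seconds is None when the caller should apply its own
    exponential backoff, or when no retry is advised.
    """
    if status in NEVER_RETRY:
        return (False, None)
    if status == 402:
        # Retrying only helps after a top-up or a raised cap.
        return (False, None)
    if status == 429:
        if code == "concurrency_limit":
            # Short jittered wait, 1-3 seconds.
            return (True, random.uniform(1.0, 3.0))
        if reset_at is not None:
            return (True, max(1.0, reset_at - now))
        return (True, 60.0)  # assumed fallback when the header is absent
    if status in BACKOFF_RETRY:
        return (True, None)
    return (False, None)
```

For example, `retry_plan(429, "rate_limit_exceeded", reset_at=130.0, now=100.0)` returns `(True, 30.0)`. Mid-stream errors arrive after the HTTP status has been sent, so they never reach this function and are handled in the stream reader.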

Idempotency on retries

The text endpoints (/v1/messages, /v1/chat/completions, /v1/responses) are not idempotent: retrying after a 200 will run the model twice. Don’t retry unless you got an error. The tasks API is idempotent via out_task_id. Send the same out_task_id on retry and the dispatcher returns the existing task instead of starting a new one.
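
A minimal sketch of the tasks-API pattern: generate out_task_id once, then reuse it on every attempt. `submit_task` and the `send` callable are hypothetical, and placing out_task_id as a top-level field of the request body is an assumption, not something this page specifies. Backoff between attempts is left out to keep the example short.

```python
import uuid


def submit_task(send, params, max_attempts=4):
    """Submit a task, reusing one out_task_id across every attempt.

    `send` is a hypothetical callable that takes the request body
    and returns (status, body).
    """
    # Generated once, outside the loop, so every retry carries the same id.
    body = dict(params, out_task_id=str(uuid.uuid4()))
    status, result = None, None
    for _ in range(max_attempts):
        status, result = send(body)
        if status not in (500, 502, 503, 504):
            break
    return status, result
```

Keep the params identical across attempts as well: re-submitting the same out_task_id with different params is rejected with duplicate_out_task_id.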

Anthropic-specific: error event in SSE

When a stream fails mid-flight, you’ll see a terminal event: error instead of event: message_stop:
event: error
data: {"type":"error","error":{"type":"overloaded_error","message":"service overloaded"}}
The partial output already streamed is yours. No charge. Resend the request from scratch — there’s no resume semantic.
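
One way to detect the terminal error event on the client is sketched below. `read_text_stream` and `StreamError` are hypothetical names, and the parser assumes each event arrives as a single `event:` line followed by a single `data:` line of JSON, as in the sample above.

```python
import json


class StreamError(Exception):
    """Raised when the stream ends with `event: error`.

    args: (error type, error message, events received before the error)
    """


def read_text_stream(lines):
    """Collect (event, data) pairs from SSE lines; raise on an error event."""
    events = []
    event_name = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data = json.loads(line[len("data:"):].strip())
            if event_name == "error":
                err = data.get("error", {})
                # Hand the partial output back to the caller with the error.
                raise StreamError(err.get("type"), err.get("message"), events)
            events.append((event_name, data))
        elif not line:
            event_name = None  # a blank line ends an SSE event
    return events
```

The exception carries whatever was streamed before the failure, so the caller can keep the partial output and then resend the request from scratch.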

OpenAI-specific: error field in final frame

data: {"id":"chatcmpl-...","choices":[],"error":{"type":"api_error","message":"service error"}}

data: [DONE]
Same semantics — no charge, no resume.
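
The same check for the OpenAI-shape stream, again as a sketch: `read_chat_stream` is a hypothetical helper that assumes every frame is one `data:` line and that the error, when present, sits in a top-level `error` field as in the frame above.

```python
import json


def read_chat_stream(lines):
    """Return (chunks, error) from OpenAI-style SSE data lines.

    error is None when the stream finished cleanly.
    """
    chunks, error = [], None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        frame = json.loads(payload)
        if "error" in frame:
            error = frame["error"]
        else:
            chunks.append(frame)
    return chunks, error
```

A non-None `error` means the chunks collected so far are partial; treat the request as failed and resend it if you need the full output.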

Reading errors programmatically

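import time

import requests


class APIError(Exception):
    """Non-retryable failure; args are (status, code, message)."""


def call(url, payload, headers):
    r = requests.post(url, json=payload, headers=headers)
    if r.ok:
        return r.json()

    try:
        err = r.json().get("error", {})
    except ValueError:       # error body was not JSON
        err = {}

    # Branch on protocol shape. Google-style envelopes usually carry a
    # numeric "code", so only a string "code" is treated as OpenAI-shape.
    if isinstance(err.get("code"), str):   # OpenAI-shape
        code = err["code"]
    elif "type" in err:                    # Anthropic-shape or OpenAI "type" form
        code = err["type"]
    else:                                  # Google-shape
        code = err.get("status", "")

    if r.status_code == 429:
        # X-RateLimit-Reset is read here as a Unix timestamp in seconds
        sleep_until = int(r.headers.get("X-RateLimit-Reset", time.time() + 60))
        time.sleep(max(1, sleep_until - int(time.time())))
        # retry
    elif r.status_code in (500, 502, 503, 504):
        # exponential backoff + retry
        pass
    else:
        raise APIError(r.status_code, code, err.get("message"))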