If you’re already on OpenAI’s API, switching to ByteSpike means changing three things and shipping. The OpenAI SDK works unchanged; the request shape is identical; only the endpoint, key, and model id differ.

The three changes

- client = OpenAI(api_key="sk-...")
+ client = OpenAI(
+   base_url="https://llm.bytespike.ai/v1",
+   api_key="sk-byts-...",
+ )
That’s it. The rest of your code — tool definitions, structured outputs, streaming, retries — runs untouched.

Model id mapping

ByteSpike’s catalog uses model ids that mostly match OpenAI’s, plus ids for non-OpenAI providers:
| OpenAI id you were using | ByteSpike id (drop-in) |
| --- | --- |
| gpt-4o | gpt-5-4 |
| gpt-4o-mini | gpt-5-4-mini |
| gpt-5 (latest) | gpt-5-5 |
| gpt-5-nano | gpt-5-4-nano |
| gpt-5-mini | gpt-5-4-mini |
| o1-preview / o1 | gpt-5-4-mini (closest reasoning capability); pin to a specific id for repeatability |
| gpt-image-1 | gpt-image-2 |
The full live catalog is at GET /v1/models or bytespike.ai/pricing.
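
If model ids are scattered across a codebase, a small lookup can apply the mapping in one place. The sketch below copies the table above into a dictionary; the names `OPENAI_TO_BYTESPIKE` and `to_bytespike_id` are hypothetical helpers, not part of any SDK, and the key `gpt-5` assumes that is the id you used for "gpt-5 (latest)".

```python
# Mapping copied from the table above: OpenAI id -> ByteSpike drop-in id.
# The dictionary and helper names are illustrative, not part of any SDK.
OPENAI_TO_BYTESPIKE = {
    "gpt-4o": "gpt-5-4",
    "gpt-4o-mini": "gpt-5-4-mini",
    "gpt-5": "gpt-5-5",
    "gpt-5-nano": "gpt-5-4-nano",
    "gpt-5-mini": "gpt-5-4-mini",
    "o1-preview": "gpt-5-4-mini",
    "o1": "gpt-5-4-mini",
    "gpt-image-1": "gpt-image-2",
}


def to_bytespike_id(model: str) -> str:
    """Return the ByteSpike id for an OpenAI id; unknown ids pass through unchanged."""
    return OPENAI_TO_BYTESPIKE.get(model, model)


print(to_bytespike_id("gpt-4o"))             # prints "gpt-5-4"
print(to_bytespike_id("claude-sonnet-4-6"))  # prints "claude-sonnet-4-6"
```

Passing unknown ids through unchanged means ids that are already ByteSpike ids keep working; check the result against the live catalog before relying on it.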

What you gain

| OpenAI direct | ByteSpike |
| --- | --- |
| Only OpenAI models | Every frontier model under one key (Claude, Gemini, DeepSeek, …) |
| Per-vendor billing | One wallet, one invoice |
| Per-vendor rate limits | One rate-limit envelope, set per-key |
| Failures bill | Failures don’t bill |
| Stripe + invoice manual | Console + Stripe handled |

What stays the same

  • OpenAI SDK: Python, TypeScript, Go, every official client. Just change base_url.
  • Request body shape: messages, tools, tool_choice, response_format, stream, etc. are identical.
  • Response shape: choices[].message.content, tool_calls, usage are identical.
  • Streaming protocol: SSE with data: {json} + data: [DONE], byte-for-byte compatible.
  • Tool use: tools array with function objects + tool_choice, identical.
  • Structured output: response_format: {type: "json_schema", ...} is identical (on GPT-native paths; translated to the model’s native shape on Claude/Gemini routes).
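
To make the streaming bullet concrete, here is a minimal sketch of how a consumer might read the `data: {json}` lines up to the `data: [DONE]` sentinel. It assumes the chunk shape used by OpenAI chat-completions streaming (`choices[].delta.content`); `collect_stream_text` is a hypothetical helper, and the sample lines are hand-written, not captured from ByteSpike. In practice the SDK does this parsing for you when `stream=True`.

```python
import json
from typing import Iterable


def collect_stream_text(lines: Iterable[str]) -> str:
    """Join the content deltas found in chat-completions SSE lines."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            text = delta.get("content")
            if text:
                parts.append(text)
    return "".join(parts)


# Hand-written sample stream, for illustration only.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "po"}}]}',
    'data: {"choices": [{"delta": {"content": "ng"}}]}',
    "data: [DONE]",
]
print(collect_stream_text(sample))  # prints "pong"
```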

Concrete examples

Chat completion

# Was:
from openai import OpenAI
client = OpenAI()
r = client.chat.completions.create(
    model="gpt-5-4",
    messages=[{"role": "user", "content": "ping"}],
)

# Now:
from openai import OpenAI
client = OpenAI(
    base_url="https://llm.bytespike.ai/v1",
    api_key="sk-byts-...",
)
r = client.chat.completions.create(
    model="gpt-5-4",  # same id; or swap to gpt-5-5 / claude-sonnet-4-6 / gemini-3-1-pro
    messages=[{"role": "user", "content": "ping"}],
)

Tool use

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"]
        }
    }
}]

# Same call shape — works against gpt-5, claude, gemini, etc.
r = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": "weather in Tokyo?"}],
    tools=tools,
)
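
The call above only asks the model for a tool call; your code still has to execute it and send the result back. The sketch below shows one way to do that, assuming the standard chat-completions response shape (`message.tool_calls[].function.name` and `.arguments` as a JSON string). `run_tool_calls`, `HANDLERS`, and the canned `get_weather` implementation are hypothetical, and a `SimpleNamespace` stands in for `r.choices[0].message` so the example runs offline.

```python
import json
from types import SimpleNamespace


def get_weather(city: str) -> str:
    """Hypothetical local implementation of the tool; returns canned text."""
    return f"Sunny in {city}"


# Map tool names (as declared in `tools`) to local callables.
HANDLERS = {"get_weather": get_weather}


def run_tool_calls(message) -> list:
    """Execute each tool call on an assistant message and build 'tool' role replies."""
    replies = []
    for call in message.tool_calls or []:
        args = json.loads(call.function.arguments)  # arguments arrive as a JSON string
        result = HANDLERS[call.function.name](**args)
        replies.append({"role": "tool", "tool_call_id": call.id, "content": result})
    return replies


# Stand-in for r.choices[0].message so the sketch runs without a network call.
fake_message = SimpleNamespace(tool_calls=[SimpleNamespace(
    id="call_1",
    function=SimpleNamespace(name="get_weather", arguments='{"city": "Tokyo"}'),
)])
print(run_tool_calls(fake_message))
```

With a real response, append the assistant message and these replies to `messages` and call `client.chat.completions.create` again to get the final answer.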

Image generation

# OpenAI's images.generate works against ByteSpike's catalog:
r = client.images.generate(
    model="seedream-4-5",      # or "nano-banana", "gpt-image-2", ...
    prompt="A timelapse of Tokyo at sunset",
    size="1024x1024",
    n=1,
)
print(r.data[0].url)

Responses API (o-series + GPT-5 + agents)

r = client.responses.create(
    model="gpt-5-5",
    input="Summarize the doc at https://example.com/doc",
    reasoning={"effort": "medium"},
)
Works against every model in the catalog — ByteSpike translates Responses → each model’s native shape under the hood for non-GPT ids.

Things to double-check

Each ByteSpike key is bound to one routing group. Pick the group that includes the models you’ll call. If you need GPT-5 + Claude in the same client, either create two keys (one per group) or pick a multi-group key tier. See Models.
Token-count math is identical (OpenAI’s tokenizer for OpenAI models). Prices may differ — usually lower at ByteSpike for non-OpenAI providers, identical for OpenAI passthroughs. Live rates: bytespike.ai/pricing.
OpenAI’s organization and project headers are accepted but ignored — ByteSpike has its own org model. To attribute spend per project, create one key per project with its own quota cap.
ByteSpike doesn’t expose OpenAI’s stateful APIs (/files, /assistants, /vector_stores). If you depend on them, run them direct against OpenAI; chat completions can still be on ByteSpike. The two clients coexist fine.
The same applies to fine-tuning: ByteSpike has no fine-tuning surface. Fine-tune and deploy direct against OpenAI; do chat through ByteSpike.

Step-by-step

  1. Sign up at console.bytespike.ai — see Register.
  2. Top up $5+ for a real test — see Top up.
  3. Create a key in the group matching your model needs (default / claude-default / etc).
  4. Change base_url + api_key in one file in your codebase.
  5. Test against gpt-5-4 first (drop-in) to confirm wiring before changing models.
  6. Swap in non-OpenAI ids (claude-sonnet-4-6, gemini-3-1-pro, deepseek-v4-pro) one at a time.
  7. Compare token counts + latency on a representative sample. ByteSpike usually wins on the non-OpenAI ids; OpenAI passthrough is at parity.
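
For step 7, a small harness is usually enough. The sketch below times any callable that returns a chat-completions style response and sums `usage.total_tokens`; `benchmark` and `fake_call` are hypothetical names, and the fake response exists only so the example runs offline.

```python
import time
from statistics import median
from types import SimpleNamespace


def benchmark(call, prompts):
    """Run call(prompt) for each prompt; return median latency and total tokens."""
    latencies = []
    total_tokens = 0
    for prompt in prompts:
        start = time.perf_counter()
        response = call(prompt)
        latencies.append(time.perf_counter() - start)
        total_tokens += response.usage.total_tokens
    return {"median_latency_s": median(latencies), "total_tokens": total_tokens}


def fake_call(prompt):
    """Offline stand-in: pretends each whitespace-separated word costs one token."""
    return SimpleNamespace(usage=SimpleNamespace(total_tokens=len(prompt.split())))


stats = benchmark(fake_call, ["ping", "weather in Tokyo?"])
print(stats["total_tokens"])  # prints 4
```

To measure a real route, pass something like `lambda p: client.chat.completions.create(model="gpt-5-4", messages=[{"role": "user", "content": p}])` as `call`, and run the same prompts against each client you want to compare.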

Reverse migration

If you want to move back to OpenAI direct, the same three changes flip the other way. Keep both clients in your codebase if you want A/B routing — the SDK constructor is the only point of divergence.
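
One way to keep both routes available is to isolate the constructor arguments in a single function, as sketched here. `client_kwargs`, `pick_provider`, and the `BYTESPIKE_API_KEY` variable name are assumptions made for illustration; for the OpenAI route the sketch returns no arguments and relies on the SDK reading `OPENAI_API_KEY` from the environment.

```python
import os
import random


def client_kwargs(provider: str, env=os.environ) -> dict:
    """Constructor arguments for OpenAI(...) for the chosen provider."""
    if provider == "bytespike":
        return {
            "base_url": "https://llm.bytespike.ai/v1",
            "api_key": env["BYTESPIKE_API_KEY"],  # hypothetical variable name
        }
    if provider == "openai":
        return {}  # the SDK falls back to OPENAI_API_KEY on its own
    raise ValueError(f"unknown provider: {provider}")


def pick_provider(bytespike_share: float, rng=random.random) -> str:
    """Send roughly `bytespike_share` of traffic to ByteSpike, the rest to OpenAI."""
    return "bytespike" if rng() < bytespike_share else "openai"


# Example with an explicit environment mapping so nothing real is required.
print(client_kwargs("bytespike", env={"BYTESPIKE_API_KEY": "sk-byts-..."})["base_url"])
```

A client is then built with `OpenAI(**client_kwargs(pick_provider(0.1)))`; remember that model ids may also need to change with the provider.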

Next

Migrate from Anthropic

Same idea, Anthropic side.

Configure your client

Full per-SDK setup details.

/chat/completions reference

Request / response / streaming protocol.

Models

Full catalog you can call after migrating.