OpenAI’s Codex CLI speaks the Responses API natively. Point its base URL at ByteSpike and the CLI works directly against the GPT-5 / o-series catalog, plus every other model the gateway translates into Responses shape (Claude family, Gemini family).

Prerequisites

  • A ByteSpike account + a key. For GPT-5 / o-series, a key in any group that serves them (typically the OpenAI / default group). For reaching Claude or Gemini via Responses translation, the key’s group must include those models.
  • Codex CLI installed:
    npm install -g @openai/codex
    

Configure

Codex CLI reads OPENAI_API_KEY + OPENAI_BASE_URL. Set both:
export OPENAI_BASE_URL="https://llm.bytespike.ai/v1"
export OPENAI_API_KEY="sk-byts-..."
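A quick preflight check can catch a missing or mistyped variable before the CLI ever makes a request. The following is a minimal sketch, not part of Codex CLI or ByteSpike: check_codex_env is a hypothetical helper name, and it only compares the two variables against the values documented above.

```shell
# Preflight sketch: confirm both variables are set before launching the CLI.
# check_codex_env is a hypothetical helper name, not part of Codex CLI.
check_codex_env() {
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    echo "OPENAI_API_KEY is not set" >&2
    return 1
  fi
  case "${OPENAI_BASE_URL:-}" in
    https://llm.bytespike.ai/v1) ;;  # matches the value documented above
    "") echo "OPENAI_BASE_URL is not set" >&2; return 1 ;;
    *)  echo "OPENAI_BASE_URL is '${OPENAI_BASE_URL}', expected https://llm.bytespike.ai/v1" >&2
        return 1 ;;
  esac
  echo "ok"
}
```

Running check_codex_env in the same shell where you exported the variables prints ok on success, or a one-line reason on stderr with a non-zero exit status.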
If Codex CLI in your version uses a Responses-specific base URL override flag (some prerelease builds use --responses-base-url), pass https://llm.bytespike.ai/v1 explicitly:
codex --responses-base-url https://llm.bytespike.ai/v1 "..."

Verify

codex "list files in this directory"
Codex will run a small Responses-API call against the gateway. You should see normal tool-use output (file listing). Errors point you at config: 401 = wrong key, 402 = wallet empty, 400 model_not_allowed = the default model isn’t in your key’s group.
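If the CLI fails and you want to rule out the CLI itself, you can probe the gateway directly. The sketch below makes several assumptions: the Responses endpoint lives at /responses under the base URL, the gateway accepts a standard Bearer header, and the model id gpt-5-5 (used elsewhere in this guide) is available to your key. explain_status and probe_gateway are hypothetical helper names.

```shell
# Map the status codes listed above to their likely cause.
# explain_status is a hypothetical helper name.
explain_status() {
  case "$1" in
    200) echo "ok" ;;
    401) echo "wrong key" ;;
    402) echo "wallet empty" ;;
    400) echo "check for model_not_allowed: the model is not in your key's group" ;;
    *)   echo "unlisted status $1" ;;
  esac
}

# Send one minimal Responses request and report only the HTTP status.
# Assumptions: POST /responses under the base URL, Bearer auth, and the
# model id gpt-5-5 being served by your key's group.
probe_gateway() {
  status=$(curl -sS -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer ${OPENAI_API_KEY}" \
    -H "Content-Type: application/json" \
    -d '{"model":"gpt-5-5","input":"ping"}' \
    "${OPENAI_BASE_URL}/responses")
  echo "HTTP ${status}: $(explain_status "${status}")"
}
```

Call probe_gateway after exporting both variables. A 200 here combined with a failing codex run suggests the problem is in the CLI configuration rather than the key or wallet.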

Switching models

Most Codex versions accept --model:
codex --model gpt-5-5         "refactor this function"
codex --model gpt-5-4         "explain this commit"
codex --model claude-opus-4-8 "review my PR description"
All three work. The gpt-5-* ids take the OpenAI-native path; the Claude id triggers ByteSpike’s Responses→Messages translation under the hood, and the CLI never knows the difference.
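To compare how different models handle the same prompt, a small loop over model ids is enough. This is a sketch under stated assumptions: ask_each_model is a hypothetical helper, your Codex version accepts --model, and your key’s group serves every id you pass. If your Codex version opens an interactive session, each model runs only after you exit the previous one.

```shell
# Run one prompt against several model ids in turn.
# ask_each_model is a hypothetical helper; it assumes your Codex version
# accepts --model and that your key's group serves every id you pass.
ask_each_model() {
  prompt="$1"
  shift
  for model in "$@"; do
    echo "== ${model} =="
    codex --model "${model}" "${prompt}"
  done
}
```

For example, ask_each_model "explain this commit" gpt-5-4 claude-opus-4-8 runs the prompt once per id and prints a header line before each run.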

Common configs

Pick a key whose group includes the models you want to bounce between (typically the all-protocols group if your account has one). The CLI can then switch via --model between OpenAI-native models and translated Claude/Gemini models within a single session.
Pass through Codex’s reasoning controls:
codex --model gpt-5-5 --reasoning-effort high "..."
The gateway forwards the reasoning.effort field verbatim. Pricing reflects the higher reasoning-token output — see pricing.
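To make "forwarded verbatim" concrete, the sketch below prints roughly what a Responses request body with a reasoning effort looks like. It assumes the CLI maps the flag to the standard Responses reasoning.effort field; the exact body Codex sends will contain more fields. responses_body is a hypothetical helper name.

```shell
# Print a minimal Responses request body with the given reasoning effort.
# responses_body is a hypothetical helper; arguments are model, effort, prompt.
# The prompt is inserted as-is, so keep it free of quotes and backslashes.
responses_body() {
  printf '{"model":"%s","reasoning":{"effort":"%s"},"input":"%s"}\n' "$1" "$2" "$3"
}

responses_body gpt-5-5 high "summarize this diff"
# prints {"model":"gpt-5-5","reasoning":{"effort":"high"},"input":"summarize this diff"}
```

The reasoning object is the part the gateway passes through unchanged on the GPT path.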
Codex’s --response-format json_schema=... flag is forwarded as response_format to the gateway, which handles it per model: GPT supports JSON Schema mode directly, while the Claude path falls back to plain-text output that is then parsed.
Codex’s tool definitions (--tools=...) are forwarded as the Responses-API tools array. On the GPT path they’re native; on the Claude/Gemini path the gateway translates to the target model’s tool format (input_schema for Claude, functionDeclarations for Gemini).

Codex vs Claude Code — when to use which

| Strength | Claude Code | Codex CLI |
|---|---|---|
| Natural model fit | Claude family (Anthropic Messages native) | GPT-5 + o-series (OpenAI Responses native) |
| Tool-use schema | Cleaner: Anthropic uses raw JSON Schema | More noise: OpenAI wraps tool_calls in strings |
| Reasoning controls | thinking blocks on Opus / Sonnet 4.x | reasoning.effort: low/medium/high on GPT-5 |
| Cache control | cache_control blocks (Anthropic native) | n/a |
| JSON mode | Both work via response_format | Both work; OpenAI more mature |
Many teams keep both installed and pick per task. Use one ByteSpike account and generate one key per CLI, each bound to that CLI’s preferred group.

Troubleshooting

| Symptom | Cause | Fix |
|---|---|---|
| 401 invalid x-api-key | Wrong key | Recopy from console |
| 402 insufficient_balance | Wallet empty | Top up |
| 400 model_not_allowed | Model not in key’s group | Switch group or model |
| Tool calls returning garbage | Model path doesn’t support the tool schema (rare on translated Claude routes) | Switch to GPT-5 native or simplify the tool schema |
| Stall during streaming | Proxy buffering | Add llm.bytespike.ai to NO_PROXY |
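For the proxy-buffering fix, the host should be appended to NO_PROXY rather than overwriting whatever is already there. A minimal sketch, with add_no_proxy as a hypothetical helper name:

```shell
# Append a host to NO_PROXY unless it is already listed.
# add_no_proxy is a hypothetical helper name.
add_no_proxy() {
  host="$1"
  case ",${NO_PROXY:-}," in
    *",${host},"*) ;;                        # already present, nothing to do
    ",,") NO_PROXY="${host}" ;;              # list was empty
    *)    NO_PROXY="${NO_PROXY},${host}" ;;  # keep existing entries
  esac
  export NO_PROXY
}

add_no_proxy llm.bytespike.ai
```

The function is idempotent, so it is safe to put in a shell profile that gets sourced repeatedly.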

Next

  • Claude Code CLI: Anthropic’s coding CLI, same kind of setup.
  • /responses reference: the protocol Codex speaks.
  • Models: GPT-5 / o-series catalog.
  • Cursor IDE: editor-level integration.