gpt-5-4 is the GPT-5 production workhorse: mid-tier on the price curve, with full feature coverage. Reach for it when you want the GPT-5 family’s reasoning capability and tighter structured-output validation at a mid-tier rate. Pricing: $2.50 / 1M input, $15.00 / 1M output, $0.25 / 1M cache read (see the rate card).
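As a rough illustration of how the per-million-token rates translate into a per-request cost, here is a minimal Python sketch. The function name `estimate_cost_usd` is hypothetical, and the billing rule it encodes (cached input tokens are charged at the cache-read rate instead of the input rate) is an assumption; the rate card is the authority.

```python
# Rates in USD per 1M tokens, copied from the pricing line above.
INPUT_RATE = 2.50
OUTPUT_RATE = 15.00
CACHE_READ_RATE = 0.25


def estimate_cost_usd(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the cost of one request in USD.

    Assumption: cached_tokens is the portion of input_tokens served from
    cache, billed at the cache-read rate instead of the input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * INPUT_RATE
        + cached_tokens * CACHE_READ_RATE
        + output_tokens * OUTPUT_RATE
    )
    return total / 1_000_000
```

For example, under these assumptions a request with 10,000 input tokens (4,000 of them cached) and 1,000 output tokens comes to about $0.031.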

Protocols

Protocol | Path
OpenAI Chat Completions | POST https://llm.bytespike.ai/v1/chat/completions
OpenAI Responses | POST https://llm.bytespike.ai/v1/responses
The Responses API is the newer agent-style protocol — Codex requires it. Chat Completions is the ubiquitous default; most code will use this one.
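To make the difference between the two protocols concrete, the sketch below builds a minimal request body for each. The helper names are hypothetical. The `input` field is an assumption based on the upstream OpenAI Responses API; this page does not show a Responses request, so check the ByteSpike reference before relying on it.

```python
BASE_URL = "https://llm.bytespike.ai/v1"

# Endpoint paths as listed in the Protocols table.
ENDPOINTS = {
    "chat_completions": f"{BASE_URL}/chat/completions",
    "responses": f"{BASE_URL}/responses",
}


def chat_completions_body(prompt: str, model: str = "gpt-5-4") -> dict:
    """Chat Completions takes a list of role-tagged messages."""
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}


def responses_body(prompt: str, model: str = "gpt-5-4") -> dict:
    """Responses takes a single `input` field.

    Assumption: the body mirrors the upstream OpenAI Responses API.
    """
    return {"model": model, "input": prompt}
```

Either dictionary would be JSON-encoded and sent as a POST to the matching entry in `ENDPOINTS`, with the same `Authorization` header as the quickstart.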

Quickstart

curl https://llm.bytespike.ai/v1/chat/completions \
  -H "Authorization: Bearer $BYTESPIKE_API_KEY" \
  -H "content-type: application/json" \
  -d '{
    "model": "gpt-5-4",
    "reasoning_effort": "medium",
    "messages": [
      { "role": "user", "content": "Hello, ByteSpike." }
    ]
  }'
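
For readers who prefer Python over curl, the same request can be sketched with the standard library alone. The helper names are hypothetical, and `extract_text` assumes the reply follows the usual OpenAI Chat Completions shape (`choices[0].message.content`), which this page does not show explicitly.

```python
import json
import os
import urllib.request

URL = "https://llm.bytespike.ai/v1/chat/completions"


def build_request(prompt: str, reasoning_effort: str = "medium") -> urllib.request.Request:
    """Build (but do not send) the same POST as the curl quickstart."""
    body = {
        "model": "gpt-5-4",
        "reasoning_effort": reasoning_effort,
        "messages": [{"role": "user", "content": prompt}],
    }
    headers = {
        # Same environment variable as the curl example.
        "Authorization": f"Bearer {os.environ.get('BYTESPIKE_API_KEY', '')}",
        "content-type": "application/json",
    }
    return urllib.request.Request(
        URL, data=json.dumps(body).encode("utf-8"), headers=headers, method="POST"
    )


def extract_text(response: dict) -> str:
    """Assumption: the reply uses the OpenAI Chat Completions shape."""
    return response["choices"][0]["message"]["content"]


def ask(prompt: str) -> str:
    """Send the request and return the reply text (needs network and a valid key)."""
    with urllib.request.urlopen(build_request(prompt), timeout=60) as resp:
        return extract_text(json.load(resp))
```

Calling `ask("Hello, ByteSpike.")` would perform the network request; `build_request` on its own is handy for inspecting the payload offline.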

Capabilities

Capability | Supported
Chat Completions
Responses API
Streaming (SSE)
Vision (image input)
Tool use (function calling) | ✅ parallel
JSON mode
Structured outputs (json_schema)
Reasoning effort (low/medium/high)
Web search
Context window | 128K tokens

When to use

  • Production workhorse — when you need the GPT-5 quality envelope with vision + tools + reasoning, at a mid-tier price.
  • Structured outputs. response_format: { type: "json_schema" } returns schema-validated JSON; ByteSpike does not modify the schema.
  • Codex-style clients. Use the Responses API (POST /v1/responses) instead of Chat Completions; Codex requires it.
  • Fresh-fact queries. Add tools: [{ "type": "web_search" }].
When not to use:
  • Flagship reasoning — use gpt-5-5, the current GPT-5 family flagship.
  • High-volume classification — gpt-5-4-mini is cheaper.
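
The structured-outputs bullet under "When to use" can be illustrated with a short sketch of a Chat Completions request body. The ticket schema and helper names are hypothetical, invented for this example. The `name` / `schema` / `strict` wrapper inside `json_schema` follows the upstream OpenAI format and is an assumption here, since this page only shows `type: "json_schema"`.

```python
import json

# Hypothetical schema, used only for illustration.
TICKET_SCHEMA = {
    "type": "object",
    "properties": {
        "category": {"type": "string", "enum": ["billing", "bug", "other"]},
        "urgent": {"type": "boolean"},
    },
    "required": ["category", "urgent"],
    "additionalProperties": False,
}


def structured_body(prompt: str) -> dict:
    """Build a Chat Completions body that requests schema-validated JSON."""
    return {
        "model": "gpt-5-4",
        "messages": [{"role": "user", "content": prompt}],
        "response_format": {
            "type": "json_schema",
            # Assumption: wrapper fields follow the upstream OpenAI format.
            "json_schema": {"name": "ticket", "schema": TICKET_SCHEMA, "strict": True},
        },
    }


def parse_ticket(response: dict) -> dict:
    """Decode the JSON string in the first choice (OpenAI-style reply assumed)."""
    return json.loads(response["choices"][0]["message"]["content"])
```

Because the reply content arrives as a JSON string, `parse_ticket` decodes it before use; validating the decoded object against `TICKET_SCHEMA` on the client side remains a reasonable extra safeguard.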

Next