Vendor: OpenAI
Model ID: gpt-5-4-nano
Capability: 128K context · tool use · vision · streaming
Pricing: per-token, nano tier (live rate)

GPT-5.4-nano is the speed floor of the 5.4 wave. It is priced at the same nano tier as GPT-5-nano and produces 5.4’s tighter output. It is the right pick for routing, classification, and head-of-pipeline triage, where every millisecond compounds across many calls.

Request

curl https://llm.bytespike.ai/v1/chat/completions \
  -H "Authorization: Bearer $BYTESPIKE_API_KEY" \
  -H "content-type: application/json" \
  -d '{
    "model": "gpt-5-4-nano",
    "messages": [{"role": "user", "content": "Is this email spam? yes / no. Email: ..."}]
  }'

Body parameters

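| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| model | string | yes | | gpt-5-4-nano |
| messages | array | yes | | |
| max_tokens | integer | no | model max | Max: 8192. |
| tools | array | no | | Function calling supported. |
| response_format | object | no | | JSON mode + structured output. |
| stream | boolean | no | false | SSE streaming. |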
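
As a rough sketch of how the optional fields combine, the hypothetical helper `build_body` below assembles a request body and checks `max_tokens` against the documented 8192 ceiling before anything is sent. The `{"type": "json_object"}` shape for `response_format` is an assumption borrowed from OpenAI-style APIs; this page only states that JSON mode and structured output are supported.

```python
MAX_OUTPUT_TOKENS = 8192  # documented ceiling for max_tokens


def build_body(messages, max_tokens=None, json_mode=False, stream=False):
    """Assemble a chat completions body from the documented fields."""
    if not messages:
        raise ValueError("messages is required")
    body = {"model": "gpt-5-4-nano", "messages": messages}
    if max_tokens is not None:
        if not 1 <= max_tokens <= MAX_OUTPUT_TOKENS:
            raise ValueError(f"max_tokens must be between 1 and {MAX_OUTPUT_TOKENS}")
        body["max_tokens"] = max_tokens
    if json_mode:
        # Assumed OpenAI-style shape; not confirmed by this page.
        body["response_format"] = {"type": "json_object"}
    if stream:
        body["stream"] = True  # the documented default is false
    return body
```

Omitted optional fields are left out of the body entirely, so the service applies its own defaults.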

Response

{
  "id": "chatcmpl-…",
  "object": "chat.completion",
  "model": "gpt-5-4-nano",
  "choices": [{"index": 0, "message": {"role": "assistant", "content": "yes"}, "finish_reason": "stop"}],
  "usage": {"prompt_tokens": 142, "completion_tokens": 1, "total_tokens": 143}
}
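
A minimal sketch of reading the fields shown above, assuming the response has already been decoded from JSON into a Python dict. `extract_answer` is a hypothetical helper name, and `SAMPLE` copies the example response.

```python
SAMPLE = {
    "id": "chatcmpl-…",
    "object": "chat.completion",
    "model": "gpt-5-4-nano",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "yes"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 142, "completion_tokens": 1, "total_tokens": 143},
}


def extract_answer(response: dict) -> tuple:
    """Return (content, finish_reason, total_tokens) for the first choice."""
    choice = response["choices"][0]
    return (
        choice["message"]["content"],
        choice["finish_reason"],
        response["usage"]["total_tokens"],
    )


print(extract_answer(SAMPLE))  # prints ('yes', 'stop', 143)
```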

Code examples

curl https://llm.bytespike.ai/v1/chat/completions \
  -H "Authorization: Bearer $BYTESPIKE_API_KEY" \
  -H "content-type: application/json" \
  -d '{"model": "gpt-5-4-nano", "messages": [{"role": "user", "content": "Is this email spam?"}]}'
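
For callers working in Python, the same request can be sketched with only the standard library. The endpoint, headers, and body mirror the curl example; the helper names `build_request` and `ask` and the 30-second timeout are illustrative assumptions.

```python
import json
import os
import urllib.request

API_URL = "https://llm.bytespike.ai/v1/chat/completions"  # endpoint from the curl example


def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build the same POST request that the curl example sends."""
    body = {
        "model": "gpt-5-4-nano",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "content-type": "application/json",
        },
        method="POST",
    )


def ask(prompt: str) -> dict:
    """Send the request and return the decoded JSON response."""
    req = build_request(prompt, os.environ["BYTESPIKE_API_KEY"])
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)
```

Calling `ask("Is this email spam?")` performs the network call; `build_request` on its own has no side effects, which keeps it easy to inspect or test.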

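Streaming + caching

Set "stream": true in the request body to receive the response as server-sent events (SSE). Prompt caching is applied automatically; no request field is needed to enable it.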

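One way to consume a streamed response is sketched below. This page does not show the chunk format, so the sketch assumes OpenAI-style events: each `data:` line carries a JSON chunk with text under `choices[0].delta.content`, and the stream ends with `data: [DONE]`. The helper name `iter_stream_text` is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text fragments from SSE lines (assumed OpenAI-style chunks)."""
    for raw in lines:
        line = raw.decode("utf-8").strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream sentinel
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text
```

A response object returned by `urllib.request.urlopen` iterates line by line as bytes, so it can be passed to this function directly.
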
Errors

| Code | Trigger | Billed? |
| --- | --- | --- |
| 400 / 401 / 402 / 422 / 429 | Standard | No |
| 5xx | Upstream | No (auto-retry) |
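
A client-side handling sketch for these codes follows. It rests on two assumptions: retrying 429 with exponential backoff is a common convention rather than something this page prescribes, and the "auto-retry" note for 5xx is read as happening on the service side, so the client does not retry those itself. `send` is a hypothetical callable that performs one request and returns `(status, body)`.

```python
import time


def call_with_retry(send, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call send() and retry only on HTTP 429, backing off exponentially."""
    for attempt in range(max_attempts):
        status, body = send()
        if status == 200:
            return body
        if status == 429 and attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
            continue
        # Other 4xx codes, 5xx, or an exhausted 429 budget: surface the failure.
        raise RuntimeError(f"request failed with HTTP {status}")
    raise RuntimeError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter lets a test substitute a recorder instead of actually waiting.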

When to use

  • Routing / classification at the head of an agent pipeline.
  • For older 5-series nano, see GPT-5-nano.
  • For more capability, see GPT-5.4-mini.
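
The routing use case can be sketched as two small helpers: one builds a label-only classification prompt, the other maps the model's reply to a downstream handler. The route names, the prompt wording, and the fallback are all hypothetical and only illustrate the shape of a head-of-pipeline triage step.

```python
# Hypothetical labels and downstream handlers, for illustration only.
ROUTES = {"billing": "billing_agent", "bug": "support_agent"}
DEFAULT_ROUTE = "general_agent"


def routing_messages(text: str) -> list:
    """Build a messages array that asks for one label and nothing else."""
    labels = ", ".join(sorted(ROUTES))
    prompt = (
        f"Classify this message as one of: {labels}. "
        f"Reply with the label only. Message: {text}"
    )
    return [{"role": "user", "content": prompt}]


def pick_route(model_reply: str) -> str:
    """Map the model's reply to a handler, falling back on unknown labels."""
    label = model_reply.strip().lower().rstrip(".")
    return ROUTES.get(label, DEFAULT_ROUTE)
```

Asking for the label only keeps the completion to a few tokens, which suits the short replies shown in the example response, and the fallback route covers replies that match no known label.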

Limits

| Limit | Value |
| --- | --- |
| Context window | 128K tokens |
| Max output | 8192 tokens |
| Supports tool use | Yes |
| Supports vision | Yes |
| Supports streaming | Yes |
| Supports prompt caching | Automatic |