gpt-5-4-nano is the smallest tier of the GPT-5.4 family — the cheapest way to get GPT-flavored reasoning on the gateway. The right pick for very-high-volume traffic (classification, routing, short structured extraction) where unit cost dominates and the task fits a small prompt.
Pricing: 1.25 / 1M output, $0.02 / 1M cache read — see the rate card.
Protocols
| Protocol | Path |
|---|---|
| OpenAI Chat Completions | POST https://llm.bytespike.ai/v1/chat/completions |
| OpenAI Responses | POST https://llm.bytespike.ai/v1/responses |
Quickstart
Capabilities
| Capability | Supported |
|---|---|
| Chat Completions | ✅ |
| Responses API | ✅ |
| Streaming (SSE) | ✅ |
| Vision | ✅ |
| Tool use | ✅ parallel |
| JSON mode | ✅ |
| Structured outputs | ✅ |
| Reasoning effort | ✅ (low / medium) |
| Web search | — |
| Context window | 128K tokens |
When to use
- Very-high-volume classification, routing, structured extraction — the cheapest GPT tier when unit cost dominates.
- Short-prompt agent loops — tool-use chains where each step is small and the gate is throughput.
- Cost-floor fallback — when the task fits nano, prefer it over
gpt-5-4-minifor the savings.
- Hard reasoning that benefits from
reasoning_effort: "high"— go togpt-5-4orgpt-5-5. - Web search needed — only
gpt-5-2and up have it.
Next
- gpt-5-4-mini — small-mid of the GPT-5.4 family
- gpt-5-4 — production workhorse
- gpt-5-5 — current GPT-5 flagship