gpt-5-2 sits between gpt-5-4-mini and gpt-5-4. It’s the cheapest GPT-5 variant with built-in web search and is OpenAI’s preferred model for short-to-medium agent loops that need to look something up before answering.
If you don’t need browsing, gpt-5-4-mini is cheaper for the same quality on closed-book tasks. If you need flagship reasoning headroom, jump to gpt-5-4.
Pricing: 14.00 / 1M output, $0.175 / 1M cache read — see the rate card.
Protocols
| Protocol | Path |
|---|---|
| OpenAI Chat Completions | POST https://llm.bytespike.ai/v1/chat/completions |
| OpenAI Responses | POST https://llm.bytespike.ai/v1/responses |
Quickstart
Capabilities
| Capability | Supported |
|---|---|
| Chat Completions | ✅ |
| Responses API | ✅ |
| Streaming (SSE) | ✅ |
| Vision | ✅ |
| Tool use (function calling) | ✅ parallel |
| JSON mode | ✅ |
| Structured outputs (json_schema) | ✅ |
| Reasoning effort | ✅ (low / medium / high) |
| Web search (built-in tool) | ✅ |
| Context window | 128K tokens |
When to use
- Short agent loops with one or two browsing hops — pricing the web, looking up live data, fact-checking a draft. Cheaper than
gpt-5-4for the same kind of work. - Vision + lightweight reasoning — captioning, OCR, “what’s on this screenshot” questions where you don’t need flagship-level analysis.
- Mid-volume RAG — when retrieval is doing the heavy lifting and the LLM is summarising / synthesising.
- Hard reasoning that wants
reasoning_effort: "high"continuously — pay the premium forgpt-5-4orgpt-5-5. - High-volume classification / routing —
gpt-5-4-miniis the right cost tier.
Next
- gpt-5-4 — production workhorse, no built-in web search but more reasoning
- gpt-5-4-mini — cheaper, closed-book
- Endpoint types — Chat Completions vs Responses