gpt-5-2
Capability: 128K context · tool use · vision · streaming · structured output
Pricing: per-token, standard tier (live rate)
GPT-5.2 was the first refinement step on the original GPT-5 — same
shape, more reliable structured output. It’s still a competent
production model for any team that’s already benchmark-validated
against this version. For new work, prefer
GPT-5.4 or
GPT-5.5.
Request
Body parameters
| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
model | string | yes | — | gpt-5-2 |
messages | array | yes | — | — |
max_tokens | integer | no | model max | Max: 16384. |
temperature | number | no | 1.0 | — |
tools | array | no | — | Function calling supported (parallel). |
response_format | object | no | — | JSON mode + structured output. |
stream | boolean | no | false | SSE streaming. |
Response
Code examples
Streaming + caching
"stream": true for SSE. Automatic prompt caching on stable prefixes,
discounted rate per pricing table.
Errors
| Code | Trigger | Billed? |
|---|---|---|
| 400 / 401 / 402 / 422 / 429 | Standard | No |
| 5xx | Upstream issue | No (auto-retry) |
When to use
- Existing code validated against this exact version.
- For new work, prefer GPT-5.4 or GPT-5.5.
- For lower cost, see GPT-5-mini.
Limits
| Limit | Value |
|---|---|
| Context window | 128K tokens |
| Max output | 16384 tokens |
| Supports tool use | Yes (parallel) |
| Supports vision | Yes |
| Supports streaming | Yes |
| Supports prompt caching | Automatic |