gpt-image-2
Capability: 1024² – 4096² · photorealism · prompt-following · text-in-image
Pricing: per image, photoreal tier (live rate)
GPT-Image 2 is OpenAI’s photoreal generator. What distinguishes it
from Nano Banana is prompt following: when your brief reads “the
kettle on the LEFT, mug on the RIGHT, steam rising from both”,
GPT-Image 2 respects spatial relationships better than most
photoreal alternatives. For pure material fidelity, Nano Banana Pro
and Nano Banana 2 still have an edge; for instruction adherence,
GPT-Image 2 is usually the right call.
Request
Body parameters
| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| model | string | yes | — | gpt-image-2 |
| prompt | string | yes | — | English-tuned. |
| size | string | no | 1024x1024 | Supported: 1024x1024, 1024x1536, 1536x1024, 2048x2048, 4096x4096. |
| n | integer | no | 1 | 1–4 images. |
| quality | string | no | "medium" | "low" / "medium" / "high" / "auto". |
| style | string | no | "natural" | "natural" / "vivid". |
| output_format | string | no | "png" | "png" / "webp" / "jpeg". |
| output_compression | integer | no | — | 0–100, only for webp / jpeg. |
Response
Code examples
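A minimal sketch of assembling a request body from the fields in the body parameters table. The helper name `build_request_body` is hypothetical, and the endpoint URL and authentication scheme are not given on this page, so the sketch stops at producing a validated, JSON-ready dict rather than sending it.

```python
import json

# Allowed values copied from the body parameters table.
SIZES = {"1024x1024", "1024x1536", "1536x1024", "2048x2048", "4096x4096"}
QUALITIES = {"low", "medium", "high", "auto"}
STYLES = {"natural", "vivid"}
FORMATS = {"png", "webp", "jpeg"}


def build_request_body(prompt, size="1024x1024", n=1, quality="medium",
                       style="natural", output_format="png",
                       output_compression=None):
    """Hypothetical helper: validate the fields and return a JSON-ready dict."""
    if not prompt:
        raise ValueError("prompt is required")
    if size not in SIZES:
        raise ValueError(f"unsupported size: {size}")
    if not 1 <= n <= 4:
        raise ValueError("n must be between 1 and 4")
    if quality not in QUALITIES:
        raise ValueError(f"unsupported quality: {quality}")
    if style not in STYLES:
        raise ValueError(f"unsupported style: {style}")
    if output_format not in FORMATS:
        raise ValueError(f"unsupported output_format: {output_format}")

    body = {
        "model": "gpt-image-2",
        "prompt": prompt,
        "size": size,
        "n": n,
        "quality": quality,
        "style": style,
        "output_format": output_format,
    }

    # output_compression has no default and applies only to webp / jpeg.
    if output_compression is not None:
        if output_format not in ("webp", "jpeg"):
            raise ValueError("output_compression applies only to webp / jpeg")
        if not 0 <= output_compression <= 100:
            raise ValueError("output_compression must be between 0 and 100")
        body["output_compression"] = output_compression

    return body


if __name__ == "__main__":
    body = build_request_body(
        "the kettle on the LEFT, mug on the RIGHT, steam rising from both",
        size="1536x1024",
        quality="high",
    )
    print(json.dumps(body, indent=2))
```

Validating on the client catches an out-of-range `n` or an unsupported `size` before the call is made; whether the service rejects such values in the same way is not stated here.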
Errors
| Code | Trigger | Billed? |
|---|---|---|
| 400 / 401 / 402 / 403 | Standard | No |
| 451 | Prompt blocked by upstream safety | No |
| 5xx | Upstream issue | No (auto-retry) |
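The table can be mirrored in client code as a small lookup, sketched below. The names `ErrorInfo` and `lookup_error` are hypothetical, and the sketch covers only the codes listed above; any other status returns `None` rather than a guess.

```python
from typing import NamedTuple, Optional


class ErrorInfo(NamedTuple):
    trigger: str
    billed: bool


def lookup_error(status: int) -> Optional[ErrorInfo]:
    """Map an HTTP status to its row in the errors table, or None if unlisted."""
    if status in (400, 401, 402, 403):
        return ErrorInfo("standard", False)
    if status == 451:
        return ErrorInfo("prompt blocked by upstream safety", False)
    if 500 <= status <= 599:
        return ErrorInfo("upstream issue", False)
    return None
```

For example, `lookup_error(451)` reports a safety block that was not billed, which a client might surface to the user as a request to rephrase the prompt.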
When to use
- Prompts that specify spatial relationships, counts, or composition.
- Briefs that mix text descriptions with explicit constraints.
- For more material / lighting fidelity, see Nano Banana 2 or Nano Banana Pro.
- For the compliance-tier variant, see GPT-Image 2 Official.
- For aesthetic / illustration, see Seedream V4.5.
Limits
| Limit | Value |
|---|---|
| Max output resolution | 4096×4096 |
| Max images per call (n) | 4 |
| Supports init image (img2img) | No |
| Supports quality modifier | Yes |
| Supports style modifier | Yes |
| Sync? | Yes (≤30s typical, longer at high quality) |
| Avg latency for 1024² high | 12–18s |