gpt-image-2 is OpenAI’s flagship image model. Text-to-image and image-to-image (mask-driven edits) share a single endpoint and are billed per generated image. It is strong on layout fidelity, in-image typography, and product/scene composition, so reach for it when “looks like a real OpenAI render” matters more than the lowest per-image cost.
Pricing: 0.08 per generated image; failed generations are not billed.
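For budgeting, the per-image rule can be sketched as simple arithmetic. This is a minimal sketch, assuming the 0.08 price applies to each image in a batch and that a failed image is simply left out of the bill; the function name `estimate_cost` is hypothetical, and the currency unit is whatever the pricing line refers to.

```python
from decimal import Decimal

# Price taken from the pricing line above; the currency unit is not stated there.
PRICE_PER_IMAGE = Decimal("0.08")


def estimate_cost(requested: int, failed: int = 0) -> Decimal:
    """Estimate the bill for `requested` images, `failed` of which did not generate.

    Assumption: failures are excluded from billing one image at a time.
    """
    if requested < 0 or failed < 0 or failed > requested:
        raise ValueError("need 0 <= failed <= requested")
    return PRICE_PER_IMAGE * (requested - failed)
```

Under these assumptions, a batch of 10 images with 2 failures would be billed for 8 images.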
Protocols
| Protocol | Path |
|---|---|
| OpenAI Images | POST https://llm.bytespike.ai/v1/images/generations |
Quickstart
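A minimal text-to-image call might look like the sketch below. It uses only the Python standard library and the endpoint from the Protocols table. Several details are assumptions rather than documented facts: Bearer-token authentication, the environment variable name `BYTESPIKE_API_KEY`, request fields that mirror the OpenAI Images API (with sizes written using an ASCII `x`), and a response shaped like `{"data": [{"url": ...}]}`.

```python
import json
import os
import urllib.request

API_URL = "https://llm.bytespike.ai/v1/images/generations"


def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a text-to-image request for gpt-image-2."""
    body = {
        "model": "gpt-image-2",
        "prompt": prompt,
        "n": 1,
        "size": "1024x1024",       # assumed spelling of the 1024×1024 size
        "quality": "standard",
        "response_format": "url",
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


def generate(prompt: str) -> dict:
    """Send the request and return the parsed JSON response."""
    # BYTESPIKE_API_KEY is a hypothetical variable name for this sketch.
    request = build_request(prompt, os.environ["BYTESPIKE_API_KEY"])
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.load(response)
```

Calling `generate("A ceramic mug on a walnut desk, product shot")` would then return the parsed response; if the response follows the OpenAI shape, the image link sits at `result["data"][0]["url"]`.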
Capabilities
| Capability | Supported |
|---|---|
| Text-to-image | ✅ |
| Image-to-image (with `source_image`) | ✅ |
| Mask-driven edits | ✅ |
| `n` ≥ 2 batch generation | ✅ |
| `size` 1024×1024 / 1024×1792 / 1792×1024 | ✅ |
| `quality` standard / hd | ✅ |
| `response_format` url / b64_json | ✅ |
| Modality | image |
| Capability bucket | image_generate |
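The options in the table can be checked client-side before a request is sent. The sketch below is one way to do that, not the service's own validation. The helper names `build_payload` and `decode_images` are hypothetical. It assumes sizes are sent with an ASCII `x`, that `source_image` accepts a string (URL or encoded image, which the table does not specify), and that `b64_json` responses follow the OpenAI shape `{"data": [{"b64_json": ...}]}`. The mask parameter is left out because its name is not given here.

```python
import base64
from typing import List, Optional

# Values taken from the capability table above.
ALLOWED_SIZES = {"1024x1024", "1024x1792", "1792x1024"}
ALLOWED_QUALITY = {"standard", "hd"}
ALLOWED_FORMATS = {"url", "b64_json"}


def build_payload(
    prompt: str,
    *,
    n: int = 1,
    size: str = "1024x1024",
    quality: str = "standard",
    response_format: str = "url",
    source_image: Optional[str] = None,
) -> dict:
    """Return a request body, rejecting options the table does not list."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if size not in ALLOWED_SIZES:
        raise ValueError(f"unsupported size: {size}")
    if quality not in ALLOWED_QUALITY:
        raise ValueError(f"unsupported quality: {quality}")
    if response_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format}")
    payload = {
        "model": "gpt-image-2",
        "prompt": prompt,
        "n": n,
        "size": size,
        "quality": quality,
        "response_format": response_format,
    }
    if source_image is not None:
        # Image-to-image: the value format is an assumption of this sketch.
        payload["source_image"] = source_image
    return payload


def decode_images(response: dict) -> List[bytes]:
    """Decode every b64_json entry of a response into raw image bytes."""
    return [base64.b64decode(item["b64_json"]) for item in response["data"]]
```

Validating locally keeps obviously malformed requests from reaching the endpoint, and `decode_images` gives bytes that can be written straight to files when `response_format` is `b64_json`.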
When to use
- Marketing creative: hero images, social cards, anywhere typography-in-image matters.
- Product mockups: fidelity on materials, lighting, and small print holds up better than most domestic alternatives.
- DOSIA `generate_image` tool: the main brain resolves “draw me an X” to this model by default when permission is granted.
- High-volume or budget-sensitive work: use `nano-banana` instead, which is materially cheaper for the same shape.
- Photorealistic Chinese aesthetic / poster work: try `seedream-4` or `seedream-4.5` instead.
Next
- `nano-banana`: cheaper Google text-to-image
- `seedream-4`: ByteDance flagship
- Multimodal endpoints: overview of image / video / audio / embedding surfaces