Vendor: OpenAI
Model ID: gpt-4o-image
Capability: 1024² – 2048² · multi-turn image generation · in-conversation editing
Pricing: per image, conversational tier (live rate)

GPT-4o Image is the conversational image generator: instead of a one-shot /images/generations call, you send a chat completions request and the model returns image content inside the response. This matters when the workflow is multi-turn: “generate this”, “now make the background blue”, “now add a dog”. The conversation memory preserves the underlying image, so subsequent turns are edits, not fresh generations.

Request

curl https://llm.bytespike.ai/v1/chat/completions \
  -H "Authorization: Bearer $BYTESPIKE_API_KEY" \
  -H "content-type: application/json" \
  -d '{
    "model": "gpt-4o-image",
    "messages": [
      {"role": "user", "content": "Generate an image of a koi pond at dusk."}
    ],
    "image_output": {"size": "1024x1024", "quality": "high"}
  }'

Body parameters

Field | Type | Required | Default | Notes
model | string | yes | (none) | gpt-4o-image
messages | array | yes | (none) | Standard chat shape. The model returns images as image_url content blocks.
image_output.size | string | no | 1024x1024 | Supported: 1024x1024, 1024x1536, 1536x1024, 2048x2048.
image_output.quality | string | no | "medium" | "low" / "medium" / "high".
image_output.n | integer | no | 1 | 1–2 images per turn.
tools | array | no | (none) | Function calling supported alongside image output.
stream | boolean | no | false | Streaming partial-image deltas supported.
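
The constraints in the table can be checked client-side before a request is sent. The following is a minimal sketch, not part of any SDK: build_image_output is a hypothetical helper name, and the allowed values are copied from the table above.

```python
# Allowed values copied from the Body parameters table.
SUPPORTED_SIZES = {"1024x1024", "1024x1536", "1536x1024", "2048x2048"}
SUPPORTED_QUALITIES = {"low", "medium", "high"}


def build_image_output(size="1024x1024", quality="medium", n=1):
    """Return an image_output dict, raising ValueError on unsupported values.

    Hypothetical helper for illustration; defaults mirror the table.
    """
    if size not in SUPPORTED_SIZES:
        raise ValueError(f"unsupported size: {size!r}")
    if quality not in SUPPORTED_QUALITIES:
        raise ValueError(f"unsupported quality: {quality!r}")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(n, bool) or n not in (1, 2):
        raise ValueError(f"n must be 1 or 2, got {n!r}")
    return {"size": size, "quality": quality, "n": n}
```

The returned dict can be placed under the image_output key of the request body. Failing locally avoids a round trip that would presumably end in a 400.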

Response

{
  "id": "chatcmpl-…",
  "object": "chat.completion",
  "model": "gpt-4o-image",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": [
        {"type": "text", "text": "Here's the koi pond at dusk:"},
        {"type": "image_url", "image_url": {"url": "https://cdn.bytespike.ai/img/..."}}
      ]
    },
    "finish_reason": "stop"
  }],
  "usage": {"prompt_tokens": 18, "completion_tokens": 14, "image_tokens": 1024, "total_tokens": 1056}
}
Image URLs are pre-signed and expire after 24 hours. Pass the conversation back in messages to edit the same image on subsequent turns.
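
Because images arrive as image_url content blocks mixed with text blocks, a client usually needs to pick them out of the response. A minimal sketch, assuming the response has already been parsed into a dict with the shape shown above (extract_image_urls is a hypothetical helper name):

```python
def extract_image_urls(response):
    """Collect image URLs from a chat completion response given as a dict."""
    urls = []
    for choice in response.get("choices", []):
        content = choice.get("message", {}).get("content")
        if not isinstance(content, list):
            continue  # plain-string content carries no image blocks
        for block in content:
            if block.get("type") == "image_url":
                urls.append(block["image_url"]["url"])
    return urls
```

Since the URLs expire, it is probably safest to download or copy the images soon after the response arrives rather than storing the URL itself.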

Code examples

curl https://llm.bytespike.ai/v1/chat/completions \
  -H "Authorization: Bearer $BYTESPIKE_API_KEY" \
  -H "content-type: application/json" \
  -d '{
    "model": "gpt-4o-image",
    "messages": [{"role": "user", "content": "Generate an image of a koi pond at dusk."}],
    "image_output": {"size": "1024x1024", "quality": "high"}
  }'
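
The same request can be sketched in Python using only the standard library. The endpoint, headers, and body fields mirror the curl call above; build_request and generate are hypothetical helper names, and the 60-second timeout is an assumption chosen to sit above the "≤30s typical" figure in the Limits table.

```python
import json
import os
import urllib.request

API_URL = "https://llm.bytespike.ai/v1/chat/completions"


def build_request(prompt, api_key, size="1024x1024", quality="high"):
    """Build (but do not send) the POST request for a single image turn."""
    body = {
        "model": "gpt-4o-image",
        "messages": [{"role": "user", "content": prompt}],
        "image_output": {"size": size, "quality": quality},
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def generate(prompt, timeout=60):
    """Send the request and return the parsed JSON response.

    Reads the key from BYTESPIKE_API_KEY, as in the curl example.
    The timeout value is an assumption, not a documented setting.
    """
    req = build_request(prompt, os.environ["BYTESPIKE_API_KEY"])
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

Calling generate("Generate an image of a koi pond at dusk.") should return a dict shaped like the Response section. Separating build_request from generate keeps the payload inspectable without touching the network.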

Multi-turn edit workflow

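Pass the assistant’s response (image URL and all) back in the next messages array. The model treats the image in conversation context as the canvas to edit:

import os
from openai import OpenAI

# Assumes the gateway is OpenAI-compatible, as the /v1/chat/completions path suggests.
client = OpenAI(
    base_url="https://llm.bytespike.ai/v1",
    api_key=os.environ["BYTESPIKE_API_KEY"],
)

# Turn 1: generate
turn1 = client.chat.completions.create(
    model="gpt-4o-image",
    messages=[{"role": "user", "content": "Generate a koi pond at dusk."}],
    extra_body={"image_output": {"size": "1024x1024"}},  # non-standard field, so it goes in extra_body
)

# Turn 2: edit (replay turn 1, then add the new instruction)
turn2 = client.chat.completions.create(
    model="gpt-4o-image",
    messages=[
        {"role": "user", "content": "Generate a koi pond at dusk."},
        turn1.choices[0].message,  # assistant reply, including the image_url block
        {"role": "user", "content": "Make it dawn instead, with mist on the water."},
    ],
    extra_body={"image_output": {"size": "1024x1024"}},
)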
Each turn bills as a separate image generation.
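
For longer sessions, the growing messages list can be kept in a small wrapper so each edit replays the full history. This is a sketch under assumptions: EditSession is a hypothetical class, and send stands for whatever function performs the chat completions call and returns the assistant message as a dict.

```python
class EditSession:
    """Keeps the running messages list for a multi-turn image edit."""

    def __init__(self, send):
        # send: callable that takes the messages list and returns the
        # assistant message as a dict, e.g. {"role": "assistant", "content": [...]}.
        self._send = send
        self.messages = []
        self.turns = 0

    def ask(self, prompt):
        """Append a user prompt, call send, and record the assistant reply."""
        self.messages.append({"role": "user", "content": prompt})
        reply = self._send(list(self.messages))  # pass a copy of the history
        self.messages.append(reply)
        self.turns += 1  # each turn bills as a separate image generation
        return reply
```

The turns counter gives a rough view of how many billed generations a session has used. Injecting send also makes it easy to exercise the history logic with a stub before making real calls.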

Errors

Code | Trigger | Billed?
400 / 401 / 402 / 403 | Standard | No
451 | Prompt blocked by upstream safety | No
5xx | Upstream issue | No (auto-retry)
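
One way a client might act on these codes is sketched below. The status codes come from the table; the action labels and the classify_status name are assumptions for illustration. The table says 5xx responses are auto-retried but does not say whether the client should retry as well, so "retry_later" is only a suggestion.

```python
def classify_status(status):
    """Map an HTTP status code to a suggested client action (illustrative only)."""
    if 200 <= status <= 299:
        return "ok"
    if status == 451:
        return "rephrase"     # prompt blocked by upstream safety
    if status in (400, 401, 402, 403):
        return "fix_request"  # standard client errors; resending unchanged will not help
    if 500 <= status <= 599:
        return "retry_later"  # upstream issue
    return "unknown"
```

None of the listed error codes are billed, so a failed turn can be corrected and resent without extra image charges.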

When to use

  • Multi-turn image editing where conversation context matters.
  • Workflows that mix text reasoning with image output (the model can describe what it generated or ask clarifying questions).
  • For one-shot / batch image generation, see GPT-Image 2.
  • For pure photorealism, see Nano Banana Pro or Nano Banana 2.

Limits

Limit | Value
Max output resolution | 2048×2048
Max images per turn (n) | 2
Multi-turn editing | Yes
Supports quality modifier | Yes
Sync? | Yes (≤30s typical)
Avg latency for 1024² | 10–16 s