nano-banana-v2 is Google’s lightweight image model. It layers image-to-image edits onto the same protocol surface as nano-banana, at roughly the same cost profile. The differentiator is the optional source_image field: when present, the model treats it as a reference (style transfer, mask edit, background swap) rather than generating from text alone.
Pricing: $0.022 / image — see the rate card. Failures don’t bill; image-to-image runs at the same per-image rate as text-to-image.
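As a rough illustration of the billing rule above, the following sketch estimates spend from the quoted per-image rate. The helper name `estimate_cost` is hypothetical, and the rate is copied from this page, so check the rate card before relying on it.

```python
from decimal import Decimal

# Per-image rate quoted on this page (assumption: still current on the rate card).
RATE_PER_IMAGE = Decimal("0.022")


def estimate_cost(successful_images: int) -> Decimal:
    """Estimate the USD charge for a number of successfully generated images.

    Failed generations are not billed, so the caller passes only successes.
    Text-to-image and image-to-image share the same per-image rate.
    """
    if successful_images < 0:
        raise ValueError("successful_images must be non-negative")
    return RATE_PER_IMAGE * successful_images


# 1,000 requested edits, 40 of which failed: only 960 are billed.
print(estimate_cost(1000 - 40))  # 21.120
```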
Protocols
| Protocol | Path |
|---|---|
| OpenAI Images | POST https://llm.bytespike.ai/v1/images/generations |
Quickstart
The generate_image tool passes source_image through automatically when you’ve attached an image in the chat.
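For direct API calls, a minimal sketch of a request might look like the following. Several details here are assumptions rather than documented behavior: the Bearer-token `Authorization` header, the model id string `"nano-banana-v2"`, the ASCII `1024x1024` size spelling, and the encoding of the `source_image` value (a URL is used for illustration). The helper names `build_payload` and `generate` are hypothetical.

```python
import json
import urllib.request

ENDPOINT = "https://llm.bytespike.ai/v1/images/generations"


def build_payload(prompt, source_image=None, size="1024x1024", n=1,
                  response_format="url"):
    """Build an OpenAI Images style request body.

    source_image is only included when provided, so the same helper covers
    text-to-image and image-to-image requests.
    """
    payload = {
        "model": "nano-banana-v2",  # assumed model id string
        "prompt": prompt,
        "size": size,
        "n": n,
        "response_format": response_format,
    }
    if source_image is not None:
        # Assumption: the field accepts a reference to the image (URL here).
        payload["source_image"] = source_image
    return payload


def generate(api_key, payload):
    """POST the payload and return the parsed JSON response."""
    request = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


# Image-to-image: the same call shape as text-to-image, plus source_image.
edit_payload = build_payload(
    "Replace the background with a beach at sunset",
    source_image="https://example.com/portrait.png",
)
```

Calling `generate(api_key, edit_payload)` would send the request; it is left out of the sketch so the block runs without network access or credentials.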
Capabilities
| Capability | Supported |
|---|---|
| Text-to-image | ✅ |
| Image-to-image (with source_image) | ✅ |
| n ≥ 2 batch generation | ✅ |
| size 1024×1024 / 1024×1792 / 1792×1024 | ✅ |
| response_format url / b64_json | ✅ |
| Modality | image |
| Capability bucket | image_generate |
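The options in the table above can be checked client-side before sending a request. This is a sketch under assumptions: sizes are spelled with an ASCII `x`, no upper bound on `n` is enforced because none is stated here, and the response is assumed to follow the OpenAI Images shape (a `data` list whose items carry `url` or `b64_json`). The helper names are hypothetical.

```python
import base64

# Values taken from the capability table (ASCII "x" spelling is assumed).
ALLOWED_SIZES = {"1024x1024", "1024x1792", "1792x1024"}
ALLOWED_FORMATS = {"url", "b64_json"}


def validate_options(size, response_format, n):
    """Return a list of human-readable problems; empty means the options look valid."""
    errors = []
    if size not in ALLOWED_SIZES:
        errors.append(f"unsupported size: {size!r}")
    if response_format not in ALLOWED_FORMATS:
        errors.append(f"unsupported response_format: {response_format!r}")
    if not isinstance(n, int) or n < 1:
        errors.append(f"n must be a positive integer, got {n!r}")
    return errors


def extract_images(response, response_format):
    """Return URLs (str) or decoded image bytes from an OpenAI Images style response."""
    items = response.get("data", [])
    if response_format == "b64_json":
        return [base64.b64decode(item["b64_json"]) for item in items]
    return [item["url"] for item in items]
```

With `n` set to 2 or more, `extract_images` returns one entry per generated image.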
When to use
- DOSIA “edit this image” requests — main brain attaches the chat image and routes here when permission is granted.
- Background swaps, style transfers, and mask edits without the cost of gpt-image-2.
- Cost-sensitive image-to-image pipelines: high-volume programmatic edits.

Use a different model for:
- Pure text-to-image where image-to-image isn’t needed: nano-banana gets you the same baseline at the same tier.
- Top-fidelity typography or marketing hero images: gpt-image-2.
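The guidance above can be sketched as a small routing helper. The function name `choose_model` and its two flags are hypothetical, and the returned strings are assumed to match the model ids used in requests.

```python
def choose_model(has_source_image: bool, needs_top_fidelity: bool = False) -> str:
    """Pick a model id following the guidance in this section."""
    if needs_top_fidelity:
        # Top-fidelity typography or marketing hero images.
        return "gpt-image-2"
    if has_source_image:
        # Edits, background swaps, style transfers at the lightweight tier.
        return "nano-banana-v2"
    # Pure text-to-image: same baseline at the same tier.
    return "nano-banana"
```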
Next
- nano-banana: text-only sibling
- gpt-image-2: OpenAI flagship with image-to-image
- Multimodal endpoints: overview