gemini-3-1-pro is Google’s current flagship Gemini model. Native vision input on both images and video, 1M-token context, parallel tool use, JSON mode. On the gateway, it’s the default Google-stack landing spot for DOSIA’s vision tools — when the main brain calls analyze_image or (once the endpoint ships) analyze_video with a Google preference, this is where the request lands.
Pricing: 12.00 / 1M output, $0.20 / 1M cache read — see the rate card. Failures don’t bill.
Protocols
| Protocol | Path |
|---|---|
| OpenAI Chat Completions (shim) | POST https://llm.bytespike.ai/v1/chat/completions |
Quickstart
gpt-5-5 — swap the model field.
Capabilities
| Capability | Supported |
|---|---|
| Chat Completions | ✅ |
| Streaming (SSE) | ✅ |
| Vision (image input) | ✅ |
| Vision on video input | ✅ (native, no separate analyze_video endpoint needed) |
| Tools / function calling | ✅ parallel |
| JSON mode | ✅ |
| Context window | 1M tokens |
| Modality | chat + vision + video-vision |
| Capability bucket | vision + external_chat |
When to use
- Long-context vision — analyzing a slide deck, a multi-page screenshot, a video that won’t fit in shorter context windows.
- Google-stack default — when the main brain prefers Google for cost or compliance, this is the vision endpoint.
- DOSIA
analyze_image/analyze_video— main brain resolves to this when permissioned and the prompt prefers Gemini.
- Pure text reasoning (no vision needed) —
gpt-5-5orclaude-sonnet-4-6per stack preference. - Cost-sensitive vision at volume —
gemini-2-5-flashgives basic vision at a fraction of the cost.
Next
gemini-3-5-flash— fast Gemini 3 tiergemini-2-5-flash— cost-sensitive Gemini tiergpt-5-5— OpenAI alternative- DOSIA MCP integration — vision-tool surface