gemini-3-1-pro is Google’s current flagship Gemini model. It offers native vision input on both images and video, a 1M-token context window, parallel tool use, and JSON mode. On the gateway, it is the default Google-stack landing spot for DOSIA’s vision tools: when the main brain calls analyze_image or (once the endpoint ships) analyze_video with a Google preference, this is where the request lands. Pricing: $2.00 / 1M input tokens, $12.00 / 1M output tokens, $0.20 / 1M cache-read tokens (see the rate card). Failed requests are not billed.
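To make the pricing concrete, here is a minimal sketch of a per-request cost estimate. It reads the rates as $2.00, $12.00, and $0.20 per 1M input, output, and cache-read tokens respectively. The function name `estimate_cost_usd` and its parameters are illustrative, and it assumes cache-read tokens are metered separately from fresh input tokens; check the rate card for how the gateway actually counts them.

```python
from decimal import Decimal

# USD per 1M tokens, as read from the pricing line above.
RATES_PER_MILLION = {
    "input": Decimal("2.00"),
    "output": Decimal("12.00"),
    "cache_read": Decimal("0.20"),
}


def estimate_cost_usd(input_tokens, output_tokens, cache_read_tokens=0, failed=False):
    """Estimate the charge for one request in USD (hypothetical helper).

    Assumption: cache-read tokens are billed separately from fresh input tokens.
    """
    if failed:
        # Failed requests are not billed.
        return Decimal("0")
    usage = {
        "input": input_tokens,
        "output": output_tokens,
        "cache_read": cache_read_tokens,
    }
    return sum(
        (RATES_PER_MILLION[kind] * count / Decimal(1_000_000) for kind, count in usage.items()),
        Decimal("0"),
    )


if __name__ == "__main__":
    # 200k fresh input tokens, 1.5k output tokens, 50k tokens served from cache.
    print(estimate_cost_usd(200_000, 1_500, cache_read_tokens=50_000))
```

Under these assumptions, output tokens cost six times as much as input tokens, so long-context vision requests with short answers are dominated by the input side of the bill.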

Protocols

| Protocol | Path |
| --- | --- |
| OpenAI Chat Completions (shim) | POST https://llm.bytespike.ai/v1/chat/completions |

Quickstart

curl https://llm.bytespike.ai/v1/chat/completions \
  -H "Authorization: Bearer $BYTESPIKE_API_KEY" \
  -H "content-type: application/json" \
  -d '{
    "model": "gemini-3-1-pro",
    "messages": [
      {
        "role": "user",
        "content": [
          { "type": "text", "text": "How many people are in this image?" },
          { "type": "image_url", "image_url": { "url": "https://example.com/scene.jpg" } }
        ]
      }
    ]
  }'
The Python and TypeScript invocations match those for gpt-5-5; only the model field changes.
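For readers without the gpt-5-5 page at hand, the curl request above might translate to Python roughly as follows. This is a sketch using only the standard library, not the gateway's official client. The helper names (`build_vision_payload`, `build_request`, `send_chat_completion`) are illustrative, and the response parsing assumes the shim returns the usual OpenAI Chat Completions shape.

```python
import json
import urllib.request

GATEWAY_URL = "https://llm.bytespike.ai/v1/chat/completions"


def build_vision_payload(prompt, image_url, model="gemini-3-1-pro"):
    """Build the same JSON body as the curl quickstart."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def build_request(payload, api_key):
    """Wrap the payload in a POST request with the quickstart's headers."""
    return urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat_completion(payload, api_key, timeout=60):
    """Send the request and return the first choice's text.

    Assumption: the response follows the OpenAI Chat Completions shape.
    """
    with urllib.request.urlopen(build_request(payload, api_key), timeout=timeout) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

A call would then look like `send_chat_completion(build_vision_payload("How many people are in this image?", "https://example.com/scene.jpg"), os.environ["BYTESPIKE_API_KEY"])`, with `os` imported by the caller.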

Capabilities

| Capability | Supported |
| --- | --- |
| Chat Completions | |
| Streaming (SSE) | |
| Vision (image input) | |
| Vision on video input | ✅ (native, no separate analyze_video endpoint needed) |
| Tools / function calling | ✅ parallel |
| JSON mode | |
| Context window | 1M tokens |
| Modality | chat + vision + video-vision |
| Capability bucket | vision + external_chat |
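As a rough illustration of how the tool-calling, JSON mode, and streaming capabilities might be switched on, the sketch below adds the standard OpenAI Chat Completions fields (`tools`, `parallel_tool_calls`, `response_format`, `stream`) to a request body. Whether the shim forwards each field unchanged is an assumption here. The helper `with_capabilities` and the `lookup_landmark` tool are hypothetical names used only for this example.

```python
import copy


def with_capabilities(payload, tools=None, json_mode=False, stream=False):
    """Return a copy of the payload with optional OpenAI-style capability fields.

    Assumption: the gateway shim accepts these standard Chat Completions fields.
    """
    request = copy.deepcopy(payload)
    if tools:
        request["tools"] = tools
        # Allow the model to request several tool calls in one turn.
        request["parallel_tool_calls"] = True
    if json_mode:
        request["response_format"] = {"type": "json_object"}
    if stream:
        # Ask for a server-sent events stream instead of a single response.
        request["stream"] = True
    return request


# Hypothetical tool definition, for illustration only.
LOOKUP_LANDMARK_TOOL = {
    "type": "function",
    "function": {
        "name": "lookup_landmark",
        "description": "Look up facts about a named landmark.",
        "parameters": {
            "type": "object",
            "properties": {"name": {"type": "string"}},
            "required": ["name"],
        },
    },
}

base = {
    "model": "gemini-3-1-pro",
    "messages": [{"role": "user", "content": "Which landmark is this? Reply in JSON."}],
}
request = with_capabilities(base, tools=[LOOKUP_LANDMARK_TOOL], json_mode=True)
```

The helper copies the payload rather than mutating it, so one base request can be reused with different capability combinations.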

When to use

  • Long-context vision — analyzing a slide deck, a multi-page screenshot, a video that won’t fit in shorter context windows.
  • Google-stack default — when the main brain prefers Google for cost or compliance, this is the vision endpoint.
  • DOSIA analyze_image / analyze_video — main brain resolves to this when permissioned and the prompt prefers Gemini.
When not to use:
