Gemini 3 Flash - ByteSpike

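Vendor: Google
Model ID: gemini-3-flash
Capabilities: 200K context · vision · tool use · streaming · structured output
Pricing: per token, flash tier (live pricing)

Gemini 3 Flash is the small, fast member of the Gemini 3 family. Use it when your input is mostly text with the occasional image, and you would rather make many cheap calls than a few expensive ones. The 200K context window also makes it a good fit for tasks such as long-document classification that do not need flagship-level reasoning.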

Request

curl https://llm.bytespike.ai/v1/chat/completions \
  -H "Authorization: Bearer $BYTESPIKE_API_KEY" \
  -H "content-type: application/json" \
  -d '{
    "model": "gemini-3-flash",
    "messages": [{"role": "user", "content": "Classify the topic of this article."}]
  }'

Body parameters

Field	Type	Required	Default	Description
`model`	string	Yes	—	`gemini-3-flash`
`messages`	array	Yes	—	OpenAI chat shape; use `image_url` blocks for vision.
`max_tokens`	integer	No	model max	Maximum: 8192.
`temperature`	number	No	1.0	Range 0.0–2.0.
`tools`	array	No	—	Function calling is supported.
`response_format`	object	No	—	JSON mode and structured output.
`stream`	boolean	No	false	SSE streaming.
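As a rough illustration of the table above, the following sketch builds a request body and checks the two documented ranges (`max_tokens` up to 8192, `temperature` between 0.0 and 2.0) before sending. The helper name `build_payload` is hypothetical, and the mixed text and image content layout is assumed to follow the OpenAI content-block convention that the `messages` row points to.

```python
MAX_OUTPUT_TOKENS = 8192  # "Maximum: 8192" from the parameter table


def build_payload(text, image_url=None, max_tokens=None, temperature=None):
    """Build a chat-completions body for gemini-3-flash (hypothetical helper)."""
    if max_tokens is not None and not 1 <= max_tokens <= MAX_OUTPUT_TOKENS:
        raise ValueError(f"max_tokens must be in 1..{MAX_OUTPUT_TOKENS}")
    if temperature is not None and not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be in 0.0..2.0")

    if image_url is None:
        content = text  # plain string content for text-only input
    else:
        # Assumption: OpenAI-style content blocks for text plus one image.
        content = [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": image_url}},
        ]

    payload = {
        "model": "gemini-3-flash",
        "messages": [{"role": "user", "content": content}],
    }
    # Optional fields are only sent when set, so server defaults apply otherwise.
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    if temperature is not None:
        payload["temperature"] = temperature
    return payload
```

Calling `build_payload("Describe this chart.", image_url="https://example.com/chart.png", max_tokens=256)` would produce a vision request; the URL here is a placeholder.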

Response

{
  "id": "chatcmpl-…",
  "object": "chat.completion",
  "model": "gemini-3-flash",
  "choices": [{"index": 0, "message": {"role": "assistant", "content": "technology / startups"}, "finish_reason": "stop"}],
  "usage": {"prompt_tokens": 412, "completion_tokens": 4, "total_tokens": 416}
}
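A minimal sketch of reading the fields a caller usually needs from a response of this shape. The function name `extract_reply` is hypothetical; the sample body is the one shown above, and treating any `finish_reason` other than `"stop"` as incomplete is an assumption rather than documented behavior.

```python
import json

# The sample response from this page, unchanged.
SAMPLE = """
{
  "id": "chatcmpl-…",
  "object": "chat.completion",
  "model": "gemini-3-flash",
  "choices": [{"index": 0, "message": {"role": "assistant", "content": "technology / startups"}, "finish_reason": "stop"}],
  "usage": {"prompt_tokens": 412, "completion_tokens": 4, "total_tokens": 416}
}
"""


def extract_reply(response):
    """Pull the assistant text and token usage out of a parsed response."""
    choice = response["choices"][0]
    return {
        "text": choice["message"]["content"],
        # Assumption: anything other than "stop" means the reply may be cut short.
        "complete": choice["finish_reason"] == "stop",
        "total_tokens": response["usage"]["total_tokens"],
    }


reply = extract_reply(json.loads(SAMPLE))
print(reply["text"], reply["total_tokens"])
```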

Code examples

curl https://llm.bytespike.ai/v1/chat/completions \
  -H "Authorization: Bearer $BYTESPIKE_API_KEY" \
  -H "content-type: application/json" \
  -d '{"model": "gemini-3-flash", "messages": [{"role": "user", "content": "Classify the topic."}]}'
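For callers who prefer Python, the same request could be sketched with only the standard library. The URL, headers, and body mirror the curl command above; the function names `build_request` and `classify` are hypothetical, and the 30-second timeout is an arbitrary choice.

```python
import json
import os
import urllib.request

API_URL = "https://llm.bytespike.ai/v1/chat/completions"


def build_request(prompt, api_key):
    """Build the same POST request as the curl example, without sending it."""
    body = json.dumps({
        "model": "gemini-3-flash",
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "content-type": "application/json",
        },
        method="POST",
    )


def classify(prompt):
    """Send the request and return the assistant's reply text."""
    req = build_request(prompt, os.environ["BYTESPIKE_API_KEY"])
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]


# Usage (performs a network call, so it is left commented out):
# print(classify("Classify the topic."))
```

Splitting request construction from sending keeps the payload easy to inspect or log before any tokens are spent.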

Streaming + caching

Setting "stream": true returns the response over SSE. Stable prompt prefixes are cached automatically (prompt caching).
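The page does not show the event format, so the sketch below assumes the stream follows the OpenAI convention: each event is a `data:` line carrying a JSON chunk with `choices[].delta.content`, and the stream ends with `data: [DONE]`. The function name `iter_sse_text` and the sample events are hypothetical.

```python
import json


def iter_sse_text(lines):
    """Yield content deltas from an iterable of decoded SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # assumed end-of-stream sentinel
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {}).get("content")
            if delta:
                yield delta


# Hypothetical events, shaped like OpenAI chat.completion.chunk payloads.
SAMPLE_STREAM = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "technology"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": " / startups"}}]}',
    "",
    "data: [DONE]",
]

print("".join(iter_sse_text(SAMPLE_STREAM)))
```

Since caching applies to stable prefixes, it likely helps to put shared instructions at the start of the prompt and the per-request text at the end, though the exact matching rules are not stated here.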

Errors

Code	Trigger	Billed
400 / 401 / 402 / 422 / 429	Standard	No
5xx	Upstream	No (retried automatically)
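One way a client might act on the table above is to map each status code to a coarse action. This is only a sketch: the function name and the action labels are hypothetical, and it assumes the codes carry their usual HTTP meanings (429 as a rate limit, the other 4xx codes as problems with the request or account).

```python
def classify_status(status):
    """Map an HTTP status code to a coarse client-side action."""
    if 200 <= status < 300:
        return "ok"
    if status == 429:
        return "backoff"      # rate limited: wait before sending again
    if status in (400, 401, 402, 422):
        return "fix_request"  # resending the same request will not help
    if 500 <= status < 600:
        return "upstream"     # the table says these are retried automatically
    return "unknown"


for code in (200, 401, 429, 503):
    print(code, classify_status(code))
```

Because 5xx responses are listed as already retried automatically, a client seeing one may prefer to surface the failure instead of adding its own aggressive retry loop.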

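When to use

High-throughput classification / routing over long-document input (200K context).
Vision-native tasks at flash-tier pricing.
For deep reasoning, see Gemini 3.1 Pro.
For the OpenAI counterpart in the same tier, see GPT-5.4 mini.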
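The classification / routing case can be sketched as a request that constrains the reply to a fixed label set. Everything specific here is an assumption: the label list is made up, the helper names are hypothetical, and the `response_format` layout follows the OpenAI `json_schema` convention, which this page does not spell out.

```python
import json

LABELS = ["technology", "finance", "sports", "other"]  # hypothetical label set


def build_classification_payload(article):
    """Build a request that asks for exactly one label as JSON."""
    schema = {
        "type": "object",
        "properties": {"topic": {"type": "string", "enum": LABELS}},
        "required": ["topic"],
        "additionalProperties": False,
    }
    return {
        "model": "gemini-3-flash",
        "temperature": 0.0,  # lowest value in the documented range
        "max_tokens": 32,    # a label needs only a few tokens
        "messages": [
            # Fixed instructions first, so the prefix stays stable across calls.
            {"role": "system", "content": "Classify the topic of the article."},
            {"role": "user", "content": article},
        ],
        # Assumption: OpenAI-style structured-output request.
        "response_format": {
            "type": "json_schema",
            "json_schema": {"name": "topic_label", "schema": schema, "strict": True},
        },
    }


def parse_label(content):
    """Read the label from the reply text, falling back to 'other'."""
    topic = json.loads(content).get("topic")
    return topic if topic in LABELS else "other"


print(parse_label('{"topic": "technology"}'))
```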

Limits

Limit	Value
Context window	200K tokens
Max output	8192 tokens
Tool use supported	Yes
Vision supported	Yes
Streaming supported	Yes
Prompt caching supported	Automatic
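Before sending a long document, a client could run a rough pre-flight check against these limits. The sketch below rests on several assumptions: "200K" is read as 200,000 tokens, output tokens are assumed to count against the context window, and the four-characters-per-token ratio is a crude heuristic, not the model's tokenizer.

```python
CONTEXT_WINDOW = 200_000  # tokens; assumes "200K" means 200,000
MAX_OUTPUT = 8192         # tokens, from the table above
CHARS_PER_TOKEN = 4       # rough heuristic, NOT the model's tokenizer


def estimate_tokens(text):
    """Very rough token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN + 1


def fits_context(text, reserved_output=MAX_OUTPUT):
    """Check whether the input plus reserved output likely fits the window."""
    return estimate_tokens(text) + reserved_output <= CONTEXT_WINDOW


print(fits_context("a short article"))
```

A document that fails this check would need to be split or truncated before classification; one that passes may still be rejected if the real token count is higher than the estimate.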
