Google’s gemini-cli (and the family of CLIs / IDE extensions built on top of it) speaks the Gemini Native protocol — query-param auth (?key=...) against /v1beta/models/{model}:generateContent. ByteSpike serves that protocol verbatim at llm.bytespike.ai/v1beta.

Prerequisites

  • A ByteSpike account + a key bound to the gemini-default group (or any group that serves Gemini models). See Register.
  • Gemini CLI installed:
    npm install -g @google/gemini-cli
    # or whichever Gemini CLI your team uses
    

Configure

Gemini CLIs typically read GEMINI_API_KEY and let you override the base URL via env or flag.
export GEMINI_API_KEY="sk-byts-..."
export GEMINI_BASE_URL="https://llm.bytespike.ai/v1beta"
If your CLI does not expose a base-URL flag (the official Google CLI, for example), set whichever variable its documentation names (some versions read GOOGLE_API_BASE_URL), or call the API with the raw curl form shown under Verify from a wrapper script.
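For a wrapper script, the same two variables can be read directly. The following is a minimal sketch, assuming the variable names from the export lines above; load_settings and endpoint_url are hypothetical helper names, not part of any CLI or SDK.

```python
import os
from urllib.parse import urlencode

# Fallback when GEMINI_BASE_URL is unset (the ByteSpike endpoint named in this guide).
DEFAULT_BASE_URL = "https://llm.bytespike.ai/v1beta"


def load_settings(env=None):
    """Read the API key and base URL from the environment (or a dict, for testing)."""
    env = os.environ if env is None else env
    api_key = env.get("GEMINI_API_KEY")
    if not api_key:
        raise RuntimeError("GEMINI_API_KEY is not set")
    # Drop any trailing slash so URL joining below stays predictable.
    base_url = env.get("GEMINI_BASE_URL", DEFAULT_BASE_URL).rstrip("/")
    return api_key, base_url


def endpoint_url(base_url, model, api_key, method="generateContent"):
    """Build {base}/models/{model}:{method}?key=... with the key URL-encoded."""
    return f"{base_url}/models/{model}:{method}?{urlencode({'key': api_key})}"
```

Passing a plain dict as env keeps the helper easy to exercise without touching the real environment.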

Verify

curl "https://llm.bytespike.ai/v1beta/models/gemini-3-1-pro:generateContent?key=$GEMINI_API_KEY" \
  -H "content-type: application/json" \
  -d '{
    "contents": [{"parts": [{"text": "say hi"}]}]
  }'
Expect a 200 with a candidates[0].content.parts[0].text field, plus the standard X-Quota-Remaining-Credits header.
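To check that field programmatically, a small helper along these lines may be enough. It is a sketch only: extract_text is a hypothetical name, and the sample payload is hand-written to mirror the candidates[0].content.parts[0].text path rather than captured from a real response.

```python
def extract_text(response):
    """Return the joined text parts of the first candidate, or raise ValueError."""
    try:
        parts = response["candidates"][0]["content"]["parts"]
    except (KeyError, IndexError, TypeError) as exc:
        raise ValueError(f"unexpected response shape: {response!r}") from exc
    # A candidate may carry several parts; join whatever text is present.
    return "".join(part.get("text", "") for part in parts)


# Hand-written sample in the documented shape (not real model output).
sample = {"candidates": [{"content": {"parts": [{"text": "Hi there!"}]}}]}
print(extract_text(sample))  # prints: Hi there!
```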

Switching models

The model name lives in the URL path:
/v1beta/models/gemini-3-1-pro:generateContent
/v1beta/models/gemini-3-5-flash:generateContent
/v1beta/models/gemini-2-5-flash:generateContent
Any model id from your key’s group works. See /v1beta reference for the full request shape.

Streaming

Switch the method suffix:
curl "https://llm.bytespike.ai/v1beta/models/gemini-3-1-pro:streamGenerateContent?alt=sse&key=$GEMINI_API_KEY" \
  -H "content-type: application/json" \
  -d '{
    "contents": [{"parts": [{"text": "explain SSE briefly"}]}]
  }'
The SSE stream matches Google’s native format: data: {chunk}\n\n blocks, terminated by a final [DONE] marker.
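A client can consume that framing line by line. The sketch below assumes each event's JSON fits on a single data: line and that the stream ends with the [DONE] marker described above; iter_sse_chunks is a hypothetical helper and the sample lines are hand-written.

```python
import json


def iter_sse_chunks(lines):
    """Yield one decoded JSON chunk per `data:` line; stop at the [DONE] marker."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # blank separators, comments, or other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        yield json.loads(payload)


# Hand-written stream in the documented framing (not real model output).
sample_stream = [
    'data: {"candidates": [{"content": {"parts": [{"text": "Server-sent "}]}}]}',
    "",
    'data: {"candidates": [{"content": {"parts": [{"text": "events."}]}}]}',
    "",
    "data: [DONE]",
    "",
]
text = "".join(
    chunk["candidates"][0]["content"]["parts"][0]["text"]
    for chunk in iter_sse_chunks(sample_stream)
)
print(text)  # prints: Server-sent events.
```

The function accepts any iterable of str or bytes lines, so it can be fed an HTTP response object that yields lines as well as a test list.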

SDKs

Google’s official Generative AI SDKs and most third-party Gemini clients accept a base-URL override. The Python SDK takes it as client_options in genai.configure; the JavaScript SDK takes it as request options passed to getGenerativeModel.
Python:
import google.generativeai as genai

# Point the SDK at ByteSpike instead of Google's default endpoint.
genai.configure(
    api_key="sk-byts-...",
    transport="rest",
    client_options={"api_endpoint": "llm.bytespike.ai"},
)
model = genai.GenerativeModel("gemini-3-1-pro")
print(model.generate_content("hello").text)
JavaScript:
import { GoogleGenerativeAI } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI("sk-byts-...");
// baseUrl belongs in the request options (second argument), not the constructor.
const model = genAI.getGenerativeModel(
  { model: "gemini-3-1-pro" },
  { baseUrl: "https://llm.bytespike.ai" },
);
const result = await model.generateContent("hello");
console.log(result.response.text());

Image + video models via Gemini stack

Veo (Google’s video model) ships under the Gemini API surface, but video generation is long-running, so ByteSpike serves it through its async tasks API instead:
curl https://llm.bytespike.ai/v1/tasks/submit \
  -H "Authorization: Bearer $BYTESPIKE_API_KEY" \
  -H "content-type: application/json" \
  -d '{
    "model": "veo-3-1",
    "params": {"prompt": "...", "duration_seconds": 8}
  }'
See POST /tasks/submit and the Veo 3.1 model page.
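The same submission can be prepared from Python with only the standard library. This is a sketch under assumptions: build_veo_request is a hypothetical helper, the JSON content type is assumed, and the response shape is not covered here, so the code only constructs the request and leaves sending and polling to the tasks API reference.

```python
import json
import urllib.request

TASKS_URL = "https://llm.bytespike.ai/v1/tasks/submit"


def build_veo_request(api_key, prompt, duration_seconds=8, model="veo-3-1"):
    """Build (but do not send) a POST request mirroring the curl call above."""
    body = {
        "model": model,
        "params": {"prompt": prompt, "duration_seconds": duration_seconds},
    }
    return urllib.request.Request(
        TASKS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # bearer auth, unlike ?key= on /v1beta
            "Content-Type": "application/json",    # assumed: the body is JSON
        },
        method="POST",
    )
```

Sending it is then a matter of passing the returned object to urllib.request.urlopen and reading the body according to the POST /tasks/submit documentation.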

Troubleshooting

| Symptom | Cause | Fix |
| --- | --- | --- |
| 401 API key not valid | Wrong key or missing query param | Verify ?key=sk-byts-... is present and correct |
| 403 PERMISSION_DENIED | Model not in key’s group | Switch to gemini-default group or pick another model |
| 404 NOT_FOUND (model) | Model id typo: ByteSpike uses the dashed form (gemini-3-1-pro), not Google’s dotted form (gemini-3.1-pro) | Use the slugs from /api/v1/me/available-models |
| Stream cuts after first chunk | Some CLIs default to non-streaming endpoint; check the :streamGenerateContent suffix | Switch to streamGenerateContent |
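For the 404 case, a wrapper can normalize Google-style ids before building the URL. The sketch below assumes the only difference between the two forms is the dot between version digits; to_dashed_model_id is a hypothetical helper, and the list returned by /api/v1/me/available-models remains the authoritative source of slugs.

```python
import re


def to_dashed_model_id(model_id):
    """Replace dots between digits with dashes, e.g. gemini-3.1-pro -> gemini-3-1-pro."""
    return re.sub(r"(?<=\d)\.(?=\d)", "-", model_id)


print(to_dashed_model_id("gemini-3.1-pro"))  # prints: gemini-3-1-pro
```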

Next

  • /v1beta reference: Full request / response / streaming protocol.
  • Gemini models: Models, capabilities, pricing.
  • Claude Code CLI: The Anthropic-native equivalent.
  • Cursor IDE: Editor-level Gemini integration.