DOSIA Agent mode runs on the Anthropic Messages protocol (POST /v1/messages). That’s not negotiable: agent frameworks worth using need tool_use blocks, cache_control markers, and thinking blocks to pass through end-to-end, and the Anthropic shape is the only protocol that exposes all three as first-class concepts. DOSIA Chat mode is a different story: it runs on OpenAI Chat Completions and works across every chat-shape model on ByteSpike. This page is about Agent mode specifically: what works today, what’s planned, and what’s worth knowing per model.

The protocol surface

A DOSIA Agent request lands on https://llm.bytespike.ai/v1/messages with the standard Anthropic shape:
```json
{
  "model": "claude-sonnet-4-6",
  "max_tokens": 2048,
  "tools": [
    {
      "name": "search_files",
      "description": "Search the local workspace for files.",
      "input_schema": { ... }
    }
  ],
  "messages": [
    { "role": "user", "content": "Find every place we set the locale cookie." }
  ]
}
```
tool_use content blocks come back in the response; DOSIA executes the tool; the next turn sends a tool_result block back. Standard Anthropic Messages agent loop. The cache_control: { type: "ephemeral" } markers DOSIA puts on its system prompt and on stable context (workspace tree, recent edits) flow through to whichever model serves the request — see the cache_control note per model below.
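The loop described above can be sketched in Python. This is a minimal illustration under stated assumptions, not DOSIA's actual implementation: `build_request`, `tool_results_turn`, and the `run_tool` callback are hypothetical names, and the HTTP call itself is left out so the block runs offline. The block shapes follow the Anthropic Messages format shown in the request above.

```python
from typing import Callable, Optional


def build_request(model: str, system_text: str, tools: list, messages: list) -> dict:
    """Assemble a /v1/messages body with an ephemeral cache marker on the system prompt."""
    return {
        "model": model,
        "max_tokens": 2048,
        "system": [
            {
                "type": "text",
                "text": system_text,
                # Marker on stable context; ignored by models without cache support.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "tools": tools,
        "messages": messages,
    }


def tool_results_turn(response: dict, run_tool: Callable[[str, dict], str]) -> Optional[dict]:
    """Run every tool_use block in a response and build the next user turn.

    Returns None when the response contains no tool_use blocks,
    which is the signal that the agent loop can stop.
    """
    results = []
    for block in response.get("content", []):
        if block.get("type") == "tool_use":
            output = run_tool(block["name"], block["input"])
            results.append(
                {"type": "tool_result", "tool_use_id": block["id"], "content": output}
            )
    if not results:
        return None
    return {"role": "user", "content": results}
```

In a full loop, the caller would append the assistant response and the returned turn to `messages`, then send the next request, repeating until `tool_results_turn` returns `None`.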

Which models Agent mode can target

Agent mode benefits from the same routing as every other request — but the protocol constrains the eligible set to models that support an Anthropic Messages surface. Today’s eligible set:
| Model family | Status | Notes |
| --- | --- | --- |
| claude-haiku-4-5 / sonnet-4-5 / sonnet-4-6 / opus-4-7 / opus-4-8 | ✅ live | Native Anthropic shape; everything works |
| deepseek-v4-pro / deepseek-v4-flash | ✅ live | See DeepSeek caveats below |
| kimi-k2-6 (anthropic-compat alias) | ⏳ planned | Anthropic-compat surface in flight |
| GLM (anthropic-compat alias) | ⏳ planned | Same as above |
| MiniMax (anthropic-compat alias) | ⏳ planned | Same as above |
GPT and Gemini do not appear here — they don’t support an Anthropic Messages surface, and ByteSpike does not synthesize one (the protocol-mapping cost in fidelity is not worth it for agent use). For GPT or Gemini, DOSIA Chat mode is the route — see endpoint types.
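A client that handles both modes could encode the eligibility rule as a small lookup. The sketch below is illustrative only: `dosia_mode` is a hypothetical helper, and the expanded IDs `claude-sonnet-4-5`, `claude-opus-4-7`, and `claude-opus-4-8` assume the abbreviated table entries carry the same `claude-` prefix as the first one.

```python
# Models listed as live for Agent mode in the eligibility table.
# Expanded "claude-" names are an assumption about the full model IDs.
AGENT_LIVE = frozenset(
    {
        "claude-haiku-4-5",
        "claude-sonnet-4-5",
        "claude-sonnet-4-6",
        "claude-opus-4-7",
        "claude-opus-4-8",
        "deepseek-v4-pro",
        "deepseek-v4-flash",
    }
)


def dosia_mode(model: str) -> str:
    """Return "agent" for models with a live Anthropic Messages surface, else "chat"."""
    return "agent" if model.lower() in AGENT_LIVE else "chat"
```

Planned models such as kimi-k2-6 fall through to "chat" here until their anthropic-compat surface is live.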

Picking a model for Agent

A short opinionated decision aid for DOSIA Agent users:
| Scenario | Pick | Why |
| --- | --- | --- |
| Default agent work, broad capability | claude-sonnet-4-6 | Tool use + thinking + cache_control + web_search all together |
| Codebase-scale agent (full repo into context) | claude-opus-4-8 | 200K context window, current Anthropic flagship |
| Cost-optimized agent at production scale | claude-haiku-4-5 | Tool use included; thinking is not, but most agent loops don’t need extended thinking |
| Chinese-language workloads, cost-sensitive | deepseek-v4-pro | ~10× cheaper than Sonnet; reasoning chain available |
| Chinese-language at the cheapest tier | deepseek-v4-flash | Haiku-class price; subset of Pro’s capabilities |
The recommended-paths table in models/index covers the same ground from the model-first angle.

cache_control per model

cache_control: { type: "ephemeral" } markers behave differently per model:
  • Claude models: full first-class support. Cache write is 1.25× input; cache read is ~10% of input. TTL refreshes on each hit.
  • DeepSeek models: cache_control is not yet supported. The marker passes through unchanged and is ignored. No caching benefit, but no error either.
  • Kimi / GLM / MiniMax: same as DeepSeek today. The anthropic-compat aliases accept the shape but caching is not yet wired through.
This is a known limitation. The current recommendation: keep cache_control markers in your DOSIA Agent system prompt regardless of which model you’re targeting — they’re free when ignored and they activate automatically once a model gains support.
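To make the Claude multipliers concrete, the arithmetic can be worked through in a few lines. This is a back-of-the-envelope sketch: `relative_prefix_cost` is a hypothetical helper, it uses the 1.25× write and ~10% read figures quoted above, and it assumes every follow-up request reuses the cached prefix before the TTL expires.

```python
def relative_prefix_cost(hits: int, write_mult: float = 1.25, read_mult: float = 0.10) -> float:
    """Cost of a cached prefix relative to sending it uncached on every request.

    One request writes the cache (write_mult x input price); each of the
    `hits` later requests reads it (read_mult x input price). The uncached
    baseline pays the full input price on all 1 + hits requests.
    """
    if hits < 0:
        raise ValueError("hits must be non-negative")
    cached = write_mult + hits * read_mult
    uncached = 1 + hits
    return cached / uncached
```

Under these assumptions a prefix that is never reused costs 1.25× the uncached price, a single reuse brings the ratio to about 0.68, and nine reuses bring it to about 0.22. For models that ignore the marker the ratio is simply 1.0, which is why leaving the markers in place costs nothing.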

DeepSeek caveats

DOSIA Agent against deepseek-v4-pro / deepseek-v4-flash is fully supported today, with three caveats:
  1. No cache_control. As above.
  2. Vision is not available on the DeepSeek API. DeepSeek’s models don’t accept image input over the API (in either the OpenAI or the anthropic-compat shape). If your Agent expects to send image content blocks, route those requests to a Claude model (or to gpt-5-4 through Chat mode, since GPT models are not eligible for Agent mode); keep the rest on DeepSeek.
  3. thinking blocks appear as reasoning_content on the OpenAI endpoint but as proper thinking blocks on the anthropic-compat endpoint. DOSIA Agent uses the anthropic-compat path so you get the native shape; the difference matters only if you switch protocols.
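The vision caveat can be handled with a small per-request routing check. The sketch below is one possible approach, not a DOSIA feature: `has_image` and `pick_model` are hypothetical helpers, the default and fallback model names are illustrative choices, and only top-level `image` content blocks are inspected (images nested inside tool_result content are not detected).

```python
def has_image(messages: list) -> bool:
    """Return True if any message carries a top-level image content block."""
    for message in messages:
        content = message.get("content")
        # String content cannot contain image blocks; only lists of blocks can.
        if isinstance(content, list):
            for block in content:
                if isinstance(block, dict) and block.get("type") == "image":
                    return True
    return False


def pick_model(
    messages: list,
    default: str = "deepseek-v4-pro",
    vision_fallback: str = "claude-sonnet-4-6",
) -> str:
    """Keep text-only requests on the default; move image requests off DeepSeek."""
    if default.startswith("deepseek-") and has_image(messages):
        return vision_fallback
    return default
```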

Failure modes

What can go wrong on a DOSIA Agent → ByteSpike call, and what each failure looks like:
| Symptom | Likely cause | Where to look |
| --- | --- | --- |
| 404 on /v1/messages | Model name not eligible for Agent (e.g. you sent a GPT model) | Send a Claude / DeepSeek / future-supported model. See the eligibility table above. |
| 422 with “tool_use not supported” | Model doesn’t expose an anthropic-compat surface yet | Switch the request to Claude or DeepSeek; check the coverage matrix |
| 5xx | The model is temporarily unavailable | ByteSpike auto-retries within your key’s group. If everything in the group is unavailable, the gateway surfaces the error. |
| Mid-stream error event | The response was aborted mid-stream | Zero credits charged (see credits and billing); DOSIA surfaces it to the user as a streaming-failure toast |
For the broader failure-billing policy, see credits and billing. For retries and idempotency, see error handling.
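A client wrapping the Agent endpoint might translate the table above into a small dispatch function. This is a sketch under assumptions: `next_step` and `stream_failed` are hypothetical helpers, the returned action labels are made up for illustration, and the mid-stream check assumes each parsed stream event is a dict whose `type` field is `"error"` on failure, as in Anthropic-style streaming.

```python
def next_step(status: int, body: str = "") -> str:
    """Map an HTTP failure from /v1/messages to a suggested client action."""
    if status == 404:
        # Model name is not eligible for Agent mode.
        return "switch_model"
    if status == 422 and "tool_use not supported" in body:
        # Model has no anthropic-compat surface yet.
        return "switch_model"
    if 500 <= status < 600:
        # The gateway already retried within the key's group before surfacing this.
        return "retry_later"
    return "inspect"


def stream_failed(events: list) -> bool:
    """Return True if a parsed event stream contains an error event."""
    return any(isinstance(e, dict) and e.get("type") == "error" for e in events)
```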

Configuring DOSIA Cloud Enterprise

For DOSIA Cloud Enterprise admins building permission templates: the Global edition and China edition presets pre-select the right Agent default per region. See the DOSIA recommended paths section of the models index for the table that maps each preset to its Agent and Chat defaults.
