POST /v1/messages. That’s not negotiable: agent frameworks worth using need tool_use blocks, cache_control blocks, and thinking blocks to pass through end-to-end, and the Anthropic shape is the only protocol that exposes all three as first-class concepts.
DOSIA Chat mode is a different story — it runs on OpenAI Chat Completions and works across every chat-shape model on ByteSpike. This page is about Agent mode specifically: what works today, what’s planned, what’s worth knowing per model.
The protocol surface
A DOSIA Agent request lands onhttps://llm.bytespike.ai/v1/messages with the standard Anthropic shape:
tool_use content blocks come back in the response; DOSIA executes the tool; the next turn sends a tool_result block back. Standard Anthropic Messages agent loop.
The cache_control: { type: "ephemeral" } markers DOSIA puts on its system prompt and on stable context (workspace tree, recent edits) flow through to whichever model serves the request — see the cache_control note per model below.
Which models Agent mode can target
Agent mode benefits from the same routing as every other request — but the protocol constrains the eligible set to models that support an Anthropic Messages surface. Today’s eligible set:| Model family | Status | Notes |
|---|---|---|
claude-haiku-4-5 / sonnet-4-5 / sonnet-4-6 / opus-4-7 / opus-4-8 | ✅ live | Native Anthropic shape; everything works |
deepseek-v4-pro / deepseek-v4-flash | ✅ live | See DeepSeek caveats below |
kimi-k2-6 (anthropic-compat alias) | ⏳ planned | Anthropic-compat surface in flight |
| GLM (anthropic-compat alias) | ⏳ planned | Same as above |
| MiniMax (anthropic-compat alias) | ⏳ planned | Same |
Picking a model for Agent
A short opinionated decision aid for DOSIA Agent users:| Scenario | Pick | Why |
|---|---|---|
| Default agent work, broad capability | claude-sonnet-4-6 | Tool use + thinking + cache_control + web_search all together |
| Codebase-scale agent (full repo into context) | claude-opus-4-8 | 200K context window, current Anthropic flagship |
| Cost-optimized agent at production scale | claude-haiku-4-5 | Tool use included; thinking is not, but most agent loops don’t need extended thinking |
| Chinese-language workloads, cost-sensitive | deepseek-v4-pro | ~10× cheaper than Sonnet; reasoning chain available |
| Chinese-language at the cheapest tier | deepseek-v4-flash | Haiku-class price; subset of Pro’s capabilities |
cache_control per model
cache_control: { type: "ephemeral" } markers behave differently per model:
- Claude models — full first-class support. Cache write is 1.25× input; cache read is ~10% of input. TTL refreshes on each hit.
- DeepSeek models —
cache_controlis not yet supported. The marker passes through unchanged and is ignored. No caching benefit, but no error either. - Kimi / GLM / MiniMax — same as DeepSeek today. The anthropic-compat aliases accept the shape but caching is not yet wired through.
cache_control markers in your DOSIA Agent system prompt regardless of which model you’re targeting — they’re free when ignored and they activate automatically once a model gains support.
DeepSeek caveats
DOSIA Agent againstdeepseek-v4-pro / deepseek-v4-flash is fully supported today, with three caveats:
- No
cache_control. As above. - Vision is not available on the DeepSeek API. DeepSeek’s models don’t accept image input over the API (either the OpenAI or anthropic-compat shape). If your Agent expects to send
imagecontent blocks, route those requests to a Claude orgpt-5-4model; keep the rest on DeepSeek. thinkingblocks appear asreasoning_contenton the OpenAI endpoint but as properthinkingblocks on the anthropic-compat endpoint. DOSIA Agent uses the anthropic-compat path so you get the native shape; the difference matters only if you switch protocols.
Failure modes
What can go wrong on a DOSIA Agent → ByteSpike call, and what each failure looks like:| Symptom | Likely cause | Where to look |
|---|---|---|
404 on /v1/messages | Model name not eligible for Agent (e.g. you sent a GPT model) | Send a Claude / DeepSeek / future-supported model. See the eligibility table above. |
| 422 with “tool_use not supported” | Model doesn’t expose an anthropic-compat surface yet | Switch the request to Claude or DeepSeek; check the coverage matrix |
| 5xx | The model is temporarily unavailable | ByteSpike auto-retries within your key’s group. If everything in the group is unavailable, the gateway surfaces the error. |
Mid-stream error event | The response was aborted mid-stream | Zero credits charged (see credits and billing); DOSIA will surface to the user as a streaming-failure toast |
Configuring DOSIA Cloud Enterprise
For DOSIA Cloud Enterprise admins building permission templates: the Global edition and China edition presets pre-select the right Agent default per region. See the models index DOSIA recommended paths section for the table mapping each preset to its Agent + Chat defaults.Next
- Endpoint types — the full protocol map
- Models index — per-model docs including the DOSIA recommended-paths table