deepseek-v4-pro is DeepSeek’s reasoning flagship and the most cost-effective high-capability model on ByteSpike. A fraction of the cost of gpt-5-4 at comparable quality on most benchmarks, with the same dual-protocol surface that DeepSeek exposes natively.
Pricing: 0.87 / 1M output, $0.004 / 1M cache read — see the rate card.
DeepSeek V4 is a distinct series from the older V3.2, at different price points. For V3.2 see deepseek-v3-2.
Protocols
| Protocol | Path |
|---|---|
| Anthropic Messages | POST https://llm.bytespike.ai/v1/messages |
| OpenAI Chat Completions | POST https://llm.bytespike.ai/v1/chat/completions |
Quickstart
Capabilities
| Capability | Supported |
|---|---|
| Chat Completions | ✅ |
| Anthropic Messages | ✅ |
| Streaming (SSE) | ✅ |
| Tool use (function calling) | ✅ parallel |
| JSON mode | ✅ |
| Reasoning chain | ✅ (reasoning_content field / thinking block) |
| Vision (HTTP API) | ❌ |
| Context window | 64K tokens |
- On the OpenAI endpoint:
reasoning_contentfield on the choice. - On the Anthropic endpoint: a
thinkingblock ahead of thetextblock incontent[].
When to use
- Cost-sensitive reasoning — when you’d reach for
gpt-5-4but want a fraction of the cost. - Agents on a budget —
tool_useblocks pass through transparently via the Anthropic endpoint; DOSIA Agent uses this model heavily. - DOSIA Agent mode — must use the Anthropic Messages endpoint.
- Vision-required tasks — not on HTTP API today.
- Web search — DeepSeek does not expose a grounding tool.
- Long context — 64K is the ceiling here; go to
claude-opus-4-8(200K) orgemini-3-1-pro(1M).
Next
- deepseek-v4-flash — small-mid
- deepseek-v3.2 — older V3 series