Vendor: Moonshot
Model ID: kimi-k2-5
Capability: 256K context · tool use · streaming · structured output · CJK-native
Pricing: per-token, mid tier (live rate)

Kimi K2.5 is Moonshot’s prior flagship. It understands Chinese, Japanese, and Korean prompts natively, and its 256K context window made it the default for long-document extraction in the Chinese market. It is still production-capable; for new work, Kimi K2.6 is the recommended starting point.

Request

curl https://llm.bytespike.ai/v1/chat/completions \
  -H "Authorization: Bearer $BYTESPIKE_API_KEY" \
  -H "content-type: application/json" \
  -d '{
    "model": "kimi-k2-5",
    "messages": [{"role": "user", "content": "提取这篇合同里的关键条款。"}]
  }'

Body parameters

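| Field           | Type    | Required | Default   | Notes                       |
|-----------------|---------|----------|-----------|-----------------------------|
| model           | string  | yes      |           | kimi-k2-5                   |
| messages        | array   | yes      |           | CJK accepted natively.      |
| max_tokens      | integer | no       | model max | Max: 8192.                  |
| tools           | array   | no       |           | Function calling supported. |
| response_format | object  | no       |           | JSON mode.                  |
| stream          | boolean | no       | false     | SSE streaming.              |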
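
As a rough illustration of how the fields above fit together, the following Python sketch assembles a request body and wraps it in an HTTP request using only the standard library. The helper names `build_body` and `build_http_request` are hypothetical and not part of any SDK; the endpoint, headers, and the 8192 ceiling are taken from this page. The shape of `tools` and `response_format` is not specified here, so the sketch passes them through unchanged.

```python
import json
import urllib.request

API_URL = "https://llm.bytespike.ai/v1/chat/completions"  # endpoint from the Request example
MAX_OUTPUT_TOKENS = 8192  # documented ceiling for max_tokens


def build_body(messages, max_tokens=None, tools=None, response_format=None, stream=False):
    """Assemble a request body, omitting optional fields that were not set."""
    if not messages:
        raise ValueError("messages is required and must not be empty")
    body = {"model": "kimi-k2-5", "messages": messages}
    if max_tokens is not None:
        if not 1 <= max_tokens <= MAX_OUTPUT_TOKENS:
            raise ValueError(f"max_tokens must be between 1 and {MAX_OUTPUT_TOKENS}")
        body["max_tokens"] = max_tokens
    if tools:
        body["tools"] = tools  # passed through as-is; schema not documented on this page
    if response_format is not None:
        body["response_format"] = response_format  # passed through as-is
    if stream:
        body["stream"] = True
    return body


def build_http_request(body, api_key):
    """Wrap the body in a POST request; hand the result to urllib.request.urlopen to send it."""
    return urllib.request.Request(
        API_URL,
        # ensure_ascii=False keeps CJK text as UTF-8 instead of \u escapes
        data=json.dumps(body, ensure_ascii=False).encode("utf-8"),
        headers={"Authorization": f"Bearer {api_key}", "content-type": "application/json"},
        method="POST",
    )
```

Validating `max_tokens` on the client is a design choice in this sketch: it fails fast before a request is sent, rather than relying on the API to reject an out-of-range value.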

Response

{
  "id": "chatcmpl-…",
  "object": "chat.completion",
  "model": "kimi-k2-5",
  "choices": [{"index": 0, "message": {"role": "assistant", "content": "..."}, "finish_reason": "stop"}],
  "usage": {"prompt_tokens": 84210, "completion_tokens": 312, "total_tokens": 84522}
}
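
To show how the response above might be consumed, here is a minimal Python sketch that pulls out the assistant text, the finish reason, and the token counts. The function name `summarize_response` is hypothetical; the field names are the ones shown in the sample response, and the sketch assumes the first entry in `choices` is the one of interest.

```python
import json


def summarize_response(raw):
    """Extract the assistant text and token usage from a chat.completion JSON string."""
    data = json.loads(raw)
    choice = data["choices"][0]  # assumption: only the first choice matters
    usage = data["usage"]
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "prompt_tokens": usage["prompt_tokens"],
        "completion_tokens": usage["completion_tokens"],
        # sanity check: the two parts should add up to the reported total
        "usage_consistent": usage["prompt_tokens"] + usage["completion_tokens"]
        == usage["total_tokens"],
    }
```

With the sample values above, 84210 prompt tokens plus 312 completion tokens equals the reported total of 84522, so `usage_consistent` would be true.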

Code examples

curl https://llm.bytespike.ai/v1/chat/completions \
  -H "Authorization: Bearer $BYTESPIKE_API_KEY" \
  -H "content-type: application/json" \
  -d '{"model": "kimi-k2-5", "messages": [{"role": "user", "content": "提取关键条款"}]}'

Streaming + caching

Set "stream": true to receive the response as server-sent events (SSE). Prompt caching is applied automatically.

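This page does not show what the streamed events look like, so the following Python sketch rests on an assumption: that each event is a `data: {json}` line carrying text in `choices[0].delta.content`, and that the stream ends with a `data: [DONE]` line, as in other OpenAI-style chat completion APIs. If the actual wire format differs, the parsing needs to change accordingly. The function name `iter_stream_text` is hypothetical.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from already-decoded SSE lines.

    ASSUMPTION: OpenAI-style chunks ("data: {...}" with choices[].delta.content,
    terminated by "data: [DONE]"). This format is not confirmed by this page.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            piece = choice.get("delta", {}).get("content")
            if piece:
                yield piece
```

Joining the yielded fragments with `"".join(...)` reconstructs the full message once the stream ends.
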
Errors

| Code                        | Trigger  | Billed?         |
|-----------------------------|----------|-----------------|
| 400 / 401 / 402 / 422 / 429 | Standard | No              |
| 5xx                         | Upstream | No (auto-retry) |
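
A client might encode the table above as a small lookup, for example to decide how to log a failed call. The sketch below is only an illustration: `classify_error` is a hypothetical helper, and status codes the table does not list are reported as undocumented rather than guessed at.

```python
STANDARD_ERROR_CODES = {400, 401, 402, 422, 429}  # listed as "Standard", not billed


def classify_error(status):
    """Map an HTTP status to (trigger, billed) according to the errors table.

    billed is None when the table says nothing about the status.
    """
    if status in STANDARD_ERROR_CODES:
        return ("standard", False)
    if 500 <= status <= 599:
        return ("upstream", False)  # the table notes these are auto-retried
    return ("undocumented", None)
```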

When to use

  • Long-document extraction in Chinese / Japanese / Korean.
  • Existing code validated against this exact version.
  • For new work, prefer Kimi K2.6.
  • For Western-market alternatives at similar context, see Gemini 3.1 Pro (1M).

Limits

| Limit                   | Value       |
|-------------------------|-------------|
| Context window          | 256K tokens |
| Max output              | 8192 tokens |
| Supports tool use       | Yes         |
| Supports vision         | No          |
| Supports streaming      | Yes         |
| Supports prompt caching | Automatic   |
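
For long-document work it can help to check the limits above before sending a request. The Python sketch below makes two assumptions that this page does not confirm: that "256K" means 256,000 tokens (it could also mean 262,144), and that prompt and output tokens share the same context window. The helper name `output_budget` is hypothetical, and the prompt token count must come from a tokenizer or a previous response's `usage` field.

```python
CONTEXT_WINDOW = 256_000  # ASSUMPTION: "256K" read as 256,000 tokens
MAX_OUTPUT = 8192  # documented max output


def output_budget(prompt_tokens):
    """Return the largest max_tokens that fits, or 0 if the prompt alone is too long.

    ASSUMPTION: prompt and completion tokens count against one shared window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    return max(0, min(MAX_OUTPUT, remaining))
```

Under these assumptions, a prompt of 84210 tokens (the size in the sample response) leaves room for the full 8192-token output, while a prompt of 250,000 tokens would leave 6000.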