glm-5-1
Capability: 128K context · vision · tool use · streaming · structured output · CJK-native
Pricing: per-token, mid tier (live rate)
GLM-5.1 is the refinement step on GLM-5 — same context window with
added vision support, tighter tool-call argument generation, and a
measurable quality bump on Chinese-market code generation. Default
starting point for new Chinese-market projects on the gateway.
Request
Body parameters
| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
model | string | yes | — | glm-5-1 |
messages | array | yes | — | CJK accepted natively. Vision via image_url blocks. |
max_tokens | integer | no | model max | Max: 16384. |
tools | array | no | — | Function calling supported (parallel). |
response_format | object | no | — | JSON / structured output. |
stream | boolean | no | false | SSE streaming. |
Response
Code examples
Streaming + caching
"stream": true for SSE. Automatic prompt caching.
Errors
| Code | Trigger | Billed? |
|---|---|---|
| 400 / 401 / 402 / 422 / 429 | Standard | No |
| 5xx | Upstream | No (auto-retry) |
When to use
- New Chinese-market projects with mixed text + vision input.
- Tool-using agents in CJK languages.
- For prior version (no vision), see GLM-5.
- For longer context, see Kimi K2.6.
- For Chinese code-heavy work, see DeepSeek V4 Pro.
Limits
| Limit | Value |
|---|---|
| Context window | 128K tokens |
| Max output | 16384 tokens |
| Supports tool use | Yes (parallel) |
| Supports vision | Yes |
| Supports streaming | Yes |
| Supports prompt caching | Automatic |