Available GLM models
| Model | Context window (tokens) | Strengths |
|---|---|---|
| GLM-5.1 | 128K | Latest generation, strongest reasoning |
| GLM-5 | 128K | Strong general-purpose chat |
| GLM-4.7 | 128K | Cost-effective, fast inference |
Basic chat completion
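
A minimal sketch of a single-turn chat request follows. It assumes an OpenAI-style chat completions endpoint (`/chat/completions`, a `messages` array, and a `choices[0].message.content` response); none of that is confirmed by this page. `baseUrl`, `apiKey`, and the helper names `buildChatRequest` and `chat` are hypothetical, and the exact model identifier string expected by the API may differ from the display names in the table above.

```typescript
// Sketch only: endpoint path, payload shape, and response shape are assumptions
// (OpenAI-style chat completions), not taken from this guide.

type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

interface ChatRequest {
  model: string;
  messages: ChatMessage[];
  temperature?: number;
}

// Build the JSON body for a single-turn chat request.
function buildChatRequest(model: string, userPrompt: string, temperature?: number): ChatRequest {
  const body: ChatRequest = {
    model,
    messages: [{ role: "user", content: userPrompt }],
  };
  // Only include temperature when the caller sets it explicitly.
  if (temperature !== undefined) body.temperature = temperature;
  return body;
}

// Send the request. `baseUrl` and `apiKey` are placeholders supplied by the caller.
async function chat(baseUrl: string, apiKey: string, request: ChatRequest): Promise<string> {
  const res = await fetch(`${baseUrl}/chat/completions`, {
    method: "POST",
    headers: { "Content-Type": "application/json", Authorization: `Bearer ${apiKey}` },
    body: JSON.stringify(request),
  });
  if (!res.ok) throw new Error(`Chat request failed with status ${res.status}`);
  // Assumed response shape: { choices: [{ message: { content } }] }
  const data = (await res.json()) as { choices: { message: { content: string } }[] };
  const first = data.choices[0];
  if (!first) throw new Error("Response contained no choices");
  return first.message.content;
}
```

Keeping the request builder separate from the network call makes the payload easy to inspect or unit-test without an API key.
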
Streaming with Vercel AI SDK
Structured output
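
One way to work with structured output, sketched under assumptions: ask the model to reply with JSON only, then parse and validate the reply before using it. The `SentimentResult` shape and the helper names below are hypothetical illustrations, and this page does not confirm whether the API offers a dedicated JSON or schema mode.

```typescript
// Hypothetical target shape for the model's JSON reply.
interface SentimentResult {
  sentiment: "positive" | "negative" | "neutral";
  confidence: number; // expected to lie in [0, 1]
}

// Runtime type guard: checks that an unknown value matches SentimentResult.
function isSentimentResult(value: unknown): value is SentimentResult {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const sentiment = v["sentiment"];
  const confidence = v["confidence"];
  return (
    (sentiment === "positive" || sentiment === "negative" || sentiment === "neutral") &&
    typeof confidence === "number" &&
    confidence >= 0 &&
    confidence <= 1
  );
}

// Parse the raw model reply; return null when it is not valid JSON
// or does not match the expected shape.
function parseSentiment(raw: string): SentimentResult | null {
  try {
    const parsed: unknown = JSON.parse(raw);
    return isSentimentResult(parsed) ? parsed : null;
  } catch {
    return null;
  }
}
```

Returning `null` instead of throwing lets the caller decide whether to retry the request or fall back to a default.
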
Comparing GLM generations
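
A simple way to compare generations is to send the same prompt to each model and record the output and wall-clock latency side by side. The sketch below takes the completion call as an injected function, so it makes no assumption about the client library; `compareModels`, `Complete`, and `ComparisonRow` are hypothetical names.

```typescript
// The caller supplies the function that actually talks to the API.
type Complete = (model: string, prompt: string) => Promise<string>;

interface ComparisonRow {
  model: string;
  output: string;
  latencyMs: number;
}

// Run the same prompt against each model in turn and collect the results.
// Requests are sequential so that latencies are not skewed by concurrency.
async function compareModels(
  models: string[],
  prompt: string,
  complete: Complete,
): Promise<ComparisonRow[]> {
  const rows: ComparisonRow[] = [];
  for (const model of models) {
    const start = Date.now();
    const output = await complete(model, prompt);
    rows.push({ model, output, latencyMs: Date.now() - start });
  }
  return rows;
}
```

A single run measures very little; repeating the comparison over several prompts gives a more reliable picture of quality and speed.
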
Choosing the right GLM model
| Use case | Model | Reason |
|---|---|---|
| Complex reasoning | GLM-5.1 | Strongest in the family |
| General chat | GLM-5 | Good balance of quality and speed |
| High-volume, cost-sensitive | GLM-4.7 | Fastest, lowest cost per token |
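
The table above can be encoded as a small lookup so that model selection lives in one place. This is a sketch: the `UseCase` labels and `chooseModel` helper are hypothetical, and the model strings are the display names from the table, which may not match the identifiers the API expects.

```typescript
// Hypothetical labels for the three use cases in the table.
type UseCase = "complex-reasoning" | "general-chat" | "high-volume";

// Display names from the table; the API's model identifiers may differ.
const MODEL_BY_USE_CASE: Record<UseCase, string> = {
  "complex-reasoning": "GLM-5.1",
  "general-chat": "GLM-5",
  "high-volume": "GLM-4.7",
};

// Fall back to GLM-5.1, the recommended default, when no use case is given.
function chooseModel(useCase?: UseCase): string {
  return useCase ? MODEL_BY_USE_CASE[useCase] : "GLM-5.1";
}
```
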
Tips
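- Default model: use GLM-5.1 unless cost per token matters more than reasoning strength, in which case GLM-4.7 is the cheaper choice.
- Multilingual: GLM models handle both Chinese and English well, so prompts can be written in either language.
- Temperature: 0.3–0.5 works best for factual tasks; 0.7–0.9 suits creative writing.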
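
The temperature guidance above can be applied as a clamp, so a requested value is pulled into the suggested range for the task. A minimal sketch; `TaskKind` and `clampTemperature` are hypothetical names, and the ranges are the ones given in the tip.

```typescript
type TaskKind = "factual" | "creative";

// Suggested [min, max] temperature ranges from the tip above.
const TEMPERATURE_RANGES: Record<TaskKind, readonly [number, number]> = {
  factual: [0.3, 0.5],
  creative: [0.7, 0.9],
};

// Clamp a requested temperature into the suggested range for the task.
function clampTemperature(task: TaskKind, requested: number): number {
  const [min, max] = TEMPERATURE_RANGES[task];
  return Math.min(Math.max(requested, min), max);
}
```

For example, `clampTemperature("factual", 0.9)` returns `0.5`, the top of the factual range.
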
Next steps
- Chat completions reference
- Model catalog
- AI Chatbot with Next.js — build a chat UI with GLM