GLM-4.6
GLM-4.6 is the latest version of Z.ai's flagship model, bringing significant improvements over GLM-4.5. Key features include a 200K-token context window (expanded from 128K); stronger coding performance, with better real-world results in Claude Code, Cline, Roo Code, and Kilo Code; advanced reasoning with tool use during inference; stronger agent capabilities; and refined writing aligned with human preferences. GLM-4.6 performs competitively with DeepSeek-V3.2-Exp and Claude Sonnet 4, reaching near parity with Claude Sonnet 4 (48.6% win rate) on CC-Bench real-world coding tasks.
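To make the context-window change concrete, here is a minimal sketch of a token-budget check. It assumes "200K" and "128K" can be treated as round token counts (the exact limits a given API enforces may differ), and the function name and the example request sizes are hypothetical, chosen only for illustration.

```python
# Assumption: "200K" and "128K" are treated as round token counts; the exact
# limits enforced by a particular API or deployment may differ.
GLM_4_6_CONTEXT = 200_000
GLM_4_5_CONTEXT = 128_000


def fits_in_context(prompt_tokens: int, max_output_tokens: int, context_window: int) -> bool:
    """Return True if the prompt plus the reserved output budget fits in the window."""
    if prompt_tokens < 0 or max_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + max_output_tokens <= context_window


# Hypothetical request: a 150,000-token prompt with 8,000 tokens reserved for output.
print(fits_in_context(150_000, 8_000, GLM_4_6_CONTEXT))  # True
print(fits_in_context(150_000, 8_000, GLM_4_5_CONTEXT))  # False
```

Under these assumptions, a request of this size would fit in the expanded window but not in the earlier 128K one. Token counts depend on the tokenizer, so real usage would need counts from the model's own tokenizer.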
Benchmark results
| Benchmark | Score | Tags |
|---|---|---|
| AIME 2025 | 93.9% | self-reported llm-stats |
| BrowseComp | 45.1% | self-reported llm-stats |
| GPQA | 81.0% | self-reported llm-stats |
| Humanity's Last Exam (HLE) | 17.2% | self-reported llm-stats |
| LiveCodeBench v6 | 82.8% | self-reported llm-stats |
| SWE-Bench Verified | 68.0% | self-reported llm-stats |
| Terminal-Bench | 40.5% | self-reported llm-stats |