GLM-4.6

GLM-4.6 is the latest version of Z.ai's flagship model and brings significant improvements over GLM-4.5. Key features include a 200K-token context window (expanded from 128K), stronger coding performance that carries over to real-world use in Claude Code, Cline, Roo Code, and Kilo Code, advanced reasoning with tool use during inference, stronger agent capabilities, and more refined writing that aligns better with human preferences. GLM-4.6 performs competitively with DeepSeek-V3.2-Exp and Claude Sonnet 4, reaching near parity with Claude Sonnet 4 (a 48.6% win rate) on CC-Bench real-world coding tasks.

Benchmark results

Benchmark              Score   Tags            Source
AIME 2025              93.9%   self-reported   llm-stats
BrowseComp             45.1%   self-reported   llm-stats
GPQA                   81.0%   self-reported   llm-stats
HLE                    17.2%   self-reported   llm-stats
Humanity's Last Exam   17.2%   self-reported   llm-stats
LiveCodeBench v6       82.8%   self-reported   llm-stats
SWE-Bench Verified     68.0%   self-reported   llm-stats
Terminal-Bench         40.5%   self-reported   llm-stats