GLM-4.6

GLM-4.6 is the latest version of Z.ai's flagship model and improves on GLM-4.5 in several areas. Key features include a 200K-token context window (up from 128K), stronger coding performance with better real-world results in Claude Code, Cline, Roo Code, and Kilo Code, reasoning that supports tool use during inference, stronger agent capabilities, and writing more closely aligned with human preferences. GLM-4.6 is competitive with DeepSeek-V3.2-Exp and Claude Sonnet 4, and reaches near parity with Claude Sonnet 4 on CC-Bench real-world coding tasks (48.6% win rate).
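To make the "near parity" reading of a win rate concrete, here is a minimal sketch of the arithmetic. It assumes the win rate is the share of head-to-head comparisons won outright, with ties counted as non-wins; the description above does not say how CC-Bench treats ties, so that convention, the `win_rate` helper, and the tallies are all hypothetical and chosen only for illustration.

```python
def win_rate(wins: int, ties: int, losses: int) -> float:
    """Return the fraction of all comparisons that were outright wins.

    Assumption: ties count toward the total but not toward wins.
    """
    total = wins + ties + losses
    if total == 0:
        raise ValueError("no comparisons recorded")
    return wins / total


# Hypothetical tallies, not CC-Bench data.
rate = win_rate(wins=486, ties=100, losses=414)
print(f"win rate: {rate:.1%}")
```

Under this convention a win rate just below 50% can coexist with fewer losses than wins, which is why a figure like 48.6% is read as near parity rather than as a clear deficit.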

Benchmark results

Benchmark	Score	Tags
AIME 2025	93.9%	self-reported llm-stats
BrowseComp	45.1%	self-reported llm-stats
GPQA	81.0%	self-reported llm-stats
Humanity's Last Exam (HLE)	17.2%	self-reported llm-stats
LiveCodeBench v6	82.8%	self-reported llm-stats
SWE-Bench Verified	68.0%	self-reported llm-stats
Terminal-Bench	40.5%	self-reported llm-stats