GLM-4.7

GLM 4.7 is a coding‑centric model that thinks before acting, preserves its reasoning across turns, and lets you control thinking per request for speed or accuracy. It upgrades agentic workflows with stronger multi‑step tool use, better terminal and multilingual coding, and a noticeable jump in UI output quality for modern, clean webpages and slides. You can use it in popular coding agents, call it via the Z.ai API, and even run it locally with public weights on HuggingFace and ModelScope using vLLM or SGLang.

Benchmark results

Benchmark Score Tags Source
AIME 2025 95.7% self-reported llm-stats link →
BrowseComp 52.0% self-reported llm-stats link →
BrowseComp-zh 66.6% self-reported llm-stats link →
GPQA 85.7% self-reported llm-stats link →
Humanity's Last Exam 42.8% self-reported llm-stats link →
IMO-AnswerBench 82.0% self-reported llm-stats link →
LiveCodeBench v6 84.9% self-reported llm-stats link →
MMLU-Pro 84.3% self-reported llm-stats link →
SWE-bench Multilingual 66.7% self-reported llm-stats link →
SWE-Bench Verified 73.8% self-reported llm-stats link →
Tau-bench 87.4% self-reported llm-stats link →
Terminal-Bench 33.3% self-reported llm-stats link →
Terminal-Bench 2.0 41.0% self-reported llm-stats link →