Claude Sonnet 4

Claude Sonnet 4, part of the Claude 4 family, is a significant upgrade over Claude 3.7 Sonnet. It excels at coding (72.7% on SWE-bench Verified) and reasoning, and follows instructions more precisely. Sonnet 4 balances capability and practicality, offers enhanced steerability, and supports extended thinking with tool use.

Benchmark results

Benchmark            Score   Tags            Source
AIME 2025            70.5%   self-reported   llm-stats
GPQA                 75.4%   self-reported   llm-stats
MMMLU                86.5%   self-reported   llm-stats
MMMU                 74.4%   self-reported   llm-stats
SWE-Bench Verified   72.7%   self-reported   llm-stats
TAU-bench Airline    60.0%   self-reported   llm-stats
TAU-bench Retail     80.5%   self-reported   llm-stats
Terminal-Bench       35.5%   self-reported   llm-stats