Claude 3.5 Sonnet

Claude 3.5 Sonnet is a powerful AI model with industry-leading software engineering skills. It excels at coding, planning, and problem-solving, with significant improvements in agentic coding and tool-use tasks.

GSM8k: 96.4%
DocVQA: 95.2%
AI2D: 94.7%
HumanEval: 93.7%
BIG-Bench Hard: 93.1%
MGSM: 91.6%
ChartQA: 90.8%
MMLU: 90.4%
DROP: 87.1%
MATH: 78.3%
MMLU-Pro: 77.6%
TAU-bench Retail: 69.2%
MMMU: 68.3%
MathVista: 67.7%
GPQA: 67.2%
SWE-Bench Verified: 49.0%
TAU-bench Airline: 46.0%
OSWorld Extended: 22.0%
OSWorld Screenshot-only: 14.9%