Claude 3.5 Sonnet

Anthropic's mid-tier model in the Claude 3.5 family.

Benchmark results

Benchmark            Score    Tags    Source
Chatbot Arena        1271
HumanEval            92.0%
MMLU                 88.7%
MMMU                 70.4%
SWE-Bench Verified   49.0%