Claude 3.5 Sonnet

Claude 3.5 Sonnet is a large language model from Anthropic. It performs strongly on graduate-level reasoning (GPQA), undergraduate-level knowledge (MMLU), and coding proficiency (HumanEval), and shows improved understanding of nuance, humor, and complex instructions.

Benchmark results

Benchmark        Score    Tags            Source
BIG-Bench Hard   93.1%    self-reported   llm-stats
DROP             87.1%    self-reported   llm-stats
GPQA             59.4%    self-reported   llm-stats
GSM8K            96.4%    self-reported   llm-stats
HumanEval        92.0%    self-reported   llm-stats
MATH             71.1%    self-reported   llm-stats
MGSM             91.6%    self-reported   llm-stats
MMLU             90.4%    self-reported   llm-stats
MMLU-Pro         76.1%    self-reported   llm-stats