Claude 3 Opus
Claude 3 Opus is Anthropic's most intelligent model, with best-in-market performance on highly complex tasks. It can navigate open-ended prompts and previously unseen scenarios with remarkable fluency and human-like understanding, showing the outer limits of what's possible with generative AI.
Benchmark results
| Benchmark | Score | Tags |
|---|---|---|
| ARC-C | 96.4% | self-reported, llm-stats |
| BIG-Bench Hard | 86.8% | self-reported, llm-stats |
| DROP | 83.1% | self-reported, llm-stats |
| GPQA | 50.4% | self-reported, llm-stats |
| GSM8k | 95.0% | self-reported, llm-stats |
| HellaSwag | 95.4% | self-reported, llm-stats |
| HumanEval | 84.9% | self-reported, llm-stats |
| MATH | 60.1% | self-reported, llm-stats |
| MGSM | 90.7% | self-reported, llm-stats |
| MMLU | 86.8% | self-reported, llm-stats |
| MMLU-Pro | 68.5% | self-reported, llm-stats |