DeepSeek-V3
DeepSeek-V3 is a Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated per token. It uses Multi-head Latent Attention (MLA), auxiliary-loss-free load balancing, and a multi-token prediction training objective. It was pre-trained on 14.8T tokens and reports strong results on reasoning, math, and code benchmarks.
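To put the parameter counts in perspective, the short sketch below works out what share of the model is active for each token and how many pre-training tokens were seen per parameter. It uses only the three figures quoted in the description (671B, 37B, 14.8T); the constant and function names are illustrative and do not come from any DeepSeek code or API.

```python
# Figures taken from the model description above.
TOTAL_PARAMS = 671e9      # total parameters
ACTIVE_PARAMS = 37e9      # parameters activated per token
PRETRAIN_TOKENS = 14.8e12 # pre-training tokens


def active_fraction(total: float, active: float) -> float:
    """Share of parameters used for a single token (hypothetical helper)."""
    return active / total


def tokens_per_param(tokens: float, params: float) -> float:
    """Pre-training tokens seen per parameter (hypothetical helper)."""
    return tokens / params


if __name__ == "__main__":
    frac = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
    print(f"Active per token: {frac:.1%}")  # 5.5%
    print(f"Tokens per total parameter: "
          f"{tokens_per_param(PRETRAIN_TOKENS, TOTAL_PARAMS):.1f}")   # 22.1
    print(f"Tokens per active parameter: "
          f"{tokens_per_param(PRETRAIN_TOKENS, ACTIVE_PARAMS):.0f}")  # 400
```

Roughly 5.5% of the parameters take part in any one token's forward pass. This ratio describes per-token compute only; it says nothing about memory, since every expert still has to be available at inference time.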
Benchmark results
| Benchmark | Score | Tags |
|---|---|---|
| Aider-Polyglot | 49.6% | self-reported llm-stats |
| Aider-Polyglot Edit | 79.7% | self-reported llm-stats |
| AIME 2024 | 39.2% | self-reported llm-stats |
| C-Eval | 86.5% | self-reported llm-stats |
| CLUEWSC | 90.9% | self-reported llm-stats |
| CNMO 2024 | 43.2% | self-reported llm-stats |
| CSimpleQA | 64.8% | self-reported llm-stats |
| DROP | 91.6% | self-reported llm-stats |
| FRAMES | 73.3% | self-reported llm-stats |
| GPQA | 59.1% | self-reported llm-stats |
| HumanEval-Mul | 82.6% | self-reported llm-stats |
| IFEval | 86.1% | self-reported llm-stats |
| LiveCodeBench | 37.6% | self-reported llm-stats |
| LongBench v2 | 48.7% | self-reported llm-stats |
| MATH-500 | 90.2% | self-reported llm-stats |
| MMLU | 88.5% | self-reported llm-stats |
| MMLU-Pro | 75.9% | self-reported llm-stats |
| MMLU-Redux | 89.1% | self-reported llm-stats |
| SimpleQA | 24.9% | self-reported llm-stats |
| SWE-Bench Verified | 42.0% | self-reported llm-stats |