DeepSeek-V3.2
DeepSeek-V3.2 is a 685B-parameter Mixture-of-Experts (MoE) model that pairs high computational efficiency with strong reasoning and agentic performance. It introduces DeepSeek Sparse Attention (DSA) for efficient long-context processing, a scalable reinforcement learning post-training framework, and large-scale agentic task synthesis covering more than 1,800 environments. DeepSeek reports performance comparable to GPT-5 across reasoning, coding, and agentic benchmarks, and its Speciale variant achieved gold-medal results on the 2025 IMO, IOI, ICPC World Finals, and CMO.
Benchmark results
| Benchmark | Score | Tags |
|---|---|---|
| AIME 2025 | 93.1% | self-reported llm-stats |
| BrowseComp | 51.4% | self-reported llm-stats |
| BrowseComp-zh | 65.0% | self-reported llm-stats |
| Codeforces | 79.5% | self-reported llm-stats |
| GPQA | 82.4% | self-reported llm-stats |
| HMMT 2025 | 90.2% | self-reported llm-stats |
| Humanity's Last Exam | 40.8% | self-reported llm-stats |
| IMO-AnswerBench | 78.3% | self-reported llm-stats |
| LiveCodeBench | 83.3% | self-reported llm-stats |
| MCP-Mark | 38.0% | self-reported llm-stats |
| MCP-Universe | 45.9% | self-reported llm-stats |
| MMLU-Pro | 85.0% | self-reported llm-stats |
| SWE-bench Multilingual | 70.2% | self-reported llm-stats |
| SWE-bench Verified | 73.1% | self-reported llm-stats |
| τ²-bench | 80.3% | self-reported llm-stats |
| Terminal-Bench 2.0 | 46.4% | self-reported llm-stats |
| Toolathlon | 35.2% | self-reported llm-stats |