DeepSeek-R1-0528

DeepSeek-R1-0528 is the May 28, 2025 release of DeepSeek's reasoning model. It works through a problem in a thinking mode before answering, which suits it to complex reasoning tasks, mathematical problem-solving, and code generation. It also serves as a comparison baseline for newer models such as DeepSeek-V3.1.

Benchmark results

Benchmark               Score   Tags           Source
Aider-Polyglot          71.6%   self-reported  llm-stats
AIME 2024               91.4%   self-reported  llm-stats
AIME 2025               87.5%   self-reported  llm-stats
BrowseComp              8.9%    self-reported  llm-stats
BrowseComp-zh           35.7%   self-reported  llm-stats
CodeForces              64.3%   self-reported  llm-stats
GPQA                    81.0%   self-reported  llm-stats
HMMT 2025               79.4%   self-reported  llm-stats
Humanity's Last Exam    17.7%   self-reported  llm-stats
LiveCodeBench           73.3%   self-reported  llm-stats
MMLU-Pro                85.0%   self-reported  llm-stats
MMLU-Redux              93.4%   self-reported  llm-stats
SimpleQA                92.3%   self-reported  llm-stats
SWE-bench Multilingual  30.5%   self-reported  llm-stats
SWE-Bench Verified      44.6%   self-reported  llm-stats
Terminal-Bench          5.7%    self-reported  llm-stats