DeepSeek R1 Distill Qwen 1.5B

DeepSeek-R1 is DeepSeek's first-generation reasoning model, built on DeepSeek-V3 (671B total parameters, 37B activated per token). It uses large-scale reinforcement learning (RL) to strengthen chain-of-thought reasoning and performs well on math, code, and multi-step reasoning tasks. DeepSeek-R1-Distill-Qwen-1.5B is a 1.5B-parameter dense model distilled from it: a Qwen2.5-Math-1.5B base fine-tuned on reasoning data generated by DeepSeek-R1.
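To put the "671B total, 37B activated" figures in perspective, the short sketch below computes the share of parameters used per token. It relies only on the two numbers quoted above; the helper name `active_fraction` is hypothetical and introduced purely for illustration.

```python
# Figures quoted above for DeepSeek-V3, the base model of DeepSeek-R1.
TOTAL_PARAMS_B = 671   # total parameters, in billions
ACTIVE_PARAMS_B = 37   # parameters activated per token, in billions


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of parameters activated per token (hypothetical helper)."""
    return active_b / total_b


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"{fraction:.1%} of parameters are active per token")  # prints 5.5%
```

By this arithmetic, roughly one parameter in eighteen is active for any given token in the base model.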

Benchmark results

Benchmark      Score   Tags           Source
AIME 2024      52.7%   self-reported  llm-stats
GPQA           33.8%   self-reported  llm-stats
LiveCodeBench  16.9%   self-reported  llm-stats
MATH-500       83.9%   self-reported  llm-stats