DeepSeek R1 Distill Qwen 1.5B

DeepSeek-R1 is DeepSeek's first-generation reasoning model, built on DeepSeek-V3 (671B total parameters, 37B activated per token). It uses large-scale reinforcement learning (RL) to strengthen chain-of-thought reasoning, targeting math, code, and multi-step reasoning tasks. DeepSeek-R1-Distill-Qwen-1.5B is a much smaller dense model distilled from it: a Qwen2.5-Math-1.5B base fine-tuned on reasoning samples generated by DeepSeek-R1.
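To put the DeepSeek-V3 figures above in perspective, the share of parameters activated per token can be worked out directly. The sketch below is illustrative only: the function name `active_fraction` is hypothetical, and the inputs are the parameter counts quoted above, in billions.

```python
def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Return the fraction of parameters activated per token.

    Both arguments are parameter counts in billions; only their ratio matters.
    """
    if total_params_b <= 0:
        raise ValueError("total_params_b must be positive")
    return active_params_b / total_params_b


# Figures quoted in the description: 671B total, 37B activated per token.
fraction = active_fraction(671, 37)
print(f"Activated per token: {fraction:.1%}")  # prints "Activated per token: 5.5%"
```

Roughly one parameter in eighteen takes part in any single token's forward pass, which is why the activated count, not the total, is the better guide to per-token compute.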

Benchmark results

Benchmark	Score	Tags
AIME 2024	52.7%	self-reported, llm-stats
GPQA	33.8%	self-reported, llm-stats
LiveCodeBench	16.9%	self-reported, llm-stats
MATH-500	83.9%	self-reported, llm-stats