DeepSeek R1 Distill Llama 8B

DeepSeek-R1 is DeepSeek's first-generation reasoning model, built on DeepSeek-V3 (671B total parameters, 37B activated per token). It uses large-scale reinforcement learning (RL) to strengthen chain-of-thought reasoning and performs strongly on math, code, and multi-step reasoning tasks. DeepSeek R1 Distill Llama 8B is a smaller Llama-based 8B model fine-tuned on reasoning data generated by DeepSeek-R1.

MATH-500: 89.1%
AIME 2024: 80.0%
GPQA: 49.0%
LiveCodeBench: 39.6%