Magistral Medium

Magistral Medium is a reasoning model trained solely with reinforcement learning on top of Mistral Medium 3. It achieves strong performance on complex math and code tasks without relying on distillation from existing reasoning models. Training uses a reinforcement learning with verifiable rewards (RLVR) framework with modifications to GRPO (Group Relative Policy Optimization), enabling improved reasoning ability and multilingual consistency.
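For readers unfamiliar with GRPO, the core idea can be sketched as follows. This is a minimal illustration of the standard group-relative advantage, not Magistral's modified variant, whose changes are not described here. The exact-match verifier, the function names, and the sample completions are all hypothetical.

```python
from statistics import mean, pstdev


def verifiable_reward(answer: str, reference: str) -> float:
    """Hypothetical verifier: 1.0 for an exact match with the reference, else 0.0."""
    return 1.0 if answer.strip() == reference.strip() else 0.0


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Standard GRPO-style advantage: normalize each reward against its own group.

    All completions in the group answer the same prompt, so the group mean
    acts as the baseline and no separate value network is needed.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Assumed example: four sampled completions for one prompt whose answer is "42".
completions = ["42", "41", "42", "7"]
rewards = [verifiable_reward(c, "42") for c in completions]
advantages = group_relative_advantages(rewards)
```

Correct completions receive a positive advantage and incorrect ones a negative advantage. When every completion in a group earns the same reward, all advantages are zero and that group contributes no learning signal.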

Benchmark results

Benchmark             Score   Tags            Source
Aider-Polyglot        47.1%   self-reported   llm-stats
AIME 2024             73.6%   self-reported   llm-stats
AIME 2025             64.9%   self-reported   llm-stats
GPQA                  70.8%   self-reported   llm-stats
Humanity's Last Exam  9.0%    self-reported   llm-stats
LiveCodeBench         50.3%   self-reported   llm-stats