Magistral Medium
Magistral Medium is a reasoning model trained solely with reinforcement learning on top of Mistral Medium 3. It achieves strong performance on complex math and code tasks without relying on distillation from existing reasoning models. Training uses a reinforcement learning with verifiable rewards (RLVR) framework built on a modified version of Group Relative Policy Optimization (GRPO), which improves reasoning ability and multilingual consistency.
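To make the group-relative idea behind GRPO concrete, here is a minimal sketch of the advantage computation in its standard form. It does not reproduce the modifications Magistral applies, which this page does not detail. The function name, the `eps` constant, the use of the population standard deviation, and the 0/1 reward values are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of completions for the same prompt.

    Standard GRPO scores each completion relative to its own group, so no
    learned value function is needed. `eps` (an assumed constant) avoids a
    division by zero when every completion gets the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical verifiable rewards: 1.0 if the final answer checks out, else 0.0.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)

# A group where every completion scores the same carries no learning signal.
flat = group_relative_advantages([1.0, 1.0, 1.0])
```

Completions that beat their group's average receive positive advantages and the rest receive negative ones, so the policy update depends only on how the samples rank within their group, not on the absolute reward scale.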
Benchmark results
| Benchmark | Score | Tags |
|---|---|---|
| Aider-Polyglot | 47.1% | self-reported, llm-stats |
| AIME 2024 | 73.6% | self-reported, llm-stats |
| AIME 2025 | 64.9% | self-reported, llm-stats |
| GPQA | 70.8% | self-reported, llm-stats |
| Humanity's Last Exam | 9.0% | self-reported, llm-stats |
| LiveCodeBench | 50.3% | self-reported, llm-stats |