MiniMax M1 80K
MiniMax-M1 is an open-source, large-scale reasoning model that uses a hybrid-attention architecture for efficient long-context processing. It supports a context window of up to 1 million tokens and up to 80,000 tokens of reasoning output, matching Gemini 2.5 Pro's context length while remaining cost-effective. Its Lightning Attention mechanism is reported to reduce compute requirements to about 30% of DeepSeek R1's, and a new reinforcement learning algorithm, CISPO, is reported to converge about twice as fast as other RL methods. Trained on 512 H800 GPUs over three weeks, M1 achieves near state-of-the-art results on software engineering, long-context, and tool-use benchmarks, outperforming most open models and rivaling top closed systems.
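The two limits quoted above (a 1 million token context window and 80,000 tokens of reasoning output) can be turned into a simple pre-flight check before sending a request. The following is a minimal sketch, not an official client: the names `TokenBudget` and `fits_m1_80k` are hypothetical, and whether output tokens count toward the context window is an assumption exposed as a flag, since the description does not say.

```python
from dataclasses import dataclass

# Limits taken from the model description above.
CONTEXT_WINDOW_TOKENS = 1_000_000
MAX_REASONING_OUTPUT_TOKENS = 80_000


@dataclass(frozen=True)
class TokenBudget:
    """Hypothetical container for a planned request's token counts."""
    prompt_tokens: int
    max_output_tokens: int


def fits_m1_80k(budget: TokenBudget, output_counts_toward_context: bool = True) -> bool:
    """Return True if the planned request stays within both quoted limits.

    Assumption: by default the output budget is counted against the
    context window; set the flag to False if the limits are independent.
    """
    if budget.prompt_tokens < 0 or budget.max_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if budget.max_output_tokens > MAX_REASONING_OUTPUT_TOKENS:
        return False
    total = budget.prompt_tokens
    if output_counts_toward_context:
        total += budget.max_output_tokens
    return total <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    print(fits_m1_80k(TokenBudget(prompt_tokens=900_000, max_output_tokens=80_000)))
    print(fits_m1_80k(TokenBudget(prompt_tokens=950_000, max_output_tokens=80_000)))
```

Token counts would need to come from the model's own tokenizer; the check only compares numbers and does not tokenize anything itself.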
Benchmark results
| Benchmark | Score | Tags |
|---|---|---|
| AIME 2024 | 86.0% | self-reported llm-stats |
| AIME 2025 | 76.9% | self-reported llm-stats |
| GPQA | 70.0% | self-reported llm-stats |
| Humanity's Last Exam | 8.4% | self-reported llm-stats |
| LiveCodeBench | 65.0% | self-reported llm-stats |
| LongBench v2 | 61.5% | self-reported llm-stats |
| MATH-500 | 96.8% | self-reported llm-stats |
| MMLU-Pro | 81.1% | self-reported llm-stats |
| Multi-Challenge | 44.7% | self-reported llm-stats |
| OpenAI-MRCR: 2 needle 128k | 73.4% | self-reported llm-stats |
| OpenAI-MRCR: 2 needle 1M | 56.2% | self-reported llm-stats |
| SimpleQA | 18.5% | self-reported llm-stats |
| SWE-Bench Verified | 56.0% | self-reported llm-stats |
| TAU-bench Airline | 62.0% | self-reported llm-stats |
| TAU-bench Retail | 63.5% | self-reported llm-stats |
| ZebraLogic | 86.8% | self-reported llm-stats |