MiniMax M2
MiniMax M2 is an open-source large language model from MiniMax, built for agentic and coding tasks. It reports state-of-the-art tool use, reasoning, and search performance while remaining cost-efficient and fast: it is priced at about 8% of Claude 3.5 Sonnet’s cost and runs at nearly double its inference speed (≈100 tokens per second). Designed for end-to-end agentic workflows, it handles long chains of tool calls across Shell, Browser, Python, and other MCP tools. Although it trails the top models from outside China slightly in programming, it ranks among the best Chinese models and in the top five globally on the Artificial Analysis benchmark. M2 powers the MiniMax Agent platform, which offers Lightning Mode for fast tasks and Pro Mode for complex multi-step reasoning. Its weights are freely available on Hugging Face, alongside an API and deployment guides for vLLM and SGLang.
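To make the price and speed figures concrete, here is a minimal sketch of the arithmetic in Python. The 8% price ratio and the ≈100 tokens-per-second throughput come from the description; treating "nearly double" as exactly 2x, the reference spend of 10.0, and the 5,000-token output length are illustrative assumptions, not published numbers.

```python
# Figures taken from the description.
PRICE_RATIO = 0.08   # M2 price as a fraction of the reference model's price
M2_TPS = 100.0       # approximate M2 throughput in tokens per second

# Assumption: "nearly double" is rounded to exactly 2x for this sketch.
SPEED_RATIO = 2.0


def relative_cost(reference_cost: float) -> float:
    """Cost of the same workload on M2, given its cost on the reference model."""
    return reference_cost * PRICE_RATIO


def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to generate num_tokens at a steady throughput."""
    return num_tokens / tokens_per_second


# Hypothetical workload: values chosen only for illustration.
reference_cost = 10.0   # spend on the reference model, in any currency
tokens = 5_000          # output tokens to generate

m2_cost = relative_cost(reference_cost)
m2_time = generation_seconds(tokens, M2_TPS)
ref_time = generation_seconds(tokens, M2_TPS / SPEED_RATIO)

print(f"M2 cost: {m2_cost:.2f} (reference: {reference_cost:.2f})")
print(f"M2 time: {m2_time:.0f} s (reference: {ref_time:.0f} s)")
```

Under these assumptions, a workload that costs 10.00 on the reference model costs 0.80 on M2, and 5,000 output tokens take about 50 seconds instead of about 100. Real savings depend on the input/output token mix and on actual serving throughput.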
Benchmark results
| Benchmark | Score | Tags |
|---|---|---|
| AA-Index | 61.0% | self-reported, llm-stats |
| AIME 2025 | 78.0% | self-reported, llm-stats |
| BrowseComp | 44.0% | self-reported, llm-stats |
| BrowseComp-zh | 48.5% | self-reported, llm-stats |
| GPQA | 78.0% | self-reported, llm-stats |
| Humanity's Last Exam | 12.5% | self-reported, llm-stats |
| IF | 72.0% | self-reported, llm-stats |
| LiveCodeBench | 83.0% | self-reported, llm-stats |
| MMLU-Pro | 82.0% | self-reported, llm-stats |
| Multi-SWE-Bench | 36.2% | self-reported, llm-stats |
| SciCode | 36.0% | self-reported, llm-stats |
| SWE-bench Multilingual | 56.5% | self-reported, llm-stats |
| SWE-Bench Verified | 69.4% | self-reported, llm-stats |
| Tau-bench | 77.2% | self-reported, llm-stats |
| Tau2 Telecom | 87.0% | self-reported, llm-stats |
| Terminal-Bench | 46.3% | self-reported, llm-stats |