MiniMax M2.1

MiniMax M2.1 is an enhanced large language model focused on multi-language programming and complex real-world tasks. It is strong across Rust, Java, Golang, C++, Kotlin, Objective-C, TypeScript, JavaScript, and other languages, with multilingual coding performance that reportedly outperforms Claude Sonnet 4.5 and approaches Claude Opus 4.5. M2.1 significantly strengthens native Android and iOS development, improves design comprehension and aesthetic expression for web and app scenarios, and gives more concise responses with faster output and lower token consumption. It works well across coding agent frameworks including Claude Code, Droid (Factory AI), Cline, Kilo Code, Roo Code, and BlackBox.

Benchmark results

Benchmark               Score   Tags           Source
AA-LCR                  62.0%   self-reported  llm-stats
AIME 2025               81.0%   self-reported  llm-stats
BrowseComp              62.0%   self-reported  llm-stats
GPQA                    81.0%   self-reported  llm-stats
Humanity's Last Exam    22.0%   self-reported  llm-stats
IFBench                 70.0%   self-reported  llm-stats
LiveCodeBench           78.0%   self-reported  llm-stats
MMLU-Pro                88.0%   self-reported  llm-stats
Multi-SWE-Bench         49.4%   self-reported  llm-stats
OctoCodingBench         26.1%   self-reported  llm-stats
SciCode                 39.0%   self-reported  llm-stats
SWE-bench Multilingual  72.5%   self-reported  llm-stats
SWE-Bench Verified      67.0%   self-reported  llm-stats
SWE-Perf                 3.1%   self-reported  llm-stats
SWE-Review               8.9%   self-reported  llm-stats
SWT-Bench               69.3%   self-reported  llm-stats
Tau2 Telecom            87.0%   self-reported  llm-stats
Terminal-Bench          47.9%   self-reported  llm-stats
Toolathlon              43.5%   self-reported  llm-stats
VIBE                    88.6%   self-reported  llm-stats
VIBE Android            89.7%   self-reported  llm-stats
VIBE Backend            86.7%   self-reported  llm-stats
VIBE iOS                88.0%   self-reported  llm-stats
VIBE Simulation         87.1%   self-reported  llm-stats
VIBE Web                91.5%   self-reported  llm-stats