MiniMax M2.5

MiniMax describes M2.5 as the world's first production-level model designed natively for agent scenarios. Building on M2.1, M2.5 improves on programming, tool calling, search, and office-productivity tasks. It uses a 230B-parameter mixture-of-experts (MoE) architecture with only 10B active parameters, which is intended to keep throughput high and inference efficient while remaining competitive with top international models such as Claude Opus 4.6. M2.5 supports full-stack development for PC, app, and cross-platform applications, and targets agentic workflows including automated customer support, data-analysis pipelines, and complex task execution.
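To put the parameter counts above in perspective, the following minimal sketch computes the share of parameters that are active and a rough compute estimate. The function names are hypothetical, and the "about 2 FLOPs per active parameter per token" figure is a common rule of thumb assumed here for illustration, not a number published for M2.5.

```python
# Figures quoted in the description above, in billions of parameters.
TOTAL_PARAMS_B = 230.0
ACTIVE_PARAMS_B = 10.0


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of the model's parameters that are active."""
    if total_b <= 0:
        raise ValueError("total_b must be positive")
    return active_b / total_b


def approx_forward_gflops_per_token(active_b: float) -> float:
    """Rough forward-pass cost per token in GFLOPs.

    Assumption: ~2 FLOPs per active parameter per token (rule of thumb),
    so billions of parameters * 2 gives GFLOPs.
    """
    return 2.0 * active_b


if __name__ == "__main__":
    frac = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
    gflops = approx_forward_gflops_per_token(ACTIVE_PARAMS_B)
    print(f"Active share of parameters: {frac:.1%}")
    print(f"Approx. forward compute: {gflops:.0f} GFLOPs per token")
```

Under these assumptions, only a small fraction of the weights participates in each forward pass, which is the basis for the efficiency claim; all 230B parameters still have to be stored, so memory requirements follow the total count rather than the active one.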

Benchmark results

| Benchmark          | Score | Tags          | Source    |
|--------------------|-------|---------------|-----------|
| BFCL_v3_MultiTurn  | 76.8% | self-reported | llm-stats |
| BrowseComp         | 76.3% | self-reported | llm-stats |
| GDPval-MM          | 59.0% | self-reported | llm-stats |
| MEWC               | 74.4% | self-reported | llm-stats |
| Multi-SWE-Bench    | 51.3% | self-reported | llm-stats |
| SWE-Bench Pro      | 55.4% | self-reported | llm-stats |
| SWE-Bench Verified | 80.2% | self-reported | llm-stats |
| VIBE-Pro           | 54.2% | self-reported | llm-stats |