MiMo-V2.5-Pro

MiMo-V2.5-Pro is Xiaomi's 1.02T-parameter sparse Mixture-of-Experts language model with 42B active parameters and a 1M-token context window. It inherits the MiMo-V2-Flash hybrid-attention and Multi-Token Prediction design, extends context during pre-training up to 1M tokens, and uses supervised fine-tuning, domain-specialized reinforcement learning, and Multi-Teacher On-Policy Distillation to improve complex software engineering, long-horizon agentic tasks, and ultra-long-context coherence.
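As a rough illustration of what "sparse" means here, the snippet below computes the share of parameters that is active per token from the two figures quoted above. The constant and function names are illustrative, not part of any Xiaomi tooling, and the calculation ignores everything beyond the raw parameter counts.

```python
# Figures taken from the description above.
TOTAL_PARAMS = 1.02e12   # 1.02T total parameters
ACTIVE_PARAMS = 42e9     # 42B parameters active per token


def active_fraction(total: float, active: float) -> float:
    """Return the share of parameters used for each token."""
    if total <= 0 or active < 0 or active > total:
        raise ValueError("expected 0 <= active <= total and total > 0")
    return active / total


frac = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
print(f"Active share per token: {frac:.1%}")  # prints "Active share per token: 4.1%"
```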

Benchmark	Score
GSM8k	99.6%
ARC-C	97.2%
MMLU-Redux	92.8%
C-Eval	91.5%
CMMLU	90.2%
HellaSwag	89.8%
MMLU	89.4%
BBH	88.4%
DROP	86.3%
MATH	86.2%
Winogrande	85.6%
Global-MMLU	83.6%
TriviaQA	81.3%
SWE-Bench Verified	78.9%
HumanEval+	75.6%
MBPP+	74.1%
MiMo Coding Bench	73.7%
TAU3-Bench	72.9%
MMLU-Pro	68.5%
Terminal-Bench 2.0	68.4%
GPQA	66.7%
Claw-Eval	64.0%
GraphWalks	62.0%
SWE-Bench Pro	57.2%
WildClawBench	43.0%
LiveCodeBench v6	39.6%
AIME	37.3%
SWE-bench Verified (Agentless)	35.7%
Humanity's Last Exam	34.0%
GDPval-AA	1,581
FrontierSWE (Impl.)	3.4

Pricing, uptime, and speed via OpenRouter — updated Jun 12, 2026, 04:59 AM.

Provider	Status	Input	Output	Limits	Uptime	Speed	Notes
Xiaomi	available	$0.43/Mtok (cache: $0.00/Mtok)	$0.87/Mtok	1.0M tokens context; 131K tokens max output	99.9% (5m: 99.9%)	2,192 ms p50 TTFT; 33 tok/s p50	fp8
DeepInfra	available	$1.00/Mtok (cache: $0.20/Mtok)	$3.00/Mtok	1.0M tokens context; 16K tokens max output	98% (5m: 100.0%)	2,145 ms p50 TTFT; 63 tok/s p50	fp8
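
The per-million-token prices in the table above can be turned into a per-request estimate. The sketch below is a minimal example. It assumes the "cache" figure is the price charged for input tokens served from a prompt cache in place of the regular input price, which the table does not state explicitly. The names `Pricing` and `estimate_cost` are hypothetical helpers, not part of any OpenRouter or provider SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD prices per million tokens, copied from the table above."""
    input_per_mtok: float
    cached_input_per_mtok: float  # assumption: rate for cache-hit input tokens
    output_per_mtok: float


XIAOMI = Pricing(input_per_mtok=0.43, cached_input_per_mtok=0.00, output_per_mtok=0.87)
DEEPINFRA = Pricing(input_per_mtok=1.00, cached_input_per_mtok=0.20, output_per_mtok=3.00)


def estimate_cost(pricing: Pricing, input_tokens: int, output_tokens: int,
                  cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    cached_tokens is the part of input_tokens assumed to be billed
    at the cache rate rather than the regular input rate.
    """
    if min(input_tokens, output_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    total = (fresh_tokens * pricing.input_per_mtok
             + cached_tokens * pricing.cached_input_per_mtok
             + output_tokens * pricing.output_per_mtok)
    return total / 1_000_000


# Example: a 200K-token prompt (half of it cached) with an 8K-token reply.
for name, pricing in (("Xiaomi", XIAOMI), ("DeepInfra", DEEPINFRA)):
    cost = estimate_cost(pricing, input_tokens=200_000, output_tokens=8_000,
                         cached_tokens=100_000)
    print(f"{name}: ${cost:.4f}")
```

Prices change over time, so the constants should be refreshed from the current listing before relying on the result.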