MiMo-V2.5-Pro is Xiaomi's 1.02T-parameter sparse Mixture-of-Experts language model with 42B active parameters and a 1M-token context window. It inherits the hybrid-attention and Multi-Token Prediction design of MiMo-V2-Flash and extends the context length up to 1M tokens during pre-training. It uses supervised fine-tuning, domain-specialized reinforcement learning, and Multi-Teacher On-Policy Distillation to improve complex software engineering, long-horizon agentic tasks, and ultra-long-context coherence.
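
To make the sparsity concrete, the short sketch below computes what share of the model's parameters is active for a single token, using only the two figures stated above (1.02T total, 42B active). The constant and function names are illustrative and not part of any MiMo tooling, and the ratio is only a rough proxy for per-token compute, since attention cost and routing overhead are not modeled.

```python
# Figures taken from the description above; all names here are illustrative only.
TOTAL_PARAMS = 1.02e12   # 1.02T total parameters
ACTIVE_PARAMS = 42e9     # 42B parameters active per token


def active_fraction(active: float, total: float) -> float:
    """Return the share of parameters that are active for a single token."""
    if total <= 0:
        raise ValueError("total must be positive")
    return active / total


fraction = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
print(f"Active share per token: {fraction:.2%}")
```

Under these figures, roughly 4% of the parameters participate in any single forward pass for a token, which is what "sparse" means in this context.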