MiMo-V2-Flash

MiMo-V2-Flash is a powerful, efficient, and ultra-fast foundation language model that excels in reasoning, coding, and agentic scenarios. It is a Mixture-of-Experts model with 309B total parameters and 15B active parameters. Its hybrid attention architecture combines sliding-window and full attention in a 5:1 ratio, with a 128-token sliding window. It delivers 150 tokens/sec inference with a 256k-token context window.
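To make the hybrid attention description concrete, here is a minimal Python sketch of how a 5:1 mix of sliding-window and full-attention layers limits what each token can see. It is an illustration under stated assumptions, not the model's actual implementation. The function names, the 12-layer stack, the placement of the full-attention layer at the end of each group of six, and the choice to count the current token inside the 128-token window are all hypothetical. Only the 5:1 ratio, the 128-token window, and the 309B/15B parameter counts come from the description above.

```python
def layer_schedule(num_layers, swa_per_full=5):
    """Label each layer 'swa' or 'full'.

    Assumption: every group of (swa_per_full + 1) layers ends with one
    full-attention layer. The real interleaving order is not stated.
    """
    period = swa_per_full + 1
    return ["full" if (i + 1) % period == 0 else "swa" for i in range(num_layers)]


def can_attend(query_pos, key_pos, kind, window=128):
    """Return True if the token at query_pos may attend to key_pos (causal)."""
    if key_pos > query_pos:
        return False  # causal mask: no attention to future tokens
    if kind == "full":
        return True  # full attention sees the entire prefix
    # Assumption: the window counts the current token itself.
    return query_pos - key_pos < window


def visible_keys(query_pos, kind, window=128):
    """Count the positions visible from query_pos under the given layer kind."""
    return sum(can_attend(query_pos, k, kind, window) for k in range(query_pos + 1))


# Hypothetical 12-layer stack, used only to show the 5:1 pattern.
schedule = layer_schedule(12)

# Share of parameters active per token, from the stated 15B of 309B.
active_fraction = 15e9 / 309e9

if __name__ == "__main__":
    print(schedule)
    print(visible_keys(1000, "swa"), visible_keys(1000, "full"))
    print(f"{active_fraction:.1%}")
```

In this sketch, a sliding-window layer keeps the number of visible keys fixed once the sequence is longer than the window, while a full-attention layer's visible keys grow with position. That difference is the usual motivation for mixing the two layer types in long-context models.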

Benchmark results

Benchmark                 Score   Tags            Source
AIME 2025                 94.1%   self-reported   llm-stats
Arena-Hard v2             86.2%   self-reported   llm-stats
BrowseComp                58.3%   self-reported   llm-stats
GPQA                      83.7%   self-reported   llm-stats
HMMT 2025                 84.4%   self-reported   llm-stats
Humanity's Last Exam      22.1%   self-reported   llm-stats
LiveCodeBench v6          80.6%   self-reported   llm-stats
LongBench v2              60.6%   self-reported   llm-stats
MMLU-Pro                  84.9%   self-reported   llm-stats
MRCR                      45.7%   self-reported   llm-stats
SWE-bench Multilingual    71.7%   self-reported   llm-stats
SWE-Bench Verified        73.4%   self-reported   llm-stats
Tau-bench                 80.3%   self-reported   llm-stats
Terminal-Bench            30.5%   self-reported   llm-stats
Terminal-Bench 2.0        38.5%   self-reported   llm-stats