MAI-Code-1-Flash
MAI-Code-1-Flash is a Microsoft AI coding model designed for fast, efficient assistance in everyday developer workflows. Microsoft built it end-to-end on clean, appropriately licensed data. It is trained directly with the GitHub Copilot harnesses used in production for agentic coding in real developer environments, and it uses adaptive solution length control to stay concise on simple requests while spending more reasoning budget on complex tasks. It outperforms Claude Haiku 4.5 across coding benchmarks while using up to 60% fewer tokens. It is rolling out to GitHub Copilot individual users in Visual Studio Code through the model picker and the default Auto picker.
Benchmark results
| Benchmark | Score | Tags |
|---|---|---|
| AdvancedIF | 71.4% | self-reported, llm-stats |
| AIME 2026 | 92.5% | self-reported, llm-stats |
| AMO Bench | 40.0% | self-reported, llm-stats |
| Artifacts Bench | 36.4% | self-reported, llm-stats |
| Frontier Science | 58.2% | self-reported, llm-stats |
| FrontierMath | 6.3% | self-reported, llm-stats |
| GPQA | 84.6% | self-reported, llm-stats |
| Humanity's Last Exam | 18.0% | self-reported, llm-stats |
| IFBench | 75.0% | self-reported, llm-stats |
| Robust IF | 61.2% | self-reported, llm-stats |
| SWE-bench Multilingual | 65.5% | self-reported, llm-stats |
| SWE-Bench Pro | 51.2% | self-reported, llm-stats |
| SWE-Bench Verified | 71.6% | self-reported, llm-stats |
| Tau2 Telecom | 71.7% | self-reported, llm-stats |
| Terminal-Bench 2.0 | 54.8% | self-reported, llm-stats |