Qwen3.5-397B-A17B
Qwen3.5-397B-A17B is Qwen's flagship Mixture-of-Experts model, with 397 billion total parameters of which 17 billion are activated per token. Its self-reported results span knowledge, reasoning, coding, mathematics, multilingual understanding, instruction following, long-context, and agent benchmarks.
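To put the two parameter counts in proportion, the short sketch below computes the share of parameters that is active for each token. It is only an illustration of the arithmetic: the constants are the rounded figures quoted above, and the helper name `active_fraction` is hypothetical rather than part of any Qwen tooling.

```python
# Rounded figures taken from the model description above.
TOTAL_PARAMS = 397e9   # total parameters
ACTIVE_PARAMS = 17e9   # parameters activated per token


def active_fraction(total: float, active: float) -> float:
    """Return the share of parameters that is active (hypothetical helper)."""
    if total <= 0 or not 0 <= active <= total:
        raise ValueError("expected 0 <= active <= total and total > 0")
    return active / total


fraction = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
print(f"Active share: {fraction:.1%}")  # prints "Active share: 4.3%"
```

In other words, roughly one parameter in 23 takes part in any single forward pass, which is why the model's per-token compute is closer to that of a 17B dense model than a 397B one, while memory requirements still scale with the full parameter count.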
Benchmark results
| Benchmark | Score | Tags |
|---|---|---|
| AA-LCR | 68.7% | self-reported llm-stats |
| AIME 2026 | 91.3% | self-reported llm-stats |
| BFCL-V4 | 72.9% | self-reported llm-stats |
| BrowseComp | 69.0% | self-reported llm-stats |
| BrowseComp-zh | 70.3% | self-reported llm-stats |
| C-Eval | 93.0% | self-reported llm-stats |
| DeepPlanning | 34.3% | self-reported llm-stats |
| Global PIQA | 89.8% | self-reported llm-stats |
| GPQA | 88.4% | self-reported llm-stats |
| HMMT 2025 | 94.8% | self-reported llm-stats |
| HMMT25 | 92.7% | self-reported llm-stats |
| Humanity's Last Exam | 28.7% | self-reported llm-stats |
| IFBench | 76.5% | self-reported llm-stats |
| IFEval | 92.6% | self-reported llm-stats |
| IMO-AnswerBench | 80.9% | self-reported llm-stats |
| Include | 85.6% | self-reported llm-stats |
| LiveCodeBench v6 | 83.6% | self-reported llm-stats |
| LongBench v2 | 63.2% | self-reported llm-stats |
| MAXIFE | 88.2% | self-reported llm-stats |
| MCP-Mark | 46.1% | self-reported llm-stats |
| MMLU-Pro | 87.8% | self-reported llm-stats |
| MMLU-ProX | 84.7% | self-reported llm-stats |
| MMLU-Redux | 94.9% | self-reported llm-stats |
| MMMLU | 88.5% | self-reported llm-stats |
| Multi-Challenge | 67.6% | self-reported llm-stats |
| NOVA-63 | 59.1% | self-reported llm-stats |
| PolyMATH | 73.3% | self-reported llm-stats |
| Seal-0 | 46.9% | self-reported llm-stats |
| SecCodeBench | 68.3% | self-reported llm-stats |
| SuperGPQA | 70.4% | self-reported llm-stats |
| SWE-bench Multilingual | 69.3% | self-reported llm-stats |
| SWE-Bench Verified | 76.4% | self-reported llm-stats |
| t2-bench | 86.7% | self-reported llm-stats |
| Terminal-Bench 2.0 | 52.5% | self-reported llm-stats |
| Toolathlon | 38.3% | self-reported llm-stats |
| VITA-Bench | 49.7% | self-reported llm-stats |
| WideSearch | 74.0% | self-reported llm-stats |
| WMT24++ | 78.9% | self-reported llm-stats |