Qwen3.5-397B-A17B

Qwen3.5-397B-A17B is Qwen's flagship Mixture-of-Experts model with 397 billion total parameters and 17 billion activated parameters. It delivers state-of-the-art performance across knowledge, reasoning, coding, mathematics, multilingual understanding, instruction following, long context, and agent tasks.

Benchmark	Score
MMLU-Redux	94.9%
HMMT 2025	94.8%
C-Eval	93.0%
HMMT25	92.7%
IFEval	92.6%
AIME 2026	91.3%
Global PIQA	89.8%
MMMLU	88.5%
GPQA	88.4%
MAXIFE	88.2%
MMLU-Pro	87.8%
t2-bench	86.7%
Include	85.6%
MMLU-ProX	84.7%
LiveCodeBench v6	83.6%
IMO-AnswerBench	80.9%
WMT24++	78.9%
IFBench	76.5%
SWE-Bench Verified	76.4%
WideSearch	74.0%
PolyMATH	73.3%
BFCL-V4	72.9%
SuperGPQA	70.4%
BrowseComp-zh	70.3%
SWE-bench Multilingual	69.3%
BrowseComp	69.0%
AA-LCR	68.7%
SecCodeBench	68.3%
Multi-Challenge	67.6%
LongBench v2	63.2%
NOVA-63	59.1%
Terminal-Bench 2.0	52.5%
VITA-Bench	49.7%
Seal-0	46.9%
MCP-Mark	46.1%
Toolathlon	38.3%
DeepPlanning	34.3%
Humanity's Last Exam	28.7%

Pricing, uptime, and speed via OpenRouter — updated Jul 17, 2026, 04:19 AM.

Provider	Status	Input ($/Mtok)	Cached input ($/Mtok)	Output ($/Mtok)	Context (tokens)	Max output (tokens)	Uptime	TTFT p50 (ms)	Throughput p50 (tok/s)	Notes
Alibaba	available	0.39	—	2.34	262K	66K	99.8% 5m 99.7%	1,116	17	—
Chutes	available	0.45	0.22	3.00	262K	66K	99.9% 5m 100.0%	2,259	31	fp8
DeepInfra	available	0.45	0.22	3.00	262K	82K	100.0% 5m 100.0%	1,420	27	fp8
Parasail	available	0.50	0.30	3.60	262K	262K	100.0% 5m 100.0%	1,260	79	fp8
AtlasCloud	available	0.55	0.55	3.50	262K	66K	99% 5m 100.0%	2,057	61	fp8
Phala	available	0.55	0.22	3.50	262K	262K	99.9% 5m 100.0%	2,765	31	—
Novita	available	0.60	—	3.60	262K	66K	99% 5m 99%	2,351	62	—
Venice	available	0.75	—	4.50	128K	33K	—	1,456	43	—
DigitalOcean	-5	0.39	0.11	2.45	131K	66K	3%	—	—	—
GMICloud	-5	0.60	—	3.60	262K	66K	—	—	—	fp8
StreamLake	-2	0.60	0.12	3.60	256K	64K	89% 5m 92%	2,657	89	—
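
To compare providers on more than list price, the per-Mtok rates and p50 speed figures in the table above can be turned into a rough per-request estimate. The sketch below is illustrative only: the `ProviderQuote` class and both helper functions are hypothetical names, the workload (100,000 input tokens, 2,000 output tokens) is an assumed example, cached-input discounts are ignored, and latency is approximated as p50 TTFT plus output tokens divided by p50 throughput, which assumes a steady generation rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderQuote:
    """Hypothetical container for one row of the provider table."""
    name: str
    input_usd_per_mtok: float   # price per million input tokens
    output_usd_per_mtok: float  # price per million output tokens
    ttft_ms_p50: float          # median time to first token, in ms
    tok_per_s_p50: float        # median generation throughput


def request_cost_usd(q: ProviderQuote, input_tokens: int, output_tokens: int) -> float:
    """Estimated cost of one request, ignoring any cached-input discount."""
    total = input_tokens * q.input_usd_per_mtok + output_tokens * q.output_usd_per_mtok
    return total / 1_000_000


def estimated_latency_s(q: ProviderQuote, output_tokens: int) -> float:
    """Rough end-to-end latency: p50 TTFT plus generation time at p50 throughput."""
    return q.ttft_ms_p50 / 1000 + output_tokens / q.tok_per_s_p50


# Values copied from the Alibaba and Parasail rows of the table.
ALIBABA = ProviderQuote("Alibaba", 0.39, 2.34, 1116, 17)
PARASAIL = ProviderQuote("Parasail", 0.50, 3.60, 1260, 79)

if __name__ == "__main__":
    for q in (ALIBABA, PARASAIL):
        cost = request_cost_usd(q, input_tokens=100_000, output_tokens=2_000)
        latency = estimated_latency_s(q, output_tokens=2_000)
        print(f"{q.name}: ${cost:.4f}, about {latency:.0f} s")
```

Under these assumptions the cheapest listed rate is not automatically the best choice: for this workload Alibaba costs less per request, while Parasail's higher p50 throughput gives a much shorter estimated completion time.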