Qwen2.5 72B Instruct

Qwen2.5-72B-Instruct is an instruction-tuned 72 billion parameter language model, part of the Qwen2.5 series. It is designed to follow instructions, generate long texts (over 8K tokens), understand structured data (e.g., tables), and generate structured outputs, especially JSON.

GSM8k

95.8%

i
MBPP

88.2%

i
MMLU-Redux

86.8%

i
HumanEval

86.6%

i
IFEval

84.1%

i
MATH

83.1%

i
AlignBench

81.6%

i
Arena Hard

81.2%

i
MultiPL-E

75.1%

i
MMLU-Pro

71.1%

i
LiveCodeBench

55.5%

i
LiveBench

52.3%

i
GPQA

49.0%

i
MT-Bench

0.935

i

Pricing, uptime, and speed via OpenRouter — updated Jul 17, 2026, 04:19 AM.

Provider	Status	Input	Output	Limits	Uptime	Speed	Notes
DeepInfra	available	$0.36/Mtok	$0.40/Mtok	33K tokens context 16K tokens max output	99.9% 5m 100.0%	329 ms p50 TTFT 29 tok/s p50	fp8
Novita	available	$0.38/Mtok	$0.40/Mtok	32K tokens context 8K tokens max output	—	20,715 ms p50 TTFT 11 tok/s p50	bf16