Qwen2.5 72B Instruct

Qwen2.5-72B-Instruct is an instruction-tuned 72 billion parameter language model, part of the Qwen2.5 series. It is designed to follow instructions, generate long texts (over 8K tokens), understand structured data (e.g., tables), and generate structured outputs, especially JSON. The model supports multilingual capabilities across over 29 languages.

Benchmark results

Benchmark Score Tags Source
AlignBench 81.6% self-reported llm-stats link →
Arena Hard 81.2% self-reported llm-stats link →
GPQA 49.0% self-reported llm-stats link →
GSM8k 95.8% self-reported llm-stats link →
HumanEval 86.6% self-reported llm-stats link →
IFEval 84.1% self-reported llm-stats link →
LiveBench 52.3% self-reported llm-stats link →
LiveCodeBench 55.5% self-reported llm-stats link →
MATH 83.1% self-reported llm-stats link →
MBPP 88.2% self-reported llm-stats link →
MMLU-Pro 71.1% self-reported llm-stats link →
MMLU-Redux 86.8% self-reported llm-stats link →
MT-Bench 0.935 self-reported llm-stats link →
MultiPL-E 75.1% self-reported llm-stats link →