Qwen2 72B Instruct

Qwen2-72B-Instruct is an instruction-tuned language model with 72 billion parameters and a context length of up to 131,072 tokens. It is part of the Qwen2 series, which outperforms most open-source models and is competitive with proprietary models across a range of benchmarks.
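The 131,072-token limit (128 × 1,024) can be treated as a budget. The sketch below is a minimal illustration, assuming the prompt and the generated tokens share the same window, as is usual for decoder-only models. The helper names `fits_in_context` and `max_generation_budget` are hypothetical and not part of any Qwen2 tooling, and the token counts would have to come from the model's own tokenizer.

```python
# Context length quoted in the description above (128 * 1024 tokens).
CONTEXT_LENGTH = 131_072


def fits_in_context(prompt_tokens: int, max_new_tokens: int,
                    context_length: int = CONTEXT_LENGTH) -> bool:
    """Return True if the prompt plus the generation budget fits in the window.

    Assumption: prompt and generated tokens count against the same limit.
    """
    if prompt_tokens < 0 or max_new_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + max_new_tokens <= context_length


def max_generation_budget(prompt_tokens: int,
                          context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many tokens remain for generation after the prompt."""
    if prompt_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return max(0, context_length - prompt_tokens)
```

For example, a 131,000-token prompt would leave room for only 72 generated tokens under this assumption.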

Benchmark results

Benchmark    Score   Tags           Source
ARC-C        68.9%   self-reported  llm-stats
BBH          82.4%   self-reported  llm-stats
C-Eval       83.8%   self-reported  llm-stats
CMMLU        90.1%   self-reported  llm-stats
EvalPlus     79.0%   self-reported  llm-stats
GPQA         42.4%   self-reported  llm-stats
GSM8k        91.1%   self-reported  llm-stats
HellaSwag    87.6%   self-reported  llm-stats
HumanEval    86.0%   self-reported  llm-stats
MATH         59.7%   self-reported  llm-stats
MBPP         80.2%   self-reported  llm-stats
MMLU         82.3%   self-reported  llm-stats
MMLU-Pro     64.4%   self-reported  llm-stats
MultiPL-E    69.2%   self-reported  llm-stats
TheoremQA    44.4%   self-reported  llm-stats
TruthfulQA   54.8%   self-reported  llm-stats
Winogrande   85.1%   self-reported  llm-stats
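For readers who want to work with these numbers programmatically, the following is a minimal sketch that copies the self-reported scores from the table into a dictionary and ranks them. The helper names `ranked` and `at_least` are illustrative only. The benchmarks measure different tasks on different scales of difficulty, so the ranking is a convenience for browsing, not a comparison of capability.

```python
# Self-reported scores (percent), copied from the benchmark table.
SCORES = {
    "ARC-C": 68.9,
    "BBH": 82.4,
    "C-Eval": 83.8,
    "CMMLU": 90.1,
    "EvalPlus": 79.0,
    "GPQA": 42.4,
    "GSM8k": 91.1,
    "HellaSwag": 87.6,
    "HumanEval": 86.0,
    "MATH": 59.7,
    "MBPP": 80.2,
    "MMLU": 82.3,
    "MMLU-Pro": 64.4,
    "MultiPL-E": 69.2,
    "TheoremQA": 44.4,
    "TruthfulQA": 54.8,
    "Winogrande": 85.1,
}


def ranked(scores: dict[str, float]) -> list[tuple[str, float]]:
    """Return (benchmark, score) pairs sorted from highest to lowest score."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def at_least(scores: dict[str, float], threshold: float) -> list[str]:
    """Return benchmark names whose score meets the threshold, alphabetically."""
    return sorted(name for name, score in scores.items() if score >= threshold)


if __name__ == "__main__":
    for name, score in ranked(SCORES):
        print(f"{name:<12}{score:5.1f}%")
```

Calling `at_least(SCORES, 85.0)` on this data returns the five benchmarks scoring 85% or higher: CMMLU, GSM8k, HellaSwag, HumanEval, and Winogrande.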