Qwen2 72B Instruct
Qwen2-72B-Instruct is an instruction-tuned language model with 72 billion parameters and a context length of up to 131,072 tokens. It is part of the new Qwen2 series, which outperforms most open-source models and is competitive with proprietary models across a range of benchmarks.
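To make the context-length figure concrete, here is a minimal sketch of a token-budget check. It assumes the 131,072-token window is shared between the prompt and the generated tokens, as is usual for decoder-only models. The helper names `remaining_output_budget` and `fits_in_context` are hypothetical and not part of any Qwen2 API.

```python
# Context window stated above for Qwen2-72B-Instruct.
MAX_CONTEXT_TOKENS = 131_072  # equals 128 * 1024


def remaining_output_budget(prompt_tokens: int,
                            max_context: int = MAX_CONTEXT_TOKENS) -> int:
    """Return how many tokens are left for generation after the prompt.

    Hypothetical helper: assumes prompt and generated tokens share one window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # Clamp at zero so an oversized prompt reports no remaining budget.
    return max(0, max_context - prompt_tokens)


def fits_in_context(prompt_tokens: int,
                    max_new_tokens: int,
                    max_context: int = MAX_CONTEXT_TOKENS) -> bool:
    """True when the prompt plus the requested generation length fits."""
    return prompt_tokens + max_new_tokens <= max_context
```

For example, under this assumption a 100,000-token prompt would leave 31,072 tokens for the response. Token counts depend on the tokenizer, so they should come from the model's own tokenizer rather than from a word or character count.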
Benchmark results
| Benchmark | Score | Tags |
|---|---|---|
| ARC-C | 68.9% | self-reported llm-stats |
| BBH | 82.4% | self-reported llm-stats |
| C-Eval | 83.8% | self-reported llm-stats |
| CMMLU | 90.1% | self-reported llm-stats |
| EvalPlus | 79.0% | self-reported llm-stats |
| GPQA | 42.4% | self-reported llm-stats |
| GSM8k | 91.1% | self-reported llm-stats |
| HellaSwag | 87.6% | self-reported llm-stats |
| HumanEval | 86.0% | self-reported llm-stats |
| MATH | 59.7% | self-reported llm-stats |
| MBPP | 80.2% | self-reported llm-stats |
| MMLU | 82.3% | self-reported llm-stats |
| MMLU-Pro | 64.4% | self-reported llm-stats |
| MultiPL-E | 69.2% | self-reported llm-stats |
| TheoremQA | 44.4% | self-reported llm-stats |
| TruthfulQA | 54.8% | self-reported llm-stats |
| Winogrande | 85.1% | self-reported llm-stats |