Qwen2.5-Coder 32B Instruct
Qwen2.5-Coder is a code-specialized model trained on 5.5 trillion tokens of code data. It supports 92 programming languages and a 128K-token context window. It excels at code generation, completion, repair, and multi-language programming tasks while maintaining strong performance in mathematics and general capabilities.
Benchmark results
| Benchmark | Score | Tags |
|---|---|---|
| ARC-C | 70.5% | self-reported, llm-stats |
| BigCodeBench-Full | 49.6% | self-reported, llm-stats |
| BigCodeBench-Hard | 27.0% | self-reported, llm-stats |
| GSM8k | 91.1% | self-reported, llm-stats |
| HellaSwag | 83.0% | self-reported, llm-stats |
| HumanEval | 92.7% | self-reported, llm-stats |
| LiveCodeBench | 31.4% | self-reported, llm-stats |
| MATH | 57.2% | self-reported, llm-stats |
| MBPP | 90.2% | self-reported, llm-stats |
| MMLU | 75.1% | self-reported, llm-stats |
| MMLU-Pro | 50.4% | self-reported, llm-stats |
| MMLU-Redux | 77.5% | self-reported, llm-stats |
| TheoremQA | 43.1% | self-reported, llm-stats |
| TruthfulQA | 54.2% | self-reported, llm-stats |
| Winogrande | 80.8% | self-reported, llm-stats |
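
To compare the coding scores against the math and general-knowledge scores at a glance, the table can be summarized per group. The sketch below is only illustrative: the scores are copied from the table above, but the `GROUPS` mapping (which benchmark counts as "code", "math", or "general") and the `group_means` helper are assumptions introduced here, not something the source defines. An unweighted mean across unrelated benchmarks is a crude summary and should not be read as an official aggregate.

```python
from statistics import mean

# Scores in percent, copied from the benchmark table (self-reported).
SCORES = {
    "ARC-C": 70.5,
    "BigCodeBench-Full": 49.6,
    "BigCodeBench-Hard": 27.0,
    "GSM8k": 91.1,
    "HellaSwag": 83.0,
    "HumanEval": 92.7,
    "LiveCodeBench": 31.4,
    "MATH": 57.2,
    "MBPP": 90.2,
    "MMLU": 75.1,
    "MMLU-Pro": 50.4,
    "MMLU-Redux": 77.5,
    "TheoremQA": 43.1,
    "TruthfulQA": 54.2,
    "Winogrande": 80.8,
}

# Hypothetical grouping, chosen for illustration only; the source does not
# assign benchmarks to categories.
GROUPS = {
    "code": ["BigCodeBench-Full", "BigCodeBench-Hard", "HumanEval",
             "LiveCodeBench", "MBPP"],
    "math": ["GSM8k", "MATH", "TheoremQA"],
    "general": ["ARC-C", "HellaSwag", "MMLU", "MMLU-Pro", "MMLU-Redux",
                "TruthfulQA", "Winogrande"],
}


def group_means(scores, groups):
    """Return the unweighted mean score per group, rounded to one decimal."""
    return {
        name: round(mean(scores[b] for b in members), 1)
        for name, members in groups.items()
    }


if __name__ == "__main__":
    for name, value in group_means(SCORES, GROUPS).items():
        print(f"{name}: {value:.1f}%")
```

Within the assumed "code" group the spread is wide: HumanEval and MBPP sit above 90%, while BigCodeBench-Hard and LiveCodeBench sit near 30%, so the group mean hides most of the detail.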