Qwen2.5-Coder 32B Instruct

Qwen2.5-Coder is a code-specialized model trained on 5.5 trillion tokens of code data. It supports 92 programming languages and a 128K-token context window. It excels at code generation, completion, and repair across multiple programming languages while maintaining strong performance in mathematics and general capabilities.

Benchmark results

Benchmark          Score   Tags           Source
ARC-C              70.5%   self-reported  llm-stats
BigCodeBench-Full  49.6%   self-reported  llm-stats
BigCodeBench-Hard  27.0%   self-reported  llm-stats
GSM8k              91.1%   self-reported  llm-stats
HellaSwag          83.0%   self-reported  llm-stats
HumanEval          92.7%   self-reported  llm-stats
LiveCodeBench      31.4%   self-reported  llm-stats
MATH               57.2%   self-reported  llm-stats
MBPP               90.2%   self-reported  llm-stats
MMLU               75.1%   self-reported  llm-stats
MMLU-Pro           50.4%   self-reported  llm-stats
MMLU-Redux         77.5%   self-reported  llm-stats
TheoremQA          43.1%   self-reported  llm-stats
TruthfulQA         54.2%   self-reported  llm-stats
Winogrande         80.8%   self-reported  llm-stats
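
As a rough aid for reading the table against the description's claim about code, mathematics, and general capabilities, the sketch below groups the listed scores and averages each group. The grouping into "code", "math", and "general" is an assumption made for illustration and does not come from the source. The helper name `group_means` is hypothetical. An unweighted mean across different benchmarks is only a coarse summary, not a comparable metric.

```python
from statistics import mean

# Scores copied from the benchmark table (percent, self-reported).
SCORES = {
    "ARC-C": 70.5,
    "BigCodeBench-Full": 49.6,
    "BigCodeBench-Hard": 27.0,
    "GSM8k": 91.1,
    "HellaSwag": 83.0,
    "HumanEval": 92.7,
    "LiveCodeBench": 31.4,
    "MATH": 57.2,
    "MBPP": 90.2,
    "MMLU": 75.1,
    "MMLU-Pro": 50.4,
    "MMLU-Redux": 77.5,
    "TheoremQA": 43.1,
    "TruthfulQA": 54.2,
    "Winogrande": 80.8,
}

# Assumed grouping, chosen for illustration only; not defined by the source.
GROUPS = {
    "code": ["BigCodeBench-Full", "BigCodeBench-Hard", "HumanEval",
             "LiveCodeBench", "MBPP"],
    "math": ["GSM8k", "MATH", "TheoremQA"],
    "general": ["ARC-C", "HellaSwag", "MMLU", "MMLU-Pro", "MMLU-Redux",
                "TruthfulQA", "Winogrande"],
}


def group_means(scores, groups):
    """Return the unweighted mean score (in percent) for each group."""
    return {
        name: mean(scores[benchmark] for benchmark in members)
        for name, members in groups.items()
    }


if __name__ == "__main__":
    for name, value in group_means(SCORES, GROUPS).items():
        print(f"{name:8s} {value:5.1f}%")
```

Keeping the scores in a plain dictionary makes it easy to regroup the benchmarks differently or to compare against another model's numbers in the same shape.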