Qwen2.5 14B Instruct

Qwen2.5-14B-Instruct is an instruction-tuned 14.7B parameter language model, part of the Qwen2.5 series. It features significant improvements in instruction following, long text generation (8K+ tokens), structured data understanding, and JSON output generation. The model supports a 128K token context length and multilingual capabilities across 29+ languages including Chinese, English, French, Spanish, and more.

Benchmark results

Benchmark Score Tags Source
ARC-C 67.3% self-reported llm-stats link →
BBH 78.2% self-reported llm-stats link →
GPQA 45.5% self-reported llm-stats link →
GSM8k 94.8% self-reported llm-stats link →
HumanEval 83.5% self-reported llm-stats link →
HumanEval+ 51.2% self-reported llm-stats link →
MATH 80.0% self-reported llm-stats link →
MBPP 82.0% self-reported llm-stats link →
MBPP+ 63.2% self-reported llm-stats link →
MMLU 79.7% self-reported llm-stats link →
MMLU-Pro 63.7% self-reported llm-stats link →
MMLU-Redux 80.0% self-reported llm-stats link →
MMLU-STEM 76.4% self-reported llm-stats link →
MultiPL-E 72.8% self-reported llm-stats link →
TheoremQA 43.0% self-reported llm-stats link →
TruthfulQA 58.4% self-reported llm-stats link →