Qwen3.5-27B

Qwen3.5-27B is a dense multimodal foundation model with 27 billion parameters. It is an open-weight model with a native context window of 262K tokens, and it combines strong performance in reasoning, coding, multilingual tasks, long-context tasks, and visual understanding.

Benchmark results

Benchmark             Score   Tags           Source
AA-LCR                66.1%   self-reported  llm-stats
AI2D                  92.9%   self-reported  llm-stats
AndroidWorld_SR       64.2%   self-reported  llm-stats
BabyVision            44.6%   self-reported  llm-stats
BFCL-V4               68.5%   self-reported  llm-stats
BrowseComp            61.0%   self-reported  llm-stats
BrowseComp-zh         62.1%   self-reported  llm-stats
C-Eval                90.5%   self-reported  llm-stats
CC-OCR                81.0%   self-reported  llm-stats
CharXiv-R             79.5%   self-reported  llm-stats
CodeForces            80.7%   self-reported  llm-stats
CountBench            97.8%   self-reported  llm-stats
DeepPlanning          22.6%   self-reported  llm-stats
DynaMath              87.7%   self-reported  llm-stats
EmbSpatialBench       84.5%   self-reported  llm-stats
ERQA                  60.5%   self-reported  llm-stats
FullStackBench en     60.1%   self-reported  llm-stats
FullStackBench zh     57.4%   self-reported  llm-stats
Global PIQA           87.5%   self-reported  llm-stats
GPQA                  85.5%   self-reported  llm-stats
Hallusion Bench       70.0%   self-reported  llm-stats
HMMT 2025             92.0%   self-reported  llm-stats
HMMT25                89.8%   self-reported  llm-stats
Humanity's Last Exam  48.5%   self-reported  llm-stats
Hypersim              13.0%   self-reported  llm-stats
IFBench               76.5%   self-reported  llm-stats
IFEval                95.0%   self-reported  llm-stats
Include               81.6%   self-reported  llm-stats
LingoQA               82.0%   self-reported  llm-stats
LiveCodeBench v6      80.7%   self-reported  llm-stats
LongBench v2          60.6%   self-reported  llm-stats
LVBench               73.6%   self-reported  llm-stats
MathVision            86.0%   self-reported  llm-stats
MathVista-Mini        87.8%   self-reported  llm-stats
MAXIFE                88.0%   self-reported  llm-stats
MedXpertQA            62.4%   self-reported  llm-stats
MLVU                  85.9%   self-reported  llm-stats
MMBench-V1.1          92.6%   self-reported  llm-stats
MMLongBench-Doc       60.2%   self-reported  llm-stats
MMLU-Pro              86.1%   self-reported  llm-stats
MMLU-ProX             82.2%   self-reported  llm-stats
MMLU-Redux            93.2%   self-reported  llm-stats
MMMLU                 85.9%   self-reported  llm-stats
MMMU                  82.3%   self-reported  llm-stats
MMMU-Pro              75.0%   self-reported  llm-stats
MMStar                81.0%   self-reported  llm-stats
MMVU                  73.3%   self-reported  llm-stats
Multi-Challenge       60.8%   self-reported  llm-stats
MVBench               74.6%   self-reported  llm-stats
NOVA-63               58.1%   self-reported  llm-stats
Nuscene               15.2%   self-reported  llm-stats
OCRBench              89.4%   self-reported  llm-stats
ODinW                 41.1%   self-reported  llm-stats
OJBench               40.1%   self-reported  llm-stats
OmniDocBench 1.5      88.9%   self-reported  llm-stats
OSWorld-Verified      56.2%   self-reported  llm-stats
PMC-VQA               62.4%   self-reported  llm-stats
PolyMATH              71.2%   self-reported  llm-stats
RealWorldQA           83.7%   self-reported  llm-stats
RefCOCO-avg           90.9%   self-reported  llm-stats
RefSpatialBench       67.7%   self-reported  llm-stats
ScreenSpot Pro        70.3%   self-reported  llm-stats
Seal-0                47.2%   self-reported  llm-stats
SimpleVQA             56.0%   self-reported  llm-stats
SlakeVQA              80.0%   self-reported  llm-stats
SUNRGBD               35.4%   self-reported  llm-stats
SuperGPQA             65.6%   self-reported  llm-stats
SWE-Bench Verified    72.4%   self-reported  llm-stats
t2-bench              79.0%   self-reported  llm-stats
Terminal-Bench 2.0    41.6%   self-reported  llm-stats
TIR-Bench             59.8%   self-reported  llm-stats
V*                    93.7%   self-reported  llm-stats
VideoMME w sub.       87.0%   self-reported  llm-stats
VideoMME w/o sub.     82.8%   self-reported  llm-stats
VideoMMMU             82.3%   self-reported  llm-stats
VITA-Bench            41.9%   self-reported  llm-stats
VLMsAreBlind          96.9%   self-reported  llm-stats
WideSearch            61.1%   self-reported  llm-stats
WMT24++               77.6%   self-reported  llm-stats
ZEROBench             10.0%   self-reported  llm-stats
ZEROBench-Sub         36.2%   self-reported  llm-stats