Include

Categories: general
Modality: text
Language: en
Multilingual: No
Max score: 1
Scoring: %, higher is better
Verified by llm-stats: No
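The metadata above lists a maximum score of 1, while scoring is reported as a percentage. A minimal sketch of that conversion follows, assuming the percentage is simply the raw score divided by the maximum and multiplied by 100; the source does not state this, and the function name `to_percent` is hypothetical.

```python
def to_percent(raw_score: float, max_score: float = 1.0) -> str:
    """Format a raw score as a percentage of max_score with one decimal place.

    Assumption: percentage = raw_score / max_score * 100 (not confirmed by the source).
    """
    if not 0 <= raw_score <= max_score:
        raise ValueError("raw_score must lie between 0 and max_score")
    return f"{raw_score / max_score * 100:.1f}%"


# Example: a raw score of 0.876 on a 0-to-1 scale
print(to_percent(0.876))  # 87.6%
```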

Include benchmark: specific documentation not found in official sources.

Leaderboard

Showing 20 of 30 results

Claude Opus 4.8: 87.6%
Qwen3.7 Max: 86.2%
Qwen3.5-397B-A17B: 85.6%
Qwen3.6 Plus: 85.1%
Qwen3.5-122B-A10B: 82.8%
Qwen3.5-27B: 81.6%
Qwen3-235B-A22B-Thinking-2507: 81.0%
Qwen3 VL 235B A22B Instruct: 80.0%
Qwen3 VL 235B A22B Thinking: 80.0%
Qwen3.5-35B-A3B: 79.7%
Qwen3-235B-A22B-Instruct-2507: 79.5%
Qwen3-Next-80B-A3B-Instruct: 78.9%
Qwen3-Next-80B-A3B-Thinking: 78.9%
Qwen3 VL 32B Thinking: 76.3%
Qwen3.5-9B: 75.6%
Qwen3 VL 30B A3B Thinking: 74.5%
Qwen3 VL 32B Instruct: 74.0%
Qwen3 235B A22B: 73.5%
Qwen3 VL 30B A3B Instruct: 71.6%
Qwen3.5-4B: 71.0%

Content licensed CC BY-SA 4.0.