CSimpleQA

language

Chinese SimpleQA is the first comprehensive Chinese benchmark for evaluating the factuality of language models when answering short questions. It contains 3,000 high-quality questions spanning 6 major topics and 99 diverse subtopics, and is designed to assess Chinese factual knowledge across humanities, science, engineering, culture, and society.

Leaderboard

Model                            Score
DeepSeek-V4-Pro-Max              84.4%
Qwen3-235B-A22B-Instruct-2507    84.3%
Qwen3 VL 235B A22B Instruct      83.4%
DeepSeek-V4-Flash-Max            78.9%
Kimi K2 Instruct                 78.4%
Kimi K2 Base                     77.6%
DeepSeek-V3                      64.8%