CLUEWSC


CLUEWSC2020 is the Chinese version of the Winograd Schema Challenge, part of the CLUE benchmark. It focuses on pronoun disambiguation and coreference resolution, requiring models to determine which noun a pronoun refers to in a sentence. The dataset contains 1,244 training samples and 304 development samples extracted from contemporary Chinese literature.
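To make the task concrete, the sketch below represents one item as a sentence, a candidate noun, a pronoun, and a true/false label saying whether the pronoun refers to that noun. The `WscExample` class, its field names, and the sentence itself are hypothetical illustrations, not taken from the dataset or from any official loader.

```python
from dataclasses import dataclass


@dataclass
class WscExample:
    """Hypothetical container for one pronoun-disambiguation item."""

    text: str           # full sentence
    noun: str           # candidate antecedent
    noun_index: int     # character offset of the noun in `text`
    pronoun: str        # pronoun to resolve
    pronoun_index: int  # character offset of the pronoun in `text`
    label: bool         # True if the pronoun refers to the noun

    def spans_are_valid(self) -> bool:
        """Check that both spans occur at the stated character offsets."""
        return (
            self.text.startswith(self.noun, self.noun_index)
            and self.text.startswith(self.pronoun, self.pronoun_index)
        )


# Invented sentence: "Xiao Ming put the book on the table because it was too heavy."
example = WscExample(
    text="小明把书放在桌子上，因为它太重了。",
    noun="书",       # "book"
    noun_index=3,
    pronoun="它",    # "it"
    pronoun_index=12,
    label=True,      # here "it" is read as referring to the book
)
```

Pairing the same sentence with the other candidate noun (桌子, "table") would yield a second item with `label=False`, which is how a single sentence can produce both positive and negative examples in this framing.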

Methodology

Imported from llm-stats public benchmark metadata. Modality: text. Max score: 1. Categories: language, reasoning. Language: en. Verified by llm-stats: no.
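The metadata gives a maximum score of 1, while the leaderboard shows percentages. A minimal sketch of how the two could relate, assuming the score is plain accuracy over true/false predictions (the source does not state the metric, and the helper names `accuracy` and `as_percent` are hypothetical):

```python
def accuracy(predictions: list[bool], labels: list[bool]) -> float:
    """Fraction of predictions that match the labels, in the range 0 to 1."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("no examples to score")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def as_percent(score: float) -> str:
    """Format a 0-to-1 score as a percentage with one decimal place."""
    return f"{score * 100:.1f}%"


# Toy inputs, not real model outputs: three of four predictions are correct.
score = accuracy([True, False, True, True], [True, True, True, True])
print(as_percent(score))
```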

Leaderboard

  1. Kimi-k1.5: 91.4% (self-reported, llm-stats)
  2. DeepSeek-V3: 90.9% (self-reported, llm-stats)
  3. ERNIE 4.5: 48.6% (self-reported, llm-stats)