MRCR v2

MRCR v2 (Multi-Round Coreference Resolution, version 2) is a synthetic long-context reasoning benchmark. It extends the original MRCR framework with improved evaluation criteria and additional complexity, testing a model's ability to maintain attention and reason across extended contexts.

Methodology

Imported from llm-stats public benchmark metadata. Modality: text. Max score: 1. Categories: general, long_context, reasoning. Language: en. Verified by llm-stats: no.
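
The metadata gives a max score of 1, while the leaderboard shows percentages. A minimal sketch of that relationship, assuming the displayed percentage is simply the raw score divided by the max score (the page does not confirm this). The `BENCHMARK` dictionary and the `to_percent` helper are hypothetical names used for illustration, and the raw value 0.664 is an illustrative input, not a figure taken from the source.

```python
# Metadata fields copied from the Methodology line; the dict layout is an assumption.
BENCHMARK = {
    "name": "MRCR v2",
    "modality": "text",
    "max_score": 1.0,
    "categories": ["general", "long_context", "reasoning"],
    "language": "en",
    "verified": False,
}


def to_percent(raw_score: float, max_score: float = BENCHMARK["max_score"]) -> float:
    """Convert a raw score in [0, max_score] to a percentage with one decimal place.

    Assumes percentage = 100 * raw_score / max_score (hypothetical convention).
    """
    if not 0.0 <= raw_score <= max_score:
        raise ValueError(f"raw_score must lie in [0, {max_score}], got {raw_score}")
    return round(100.0 * raw_score / max_score, 1)


if __name__ == "__main__":
    # Illustrative raw score on the 0-to-1 scale.
    print(f"{to_percent(0.664)}%")
```

Under this assumption, scores outside the 0-to-1 range are rejected rather than clipped, so a mis-scaled input (for example 66.4 passed as a raw score) fails loudly.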

Leaderboard

  1. Gemma 4 31B: 66.4% (self-reported, llm-stats)
  2. Gemma 4 26B-A4B: 44.1% (self-reported, llm-stats)
  3. Gemma 4 E4B: 25.4% (self-reported, llm-stats)
  4. Gemma 4 E2B: 19.1% (self-reported, llm-stats)
  5. Gemini 2.5 Flash-Lite: 16.6% (self-reported, llm-stats)