MRCR

reasoning official site →

MRCR (Multi-Round Coreference Resolution) is a synthetic long-context reasoning task where models must navigate long conversations to reproduce specific model outputs. It tests the ability to distinguish between similar requests and reason about ordering while maintaining attention across extended contexts.

Methodology

Imported from llm-stats public benchmark metadata. Modality: text. Max score: 1. Categories: general, long_context, reasoning. Language: en. Verified by llm-stats: no.

Leaderboard

  1. Gemini 2.5 Pro self-reported llm-stats
    93.0%
  2. Gemini 1.5 Pro self-reported llm-stats
    82.6%
  3. Gemini 1.5 Flash self-reported llm-stats
    71.9%
  4. Gemini 2.0 Flash self-reported llm-stats
    69.2%
  5. Gemini 1.5 Flash 8B self-reported llm-stats
    54.7%
  6. MiMo-V2-Flash self-reported llm-stats
    45.7%
  7. Gemini 2.5 Flash self-reported llm-stats
    32.0%