MEGA TyDi QA


This entry covers TyDi QA as included in the MEGA benchmark suite. TyDi QA is a question answering dataset of 204K question-answer pairs covering 11 typologically diverse languages (Arabic, Bengali, English, Finnish, Indonesian, Japanese, Korean, Russian, Swahili, Telugu, and Thai). Its questions are realistic and information-seeking: they were written by people who wanted to know the answer but did not yet know it.

Methodology

Imported from llm-stats public benchmark metadata. Modality: text. Max score: 1. Categories: language, reasoning. Language: en. Verified by llm-stats: no.
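The metadata gives a maximum score of 1, while the leaderboard reports percentages. A minimal sketch of that conversion follows, assuming the percentages are simply the raw score divided by the maximum and multiplied by 100. The function name `to_percent` and the raw values 0.671 and 0.622 are illustrative assumptions, not taken from llm-stats.

```python
def to_percent(score: float, max_score: float = 1.0) -> str:
    """Format a raw score on a 0..max_score scale as a percentage string.

    Hypothetical helper: assumes leaderboard percentages are
    100 * score / max_score, rounded to one decimal place.
    """
    if max_score <= 0:
        raise ValueError("max_score must be positive")
    if not 0.0 <= score <= max_score:
        raise ValueError("score must lie between 0 and max_score")
    return f"{100.0 * score / max_score:.1f}%"


# Assumed raw scores, back-derived from the leaderboard percentages.
for model, raw in [("Phi-3.5-MoE-instruct", 0.671),
                   ("Phi-3.5-mini-instruct", 0.622)]:
    print(model, to_percent(raw))
```

Keeping the maximum score as a parameter lets the same helper handle benchmarks reported on a different scale.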

Leaderboard

  1. Phi-3.5-MoE-instruct: 67.1% (self-reported; source: llm-stats)
  2. Phi-3.5-mini-instruct: 62.2% (self-reported; source: llm-stats)