MEGA MLQA

MLQA as evaluated within the MEGA (Multilingual Evaluation of Generative AI) benchmark suite. MLQA is a multi-way aligned extractive QA benchmark for cross-lingual question answering across 7 languages (English, Arabic, German, Spanish, Hindi, Vietnamese, and Simplified Chinese), with over 12K QA instances in English and 5K in each of the other languages.

Methodology

Imported from llm-stats public benchmark metadata. Modality: text. Max score: 1. Categories: language, reasoning. Language: en. Verified by llm-stats: no.
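
The metadata above does not say which metric produces the score. Extractive QA is commonly scored with exact match and token-level F1, each on a 0 to 1 scale, which would be consistent with "Max score: 1". The sketch below illustrates those two metrics under simplifying assumptions: the function names are hypothetical, and normalization is reduced to lowercasing and whitespace splitting, which is not adequate for languages such as Chinese. It is not the scorer used by MEGA or llm-stats.

```python
from collections import Counter


def normalize(text: str) -> list[str]:
    # Simplified normalization (an assumption): lowercase, split on whitespace.
    # A real multilingual scorer would need language-specific handling.
    return text.lower().split()


def exact_match(prediction: str, reference: str) -> float:
    # 1.0 if the normalized token sequences are identical, else 0.0.
    return float(normalize(prediction) == normalize(reference))


def token_f1(prediction: str, reference: str) -> float:
    # Harmonic mean of token-level precision and recall.
    pred = normalize(prediction)
    ref = normalize(reference)
    if not pred or not ref:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred == ref)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

For example, the prediction "the Eiffel Tower" against the reference "Eiffel Tower" shares two tokens, giving precision 2/3, recall 1, and an F1 of 0.8, while its exact match is 0. Averaging such per-question scores over a test set and multiplying by 100 would yield a percentage like those in the leaderboard.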

Leaderboard

  1. Phi-3.5-MoE-instruct: 65.3% (self-reported, via llm-stats)
  2. Phi-3.5-mini-instruct: 61.7% (self-reported, via llm-stats)