Multilingual MMLU

reasoning

MMLU-ProX is a multilingual benchmark that extends MMLU-Pro to 29 typologically diverse languages. Every language version contains the same 11,829 questions, so scores can be compared directly across languages. Each question is reasoning-focused and offers 10 answer choices, and the benchmark is intended to measure how well large language models' reasoning holds up across linguistic and cultural boundaries.
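As a rough illustration of how a 10-choice benchmark with identical questions per language might be scored, the sketch below computes per-language accuracy and an unweighted average over languages. The page does not say how the leaderboard percentage is aggregated, so the averaging step is an assumption, and the function names, the A-J labels, and the toy records are hypothetical rather than taken from the benchmark's tooling.

```python
from collections import defaultdict

# Ten answer options labelled A-J (assumed labelling for the 10-choice format).
CHOICES = "ABCDEFGHIJ"
CHANCE_ACCURACY = 1 / len(CHOICES)  # expected accuracy of uniform random guessing


def per_language_accuracy(records):
    """Return {language: accuracy} from (language, predicted, gold) tuples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for lang, predicted, gold in records:
        if gold not in CHOICES:
            raise ValueError(f"unexpected gold label: {gold!r}")
        total[lang] += 1
        correct[lang] += int(predicted == gold)
    return {lang: correct[lang] / total[lang] for lang in total}


def macro_average(accuracies):
    """Unweighted mean over languages (an assumed aggregation, not confirmed
    by the source). It equals the pooled accuracy when every language has the
    same number of questions."""
    return sum(accuracies.values()) / len(accuracies)


# Hypothetical toy records, not real benchmark data.
records = [
    ("en", "A", "A"),
    ("en", "C", "B"),
    ("sw", "J", "J"),
    ("sw", "D", "D"),
]
scores = per_language_accuracy(records)
print(scores)
print(macro_average(scores))
```

Because each language shares the same question set, a gap between two per-language accuracies reflects the language rather than a difference in question difficulty, and any score can be read against the 10% chance baseline.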

Leaderboard

o3-mini: 80.7%
Ministral 3 (14B Base 2512): 74.2%
Ministral 3 (8B Base 2512): 70.6%
Ministral 3 (3B Base 2512): 65.2%
Phi 4 Mini: 49.3%