WMT24++

language

WMT24++ is a multilingual machine translation benchmark that extends the WMT24 dataset to 55 languages and dialects. It provides human-written references and post-edits across four domains (literary, news, social, and speech) for evaluating machine translation systems and large language models.

Leaderboard

Model                                     Score
Nemotron 3 Super (120B A12B)              86.7%
Nemotron 3 Nano (30B A3B)                 86.2%
Qwen3.7 Max                               85.8%
Qwen3.6 Plus                              84.3%
Qwen3.5-397B-A17B                         78.9%
Qwen3.5-122B-A10B                         78.3%
Qwen3.5-27B                               77.6%
Qwen3.5-35B-A3B                           76.3%
Qwen3.5-9B                                72.6%
Qwen3.5-4B                                66.6%
Gemma 3 27B                               53.4%
Gemma 3 12B                               51.6%
Gemma 3n E4B Instructed                   50.1%
Gemma 3n E4B Instructed LiteRT Preview    50.1%
Gemma 3 4B                                46.8%
Qwen3.5-2B                                45.8%
Gemma 3n E2B Instructed                   42.7%
Gemma 3n E2B Instructed LiteRT (Preview)  42.7%
Gemma 3 1B                                35.9%
Qwen3.5-0.8B                              27.2%