WMT23

WMT23 is the benchmark from the Eighth Conference on Machine Translation. It evaluates machine translation systems across 8 language pairs (14 translation directions), covering general, biomedical, literary, and low-resource translation tasks. It also features specialized shared tasks for quality estimation, metrics evaluation, sign language translation, and discourse-level literary translation, with professional human assessment.

Methodology

Imported from llm-stats public benchmark metadata. Modality: text. Max score: 1. Categories: healthcare, language. Language: en. Verified by llm-stats: no.

Leaderboard

  1. Gemini 1.5 Pro: 75.1% (self-reported, via llm-stats)
  2. Gemini 1.5 Flash: 74.1% (self-reported, via llm-stats)
  3. Gemini 1.5 Flash 8B: 72.6% (self-reported, via llm-stats)
  4. Gemini 1.0 Pro: 71.7% (self-reported, via llm-stats)
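
The methodology lists a max score of 1, while the leaderboard shows percentages. A minimal sketch of how the two could relate, assuming each displayed percentage is simply a raw score divided by the max score. The helper name `to_percent` and the raw values below are hypothetical, back-computed from the leaderboard under that assumption; they are not taken from llm-stats.

```python
def to_percent(raw_score: float, max_score: float = 1.0) -> str:
    """Format a raw score on a 0..max_score scale as a one-decimal percentage."""
    if max_score <= 0:
        raise ValueError("max_score must be positive")
    return f"{100 * raw_score / max_score:.1f}%"


# Hypothetical raw scores, assuming percentage = raw / max_score (max_score = 1).
raw_scores = {
    "Gemini 1.5 Pro": 0.751,
    "Gemini 1.5 Flash": 0.741,
    "Gemini 1.5 Flash 8B": 0.726,
    "Gemini 1.0 Pro": 0.717,
}

# Print the entries in descending score order, numbered like the leaderboard.
ranked = sorted(raw_scores.items(), key=lambda item: item[1], reverse=True)
for rank, (model, score) in enumerate(ranked, start=1):
    print(f"{rank}. {model}: {to_percent(score)}")
```

The source does not state which translation metric these scores represent, so the sketch only covers the scale conversion, not how the scores were computed.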