Translation Set1→en COMET22


COMET-22 is a neural metric for evaluating machine translation quality. It is an ensemble of two models: a COMET estimator trained on Direct Assessment (DA) scores, and a multitask model that predicts sentence-level scores together with word-level OK/BAD tags. It is reported to correlate better with human judgments and to be more robust to critical errors than earlier metrics.
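As a rough illustration of how a COMET-style score is typically obtained, the sketch below prepares source/translation/reference triples and scores them with the open-source `unbabel-comet` package. It rests on assumptions not stated on this page: that the public `Unbabel/wmt22-comet-da` checkpoint is an acceptable stand-in for the metric (it may not reproduce the full ensemble described above), and that the helper names `build_comet_inputs` and `score_with_comet` and the sample sentences are hypothetical. It does not claim to be the procedure behind the leaderboard numbers.

```python
from typing import Dict, List


def build_comet_inputs(
    sources: List[str], hypotheses: List[str], references: List[str]
) -> List[Dict[str, str]]:
    """Pack parallel lists into the src/mt/ref dicts that COMET models consume."""
    if not (len(sources) == len(hypotheses) == len(references)):
        raise ValueError("sources, hypotheses and references must have equal length")
    return [
        {"src": s, "mt": m, "ref": r}
        for s, m, r in zip(sources, hypotheses, references)
    ]


def score_with_comet(
    samples: List[Dict[str, str]],
    checkpoint: str = "Unbabel/wmt22-comet-da",  # assumed stand-in checkpoint
    batch_size: int = 8,
    gpus: int = 0,
) -> float:
    """Return the corpus-level score for the given samples.

    Requires `pip install unbabel-comet` and downloads the checkpoint on first
    use, so the import is kept local to this function.
    """
    from comet import download_model, load_from_checkpoint

    model = load_from_checkpoint(download_model(checkpoint))
    output = model.predict(samples, batch_size=batch_size, gpus=gpus)
    return output.system_score


# Hypothetical German-to-English example; not taken from the benchmark data.
samples = build_comet_inputs(
    sources=["Das Wetter ist heute schön."],
    hypotheses=["The weather is nice today."],
    references=["The weather is beautiful today."],
)
```

Calling `score_with_comet(samples)` would then return a single corpus-level value for the samples, while `build_comet_inputs` can be checked on its own without downloading any model.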

Methodology

Imported from llm-stats public benchmark metadata. Modality: text. Max score: 1. Categories: language. Language: en. Verified by llm-stats: no.

Leaderboard

  1. Nova Pro: 89.0% (self-reported, llm-stats)
  2. Nova Lite: 88.8% (self-reported, llm-stats)
  3. Nova Micro: 88.7% (self-reported, llm-stats)