CoVoST2


CoVoST 2 is a large-scale multilingual speech translation corpus derived from Common Voice. It covers translation from 21 languages into English and from English into 15 languages, and contains 2,880 hours of speech from 78K speakers.

Methodology

Imported from llm-stats public benchmark metadata. Modality: audio. Max score: 1. Categories: audio, language, speech_to_text. Language: en. Verified by llm-stats: no.

Leaderboard

  1. Nova 2 Omni: 40.7% (self-reported, source: llm-stats)
  2. Gemini 2.0 Flash: 39.2% (self-reported, source: llm-stats)
  3. Gemini 2.0 Flash-Lite: 38.4% (self-reported, source: llm-stats)
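
The percentages above fit the metadata's maximum score of 1 if scores are stored on a 0 to 1 scale and multiplied by 100 for display. The source does not confirm this, so the sketch below treats it as an assumption. The helper name `to_percent` and the stored values (0.407, 0.392, 0.384) are hypothetical, inferred from the displayed figures.

```python
# Minimal sketch. ASSUMPTION: scores are stored on a 0-1 scale
# ("Max score: 1") and rendered as percentages on the leaderboard.
# The stored values below are inferred from the displayed percentages,
# not taken from any published data file.

def to_percent(score: float, max_score: float = 1.0) -> str:
    """Format a raw score as a one-decimal percentage of max_score."""
    if not 0 <= score <= max_score:
        raise ValueError(f"score {score} is outside [0, {max_score}]")
    return f"{score / max_score * 100:.1f}%"

# Hypothetical stored scores for the three listed entries.
entries = {
    "Gemini 2.0 Flash": 0.392,
    "Nova 2 Omni": 0.407,
    "Gemini 2.0 Flash-Lite": 0.384,
}

# Rank entries from highest to lowest score.
ranked = sorted(entries.items(), key=lambda kv: kv[1], reverse=True)

for rank, (model, score) in enumerate(ranked, start=1):
    print(f"{rank}. {model}: {to_percent(score)}")
```

Sorting on the raw score rather than on the formatted string keeps the ranking independent of how the percentage is rounded for display.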