MMAU


MMAU is a massive multi-task audio understanding and reasoning benchmark comprising 10,000 carefully curated audio clips paired with human-annotated natural language questions spanning speech, environmental sounds, and music. Its tasks require expert-level knowledge and complex reasoning across 27 distinct skills.

Methodology

Imported from llm-stats public benchmark metadata. Modality: multimodal. Max score: 1. Categories: audio, multimodal, reasoning. Language: en. Verified by llm-stats: no.

Leaderboard

  1. Nova 2 Omni: 75.3% (self-reported, via llm-stats)
  2. Qwen2.5-Omni-7B: 65.6% (self-reported, via llm-stats)
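
The metadata lists a maximum score of 1, while the leaderboard shows percentages. A minimal sketch of how the two scales might relate, assuming a simple linear mapping (this page does not confirm that assumption; the names `to_fraction` and `MAX_SCORE` are illustrative, not part of any llm-stats API):

```python
# Leaderboard entries as listed above: (model, score in percent).
leaderboard = [
    ("Nova 2 Omni", 75.3),
    ("Qwen2.5-Omni-7B", 65.6),
]

MAX_SCORE = 1.0  # "Max score: 1" from the benchmark metadata


def to_fraction(percent: float, max_score: float = MAX_SCORE) -> float:
    """Map a percentage onto the 0..max_score scale (assumed linear)."""
    return round(percent / 100.0 * max_score, 4)


# Scores on the 0..1 scale, keyed by model name.
normalized = {model: to_fraction(pct) for model, pct in leaderboard}

# Gap between the two listed entries, in percentage points.
gap_points = round(leaderboard[0][1] - leaderboard[1][1], 1)

for model, score in normalized.items():
    print(f"{model}: {score}")
print(f"gap: {gap_points} percentage points")
```

Both scores are self-reported and unverified by llm-stats, so the gap is indicative only.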