Big Bench Audio

Big Bench Audio is an audio reasoning benchmark adapted from a subset of BIG-Bench Hard, with the text questions converted to spoken audio. It evaluates the reasoning ability of speech-to-speech and audio language models on tasks delivered as audio input. Accuracy is scored independently by Artificial Analysis.

Methodology

Imported from llm-stats public benchmark metadata. Modality: audio. Max score: 1. Categories: audio, reasoning. Language: en. Verified by llm-stats: no.
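
The metadata lists a maximum score of 1, while the leaderboard shows percentages, which suggests scores are stored as a fraction of correct answers. The sketch below illustrates that relationship under stated assumptions: the exact-match comparison and the function names `accuracy` and `as_percent` are hypothetical and are not taken from llm-stats or Artificial Analysis, whose actual grading procedure may differ.

```python
def accuracy(predictions, references):
    """Return the fraction of predictions matching their reference, in [0, 1].

    Assumption: answers are compared by case-insensitive exact match
    after trimming whitespace. The real benchmark grader may differ.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("at least one question is required")
    correct = sum(
        p.strip().lower() == r.strip().lower()
        for p, r in zip(predictions, references)
    )
    return correct / len(references)


def as_percent(score, max_score=1.0):
    """Format a score on a 0..max_score scale as a one-decimal percentage."""
    return f"{100 * score / max_score:.1f}%"


if __name__ == "__main__":
    # Toy data for illustration only, not real benchmark items.
    preds = ["A", "b", "c", "d"]
    refs = ["a", "B", "c", "x"]
    print(as_percent(accuracy(preds, refs)))
```

With a maximum score of 1, a stored score of 0.87 would be displayed as 87.0%.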

Leaderboard

  1. Nova 2 Sonic: 87.0% (self-reported; source: llm-stats)