VoiceBench Avg

reasoning

VoiceBench is the first benchmark designed for multi-faceted evaluation of LLM-based voice assistants. It covers general knowledge, instruction following, reasoning, and safety, using both synthetic and real spoken instructions that vary in speaker characteristics and environmental conditions.

Leaderboard


Qwen2.5-Omni-7B: 74.1%
