MM-Vet

math

MM-Vet is an evaluation benchmark that tests large multimodal models on complex multimodal tasks requiring integrated capabilities. It assesses six core vision-language capabilities (recognition, knowledge, spatial awareness, language generation, OCR, and math) through questions that each require one or more of these capabilities.
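Because a single question can require several capabilities at once, a per-capability breakdown counts each graded answer toward every capability it is tagged with. The following is a minimal sketch of that bookkeeping, not the benchmark's official scoring code. The sample data, the assumption that each answer is graded in the range 0 to 1, and the function names `per_capability_scores` and `overall_score` are all hypothetical and chosen for illustration.

```python
from collections import defaultdict

# The six capabilities named in the description above.
CAPABILITIES = {
    "recognition", "knowledge", "spatial awareness",
    "language generation", "OCR", "math",
}

# Hypothetical graded samples: (capabilities required, score).
# Assumption: each answer has already been graded on a 0-to-1 scale.
samples = [
    ({"recognition"}, 1.0),
    ({"OCR", "math"}, 0.5),
    ({"recognition", "knowledge", "language generation"}, 0.0),
    ({"OCR", "spatial awareness"}, 1.0),
]

def per_capability_scores(samples):
    """Average score (as a percentage) over the samples tagged with each capability."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for caps, score in samples:
        unknown = caps - CAPABILITIES
        if unknown:
            raise ValueError(f"unknown capabilities: {sorted(unknown)}")
        # A sample contributes to every capability it requires.
        for cap in caps:
            totals[cap] += score
            counts[cap] += 1
    return {cap: 100.0 * totals[cap] / counts[cap] for cap in totals}

def overall_score(samples):
    """Average score (as a percentage) over all samples, each counted once."""
    return 100.0 * sum(score for _, score in samples) / len(samples)

if __name__ == "__main__":
    print(f"overall: {overall_score(samples):.1f}%")
    for cap, pct in sorted(per_capability_scores(samples).items()):
        print(f"{cap}: {pct:.1f}%")
```

Under this scheme the per-capability averages do not sum or average to the overall figure, since a multi-capability question is counted once overall but once per capability in the breakdown.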

Leaderboard


Qwen2.5 VL 72B Instruct: 76.2%
Qwen2.5 VL 7B Instruct: 67.1%