MMVetGPT4Turbo
MM-Vet evaluation scored by GPT-4 Turbo. This variant of MM-Vet evaluates large multimodal models on complex multimodal tasks that require integrating six core vision-language capabilities: recognition, knowledge, spatial awareness, language generation, OCR, and math.
Methodology
Imported from llm-stats public benchmark metadata. Modality: multimodal. Max score: 1. Categories: general, math, multimodal, reasoning, spatial_reasoning, vision. Language: en. Verified by llm-stats: no.
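As an illustration of how a benchmark with a maximum score of 1 might be aggregated, the sketch below averages per-sample grades in the range 0 to 1 into an overall score and a per-capability breakdown. The record layout, the function name `aggregate_scores`, and the use of a plain mean are assumptions made for illustration; they are not taken from the llm-stats metadata or from the official MM-Vet evaluation code, and the grades themselves would come from the GPT-4 Turbo scorer, which is not modelled here.

```python
from collections import defaultdict
from statistics import mean

# The six capabilities named in the benchmark description.
CAPABILITIES = (
    "recognition",
    "knowledge",
    "spatial awareness",
    "language generation",
    "OCR",
    "math",
)


def aggregate_scores(samples):
    """Average per-sample grades into overall and per-capability scores.

    `samples` is a hypothetical list of (score, capabilities) pairs, where
    `score` is a grade in [0, 1] and `capabilities` lists the abilities the
    sample exercises. A sample may exercise several capabilities at once.
    """
    all_scores = []
    by_capability = defaultdict(list)
    for score, capabilities in samples:
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score {score} is outside [0, 1]")
        all_scores.append(score)
        for capability in capabilities:
            if capability not in CAPABILITIES:
                raise ValueError(f"unknown capability: {capability}")
            by_capability[capability].append(score)
    return {
        "overall": mean(all_scores),
        "per_capability": {c: mean(s) for c, s in by_capability.items()},
    }


if __name__ == "__main__":
    # Made-up grades, purely to show the shape of the input and output.
    demo = [
        (1.0, ["OCR", "math"]),
        (0.5, ["recognition"]),
        (0.0, ["OCR"]),
    ]
    print(aggregate_scores(demo))
```

Because a single sample can count toward several capabilities, the per-capability averages in this sketch do not need to average out to the overall score.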