VQAv2 (val)

reasoning

VQAv2 is a balanced Visual Question Answering dataset of open-ended questions about images; answering them requires vision, language, and commonsense understanding. It addresses the language bias of the original VQA dataset by pairing each question with complementary images: similar images that yield different answers to the same question. This forces models to ground their answers in the visual content rather than rely on language priors.
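To illustrate why the balanced design penalizes reliance on language priors, here is a minimal sketch. The toy examples, image ids, and the function name `question_only_accuracy` are hypothetical and not taken from the dataset or its tooling. The sketch scores a "model" that ignores the image and always returns the most frequent answer for each question.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: (question, image_id, answer). Each question appears
# with two similar images whose answers differ, mirroring the balanced design.
balanced_pairs = [
    ("Is the umbrella open?", "img_a", "yes"),
    ("Is the umbrella open?", "img_b", "no"),
    ("What color is the bus?", "img_c", "red"),
    ("What color is the bus?", "img_d", "blue"),
]


def question_only_accuracy(examples):
    """Accuracy of a baseline that ignores the image and answers each
    question with that question's most frequent answer in the data."""
    answers_by_question = defaultdict(Counter)
    for question, _image_id, answer in examples:
        answers_by_question[question][answer] += 1

    # The "language prior": one fixed answer per question text.
    prior = {q: counts.most_common(1)[0][0]
             for q, counts in answers_by_question.items()}

    correct = sum(prior[q] == answer for q, _image_id, answer in examples)
    return correct / len(examples)


accuracy = question_only_accuracy(balanced_pairs)
```

Because every question in this toy set has two images with different answers, a question-only baseline can be right on at most one image of each pair, so its accuracy cannot exceed one half here.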

Leaderboard


Gemma 3 12B: 71.6%

Gemma 3 27B: 71.0%

Gemma 3 4B: 62.4%