ScienceQA Visual

reasoning official site →

ScienceQA Visual is a multimodal science question answering benchmark consisting of 21,208 multiple-choice questions from elementary and high school science curricula. The dataset covers 3 subjects (natural science, language science, social science), 26 topics, 127 categories, and 379 skills. 48.7% of questions include image context requiring multimodal reasoning. Questions are annotated with lectures (83.9%) and explanations (90.5%) to support chain-of-thought reasoning for science question answering.

Methodology

Imported from llm-stats public benchmark metadata. Modality: multimodal. Max score: 1. Categories: multimodal, reasoning, vision. Language: en. Verified by llm-stats: no.

Leaderboard

  1. Phi-4-multimodal-instruct self-reported llm-stats
    97.5%