VCR_en_easy

Visual Commonsense Reasoning (VCR) is a benchmark that tests higher-order cognition and commonsense reasoning beyond simple object recognition. Models must answer challenging questions about images and provide rationales justifying their answers. The benchmark measures the ability to infer people's actions, goals, and mental states from visual context.

Methodology

Imported from llm-stats public benchmark metadata. Modality: multimodal. Max score: 1. Categories: reasoning, vision. Language: en. Verified by llm-stats: no.
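
Because the maximum score is 1 while the leaderboard shows percentages, the displayed figure is presumably the raw score scaled by 100. A minimal sketch of that conversion follows. The helper name `to_percent` and the raw value 0.919 are illustrative assumptions, not part of the llm-stats metadata.

```python
def to_percent(score: float, max_score: float = 1.0) -> str:
    """Format a raw score as a percentage of the maximum score."""
    if max_score <= 0:
        raise ValueError("max_score must be positive")
    return f"{100 * score / max_score:.1f}%"


# 0.919 is an assumed raw score, back-computed from the listed percentage.
print(to_percent(0.919))
```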

Leaderboard

  1. Qwen2-VL-72B-Instruct: 91.9% (self-reported; source: llm-stats)