GDPval-MM


GDPval-MM is the multimodal variant of the GDPval benchmark. It evaluates AI models on real-world, economically valuable tasks that require processing and generating multimodal content, including documents, slides, diagrams, spreadsheets, images, and other professional deliverables, across diverse industries.

Methodology

Imported from llm-stats public benchmark metadata. Modality: multimodal. Max score: 1. Categories: finance, general, multimodal, reasoning. Language: en. Verified by llm-stats: no.

Leaderboard

  1. GPT-5.5: 84.9% (self-reported, llm-stats)
  2. GPT-5.5 Pro: 82.3% (self-reported, llm-stats)
  3. MiniMax M2.5: 59.0% (self-reported, llm-stats)
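
The metadata gives a maximum score of 1, while the leaderboard shows percentages. The following is a minimal sketch of how the two scales might be reconciled, assuming the percentages are simply the 0-to-1 score multiplied by 100 (the source does not state this explicitly). The names `LEADERBOARD`, `MAX_SCORE`, `to_unit_score`, and `ranked` are hypothetical and introduced only for illustration.

```python
# Leaderboard entries copied from the listing above (all self-reported,
# not verified by llm-stats).
LEADERBOARD = [
    ("GPT-5.5", 84.9),
    ("GPT-5.5 Pro", 82.3),
    ("MiniMax M2.5", 59.0),
]

MAX_SCORE = 1.0  # "Max score: 1" from the benchmark metadata


def to_unit_score(percent: float, max_score: float = MAX_SCORE) -> float:
    """Convert a percentage to the 0..max_score scale.

    Assumption: the mapping is linear (percent / 100 * max_score).
    """
    return round(percent / 100.0 * max_score, 4)


def ranked(entries):
    """Return (model, percent) pairs sorted from highest to lowest score."""
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


if __name__ == "__main__":
    for rank, (model, pct) in enumerate(ranked(LEADERBOARD), start=1):
        print(f"{rank}. {model}: {pct:.1f}% (unit score {to_unit_score(pct):.3f})")
```

Under this assumption, a listed 84.9% corresponds to a raw score of about 0.849 out of 1.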