PathMCQA

reasoning official site →

PathMMU is a massive multimodal expert-level benchmark for understanding and reasoning in pathology, containing 33,428 multimodal multi-choice questions and 24,067 images validated by seven pathologists. It evaluates Large Multimodal Models (LMMs) performance on pathology tasks, with the top-performing model GPT-4V achieving only 49.8% zero-shot performance compared to 71.8% for human pathologists.

Methodology

Imported from llm-stats public benchmark metadata. Modality: multimodal. Max score: 1. Categories: healthcare, multimodal, reasoning, vision. Language: en. Verified by llm-stats: no.

Leaderboard

  1. MedGemma 4B IT self-reported llm-stats
    69.8%