SlakeVQA


SlakeVQA is a semantically labeled, knowledge-enhanced dataset for medical visual question answering. It contains 642 radiology images (CT scans, MRI scans, and X-rays) covering five body parts, along with 14,028 bilingual English-Chinese question-answer pairs annotated by experienced physicians. The dataset provides comprehensive semantic labels and a structural medical knowledge base. Questions are either vision-only or knowledge-based; the latter require reasoning over external medical knowledge.

Methodology

Imported from llm-stats public benchmark metadata. Modality: multimodal. Max score: 1. Categories: healthcare, image_to_text, multimodal, reasoning, vision. Language: en. Verified by llm-stats: no.

Leaderboard

  1. Qwen3.5-122B-A10B: 81.6% (self-reported, via llm-stats)
  2. Qwen3.5-27B: 80.0% (self-reported, via llm-stats)
  3. Qwen3.5-35B-A3B: 78.7% (self-reported, via llm-stats)
  4. MedGemma 4B IT: 62.3% (self-reported, via llm-stats)
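
The metadata lists a maximum score of 1, while the leaderboard reports percentages. The sketch below shows one way to reconcile the two, assuming the percentages are simply the 0-to-1 score multiplied by 100 (the source does not state this explicitly). The names `LEADERBOARD`, `MAX_SCORE`, `to_fraction`, and `ranked` are hypothetical and used only for illustration.

```python
# Leaderboard entries as listed on this page: (model, self-reported score in percent).
LEADERBOARD = [
    ("Qwen3.5-122B-A10B", 81.6),
    ("Qwen3.5-27B", 80.0),
    ("Qwen3.5-35B-A3B", 78.7),
    ("MedGemma 4B IT", 62.3),
]

MAX_SCORE = 1.0  # "Max score: 1" from the benchmark metadata


def to_fraction(percent: float, max_score: float = MAX_SCORE) -> float:
    """Map a percentage onto the 0..max_score scale (assumed to be linear)."""
    return round(percent / 100.0 * max_score, 4)


def ranked(entries):
    """Return entries sorted by score, highest first."""
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


if __name__ == "__main__":
    for rank, (model, pct) in enumerate(ranked(LEADERBOARD), start=1):
        print(f"{rank}. {model}: {pct:.1f}% ({to_fraction(pct):.3f})")
```

Because every score here is self-reported and not verified by llm-stats, any comparison built on these numbers should be treated as indicative only.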