CC-OCR

A comprehensive OCR benchmark for evaluating the literacy of Large Multimodal Models (LMMs). It comprises four OCR-centric tracks: multi-scene text reading, multilingual text reading, document parsing, and key information extraction. It contains 39 subsets with 7,058 fully annotated images, 41% of which are sourced from real applications. It tests capabilities including text grounding and multi-orientation text recognition, and checks for hallucination and repetition in model output across diverse visual challenges.

Methodology

Imported from llm-stats public benchmark metadata. Modality: multimodal. Max score: 1 (leaderboard scores are shown as percentages of this maximum). Categories: multimodal, structured_output, text-to-image, vision. Language: en. Verified by llm-stats: no.
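
As a rough illustration of how a score on the 0-to-1 scale implied by "Max score: 1" relates to the percentages shown in the leaderboard, here is a minimal Python sketch. The function name `to_percent` and the premise that raw scores are stored as fractions of the maximum are illustrative assumptions; the llm-stats metadata does not describe its storage format.

```python
def to_percent(raw_score: float, max_score: float = 1.0) -> str:
    """Format a raw benchmark score as a percentage of the maximum score.

    Assumption: raw scores lie between 0 and max_score (hypothetical layout,
    not confirmed by the llm-stats metadata).
    """
    if max_score <= 0:
        raise ValueError("max_score must be positive")
    return f"{raw_score / max_score * 100:.1f}%"


# Hypothetical raw value on the 0-to-1 scale.
print(to_percent(0.834))  # prints "83.4%"
```

Rounding to one decimal place matches the precision used in the leaderboard entries.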

Leaderboard

  1. Qwen3.6 Plus: 83.4% (self-reported, llm-stats)
  2. Qwen3 VL 235B A22B Instruct: 82.2% (self-reported, llm-stats)
  3. Qwen3.5-122B-A10B: 81.8% (self-reported, llm-stats)
  4. Qwen3 VL 235B A22B Thinking: 81.5% (self-reported, llm-stats)
  5. Qwen3.6-27B: 81.2% (self-reported, llm-stats)
  6. Qwen3.5-27B: 81.0% (self-reported, llm-stats)
  7. Qwen3 VL 30B A3B Instruct: 80.7% (self-reported, llm-stats)
  8. Qwen3.5-35B-A3B: 80.7% (self-reported, llm-stats)
  9. Qwen3 VL 32B Instruct: 80.3% (self-reported, llm-stats)
  10. Qwen3 VL 8B Instruct: 79.9% (self-reported, llm-stats)
  11. Qwen2.5 VL 72B Instruct: 79.8% (self-reported, llm-stats)
  12. Qwen2.5 VL 7B Instruct: 77.8% (self-reported, llm-stats)
  13. Qwen3 VL 30B A3B Thinking: 77.8% (self-reported, llm-stats)
  14. Qwen2.5 VL 32B Instruct: 77.1% (self-reported, llm-stats)
  15. Qwen3 VL 8B Thinking: 76.3% (self-reported, llm-stats)
  16. Qwen3 VL 4B Instruct: 76.2% (self-reported, llm-stats)
  17. Qwen3 VL 4B Thinking: 73.8% (self-reported, llm-stats)