OmniDocBench

multimodal, document understanding, vision

OmniDocBench evaluates multimodal models on document understanding tasks such as OCR, layout parsing, and structured document comprehension.

Methodology

Benchmark details are imported from the public llm-stats benchmark metadata. Modality: multimodal. Maximum score: 1. Categories: document_understanding, multimodal, vision. Language: en.
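The metadata lists a maximum score of 1, while leaderboard entries are shown as percentages. A minimal sketch of that conversion follows. It assumes, and the source does not confirm, that a displayed percentage is the raw score divided by the maximum score. The helper name `to_percent` is hypothetical and is not part of llm-stats or OmniDocBench.

```python
def to_percent(score: float, max_score: float = 1.0) -> str:
    """Format a raw score as a percentage of max_score (assumed convention)."""
    if max_score <= 0:
        raise ValueError("max_score must be positive")
    if not 0 <= score <= max_score:
        raise ValueError("score must lie between 0 and max_score")
    # One decimal place, matching how the leaderboard displays values.
    return f"{100 * score / max_score:.1f}%"


if __name__ == "__main__":
    # Illustrative input only: a raw score of 0.872 on a 0-to-1 scale.
    print(to_percent(0.872))
```

Rejecting out-of-range inputs keeps a score reported on a different scale (for example 0 to 100) from being silently misrendered.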

Leaderboard

  1. 87.2%