ImageMining

multimodal

ImageMining evaluates how well multimodal models extract structured information from images with the help of tools. It measures their ability to combine visual understanding with tool-based retrieval and analysis.

Methodology

Imported from llm-stats public benchmark metadata. Modality: multimodal. Max score: 1. Categories: agents, multimodal, vision. Language: en. Verified by llm-stats: no.

Leaderboard

  1. GLM-5V-Turbo (self-reported, llm-stats): 30.7%
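
The max score of 1 listed under Methodology suggests that raw scores lie on a 0-to-1 scale, while the leaderboard shows percentages. The sketch below illustrates that conversion. It assumes the 30.7% entry corresponds to a raw score of 0.307. The function name `to_percent` is hypothetical and not part of llm-stats.

```python
def to_percent(score: float, max_score: float = 1.0) -> str:
    """Format a raw score as a percentage of max_score, one decimal place.

    Assumption: raw scores are on a 0..max_score scale, as the
    "Max score: 1" metadata field suggests.
    """
    if max_score <= 0:
        raise ValueError("max_score must be positive")
    return f"{score / max_score * 100:.1f}%"


# Assumed raw score behind the leaderboard entry shown above.
print(to_percent(0.307))
```

Passing `max_score` explicitly keeps the helper usable for benchmarks whose metadata lists a different maximum.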