PointGrounding
PointArena is a platform for evaluating multimodal pointing across diverse reasoning scenarios. It includes Point-Bench, a curated dataset of roughly 1,000 pointing tasks in five categories: Spatial (positional references), Affordance (functional part identification), Counting (attribute-based grouping), Steerable (relative pointing), and Reasoning (open-ended visual inference). The benchmark measures how well vision-language models perform language-guided pointing.
Methodology
Imported from llm-stats public benchmark metadata. Modality: multimodal. Max score: 1. Categories: grounding, multimodal, spatial_reasoning, vision. Language: en. Verified by llm-stats: no.
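The metadata above gives a maximum score of 1 but does not say how a score is computed. The sketch below shows one way a score in the range 0 to 1 could arise, assuming (this is not stated in the metadata) that a task counts as solved when the predicted point falls inside an annotated target region, and that the score is the fraction of solved tasks. The task dictionary layout, the `point_hits` and `score` helpers, and the single-point-per-task simplification are all hypothetical and used only for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Point = Tuple[int, int]  # (x, y) pixel coordinates


def point_hits(pred: Point, target_pixels: Set[Point]) -> bool:
    """Assumed success rule: the predicted point lies inside the target region."""
    return pred in target_pixels


def score(tasks: List[dict]) -> Dict[str, float]:
    """Return per-category and overall accuracy, each between 0 and 1."""
    if not tasks:
        return {"overall": 0.0}
    hits: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for task in tasks:
        category = task["category"]
        totals[category] += 1
        if point_hits(task["pred"], task["target"]):
            hits[category] += 1
    result = {cat: hits[cat] / totals[cat] for cat in totals}
    result["overall"] = sum(hits.values()) / sum(totals.values())
    return result


def rect(x0: int, y0: int, x1: int, y1: int) -> Set[Point]:
    """Toy target region: all pixels with x0 <= x < x1 and y0 <= y < y1."""
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}


# Hypothetical toy tasks, not drawn from the real dataset.
tasks = [
    {"category": "Spatial", "pred": (5, 5), "target": rect(0, 0, 10, 10)},   # inside
    {"category": "Spatial", "pred": (20, 5), "target": rect(0, 0, 10, 10)},  # outside
    {"category": "Counting", "pred": (3, 3), "target": rect(2, 2, 4, 4)},    # inside
]
scores = score(tasks)
```

Under this assumed rule, a perfect model scores 1 and a model that never hits a target scores 0, which is consistent with the stated maximum. Counting tasks in practice would likely need several points per task; the sketch uses one point per task to stay short.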