WebVoyager

vision

WebVoyager evaluates an agent's ability to navigate and complete tasks on real websites by perceiving page screenshots and executing browser actions.
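The perceive-then-act cycle described above can be sketched as a simple loop. This is a minimal illustration and not the benchmark's actual harness: `FakeBrowser`, `Action`, `run_episode`, and `scripted_policy` are hypothetical names, and the fake browser only records actions instead of driving a real site.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Action:
    kind: str            # hypothetical action types: "click", "type", "answer"
    argument: str = ""


@dataclass
class FakeBrowser:
    """Hypothetical stand-in for a real browser; it records actions only."""
    history: List[Action] = field(default_factory=list)

    def screenshot(self) -> bytes:
        # A real harness would return image bytes of the rendered page.
        return f"page-after-{len(self.history)}-actions".encode()

    def execute(self, action: Action) -> None:
        self.history.append(action)


Policy = Callable[[str, bytes], Action]


def run_episode(task: str, browser: FakeBrowser, policy: Policy,
                max_steps: int = 10) -> Optional[str]:
    """Repeat screenshot -> policy -> action until the policy answers
    or the step budget is exhausted (in which case None is returned)."""
    for _ in range(max_steps):
        observation = browser.screenshot()
        action = policy(task, observation)
        if action.kind == "answer":
            return action.argument
        browser.execute(action)
    return None


def scripted_policy(task: str, observation: bytes) -> Action:
    # Toy policy for illustration: click twice, then give a final answer.
    if observation.startswith(b"page-after-2"):
        return Action("answer", "done")
    return Action("click", "next")
```

In a real setup the policy would be a multimodal model that receives the task text and the screenshot, and the browser would be driven through an automation library; both are outside the scope of this sketch.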

Methodology

Imported from llm-stats public benchmark metadata. Modality: multimodal. Max score: 1. Categories: agents, vision. Language: en. Verified by llm-stats: no.
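The metadata gives a maximum score of 1, while the leaderboard reports percentages. The sketch below shows one way the two could relate, assuming (this is not stated in the source) that the score is the fraction of tasks judged successful. The function names `success_rate` and `as_percent` are hypothetical.

```python
from typing import Sequence


def success_rate(outcomes: Sequence[bool]) -> float:
    """Fraction of tasks marked successful, in the range [0, 1]."""
    if not outcomes:
        raise ValueError("at least one task outcome is required")
    return sum(1 for ok in outcomes if ok) / len(outcomes)


def as_percent(score: float, max_score: float = 1.0) -> str:
    """Format a score on a [0, max_score] scale as a percentage string."""
    return f"{100 * score / max_score:.1f}%"
```

For example, `as_percent(success_rate([True, True, False, True]))` formats three successes out of four tasks as `75.0%`.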

Leaderboard

  1. GLM-5V-Turbo: 88.5% (self-reported; source: llm-stats)