WildClawBench
WildClawBench is an agentic coding benchmark from InternLM/Claw-Eval that reports overall model performance on real-world tool-using development tasks.
Methodology
Imported from llm-stats public benchmark metadata. Modality: text. Max score: 1. Categories: agents, coding. Language: en.
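The metadata fields above can be sketched as a small record, which makes the score scale explicit for anyone comparing results across benchmarks. This is only an illustrative sketch: the class name `BenchmarkMetadata`, its field names, and the `as_percent` helper are hypothetical and do not reflect the actual llm-stats schema or API. It also assumes scores run from 0 up to the listed max score of 1, which the listing does not state.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkMetadata:
    """Hypothetical container for the listing fields (not the llm-stats schema)."""

    name: str
    modality: str
    max_score: float
    categories: tuple[str, ...]
    language: str

    def as_percent(self, score: float) -> float:
        """Convert a raw score to a percentage of max_score.

        Assumes the valid range is [0, max_score]; the lower bound of 0
        is an assumption, since the listing only gives the maximum.
        """
        if not 0 <= score <= self.max_score:
            raise ValueError(f"score {score} is outside [0, {self.max_score}]")
        return 100.0 * score / self.max_score


# Values copied from the listing above.
WILDCLAWBENCH = BenchmarkMetadata(
    name="WildClawBench",
    modality="text",
    max_score=1.0,
    categories=("agents", "coding"),
    language="en",
)
```

With a record like this, a reported score can be turned into a percentage by calling `WILDCLAWBENCH.as_percent(score)`, and out-of-range values are rejected rather than silently rescaled.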