SWE-Dev

SWE-Dev is the SWE-bench development split, consisting of 225 software engineering problems drawn from real GitHub issues across 12 popular Python repositories. A language model is given a codebase and a description of an issue, and must edit the codebase to resolve it. Doing so often requires understanding and coordinating changes across multiple functions, classes, and files.

Methodology

Imported from llm-stats public benchmark metadata. Modality: text. Max score: 1. Categories: frontend_development. Language: en. Verified by llm-stats: no.
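
The metadata gives a maximum score of 1 but does not say how the score is computed. As a rough sketch only, assuming the score is the fraction of problems a model resolves, it could be tallied as below. The `TaskResult` type, its fields, and the sample entries are hypothetical and are not taken from llm-stats or SWE-bench.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    """Outcome for one benchmark problem (hypothetical structure)."""
    instance_id: str  # hypothetical identifier for the problem
    resolved: bool    # True if the model's edit addressed the issue


def resolved_fraction(results: list[TaskResult]) -> float:
    """Return the share of resolved problems, a value between 0 and 1."""
    if not results:
        return 0.0
    # bool values sum as 0 or 1, so this counts the resolved problems.
    return sum(r.resolved for r in results) / len(results)


# Illustrative data only, not real benchmark results.
example = [
    TaskResult("example-1", True),
    TaskResult("example-2", False),
    TaskResult("example-3", True),
    TaskResult("example-4", False),
]
print(resolved_fraction(example))  # 2 of 4 resolved
```

Under this assumption, a model that resolved all 225 problems would score 1, matching the stated maximum.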

Leaderboard

No results yet.