NL2Repo

coding

NL2Repo evaluates long-horizon coding capabilities, including repository-level understanding: models must generate or modify code across entire repositories from natural-language specifications.

Leaderboard

Qwen3.7 Max

47.2%

GLM-5.1

42.7%

MiniMax M3

42.1%

MiniMax M2.7

39.8%

Qwen3.6 Plus

37.9%

Qwen3.6-27B

36.2%

Qwen3.6-35B-A3B

29.4%
