LiveCodeBench v5

reasoning

LiveCodeBench is a holistic, contamination-free benchmark for evaluating large language models on code. It continuously collects new problems from programming contests (LeetCode, AtCoder, CodeForces) and evaluates models in four scenarios: code generation, self-repair, code execution, and test output prediction. Each problem is annotated with its release date, so a model can be evaluated on unseen problems released after its training cutoff.
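
The release-date filtering described above can be sketched in a few lines of Python. This is a minimal illustration, not LiveCodeBench's actual code: the `Problem` record, its field names, the `unseen_problems` helper, the sample problems, and the cutoff date are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Problem:
    # Hypothetical record; field names are illustrative, not the benchmark's schema.
    title: str
    source: str  # e.g. "LeetCode", "AtCoder", "CodeForces"
    release_date: date


def unseen_problems(problems, training_cutoff):
    """Keep only problems released strictly after the model's training cutoff."""
    return [p for p in problems if p.release_date > training_cutoff]


# Made-up sample data, for illustration only.
problems = [
    Problem("two-pointer-warmup", "LeetCode", date(2024, 3, 10)),
    Problem("grid-paths", "AtCoder", date(2024, 9, 1)),
    Problem("tree-queries", "CodeForces", date(2025, 1, 15)),
]

cutoff = date(2024, 6, 30)  # assumed training cutoff for a hypothetical model
held_out = unseen_problems(problems, cutoff)
print([p.title for p in held_out])
```

Scoring a model only on the held-out subset is what lets the benchmark make a contamination claim: a problem published after the cutoff cannot have appeared in that model's training data.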

Leaderboard


Gemini 2.5 Pro: 75.6%
Gemini 2.5 Flash: 63.9%
Qwen3 VL 235B A22B Instruct: 61.4%
MiniCPM-SALA: 60.5%
Gemini 2.0 Flash-Lite: 28.9%
Gemma 3n E4B Instructed: 25.7%
Gemma 3n E4B Instructed LiteRT Preview: 25.7%
Gemma 3n E2B Instructed: 18.6%
Gemma 3n E2B Instructed LiteRT (Preview): 18.6%