LiveCodeBench
LiveCodeBench is a holistic, contamination-free benchmark for evaluating large language models on code. It continuously collects new problems from programming contest platforms (LeetCode, AtCoder, Codeforces) and evaluates models in four scenarios: code generation, self-repair, code execution, and test output prediction. Each problem is annotated with its release date, so a model can be evaluated only on problems released after its training cutoff, which it cannot have seen during training.
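The release-date filtering described above can be illustrated with a minimal sketch. This is not LiveCodeBench's own code: the `Problem` record, its field names, the `unseen_problems` helper, and the sample dates are all hypothetical, chosen only to show the idea of keeping problems released after a model's training cutoff.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Problem:
    # Hypothetical record; field names are assumptions, not the benchmark's schema.
    problem_id: str
    platform: str
    release_date: date


def unseen_problems(problems, training_cutoff):
    """Keep only problems released strictly after the training cutoff."""
    return [p for p in problems if p.release_date > training_cutoff]


# Made-up sample data for illustration only.
problems = [
    Problem("a1", "LeetCode", date(2023, 5, 14)),
    Problem("b2", "AtCoder", date(2023, 9, 2)),
    Problem("c3", "Codeforces", date(2024, 1, 20)),
]

cutoff = date(2023, 8, 31)  # assumed training cutoff for some model
eval_set = unseen_problems(problems, cutoff)
print([p.problem_id for p in eval_set])  # prints ['b2', 'c3']
```

The comparison is strict, so a problem released on the cutoff date itself is excluded. This is a conservative choice for the sketch, since a model whose data runs through that day could have seen it.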
Methodology
Imported from llm-stats public benchmark metadata. Modality: text. Max score: 1. Categories: code, general, reasoning. Language: en. Verified by llm-stats: no.