CodeForces
A competitive programming benchmark built from problems on the CodeForces platform. It evaluates the code generation capabilities of LLMs on algorithmic problems with difficulty ratings from 800 to 2400. Problems span diverse algorithmic categories, including dynamic programming, graph algorithms, data structures, and mathematics. Evaluation is standardized through direct submission to the platform.
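As a rough illustration of the 800 to 2400 rating window, the sketch below filters a list of problem records by rating. The record fields (name, rating, tags) follow the shape of problem objects in the public Codeforces API, but the sample records, the function name select_problems, and the inclusive bounds are assumptions made for illustration. The benchmark's actual selection procedure is not described here.

```python
# Hypothetical sample records: names, ratings, and tags are made up for illustration.
SAMPLE_PROBLEMS = [
    {"name": "Sample A", "rating": 800, "tags": ["math"]},
    {"name": "Sample B", "rating": 1700, "tags": ["dp", "graphs"]},
    {"name": "Sample C", "rating": 2400, "tags": ["data structures"]},
    {"name": "Sample D", "rating": 3100, "tags": ["dp"]},
    {"name": "Sample E", "tags": ["math"]},  # unrated problem: no "rating" key
]


def select_problems(problems, min_rating=800, max_rating=2400):
    """Return problems rated within [min_rating, max_rating] (inclusive bounds assumed)."""
    selected = []
    for problem in problems:
        rating = problem.get("rating")
        # Skip unrated problems and those outside the rating window.
        if rating is not None and min_rating <= rating <= max_rating:
            selected.append(problem)
    return selected


if __name__ == "__main__":
    for p in select_problems(SAMPLE_PROBLEMS):
        print(p["name"], p["rating"], ", ".join(p["tags"]))
```

Unrated records are dropped rather than treated as rating 0, so a missing rating never slips into the selection by accident.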
Methodology
Imported from llm-stats public benchmark metadata. Modality: text. Max score: 3000. Categories: math, reasoning. Language: en. Verified by llm-stats: no.