OmniMath
math official site →
A Universal Olympiad Level Mathematic Benchmark for Large Language Models containing 4,428 competition-level problems with rigorous human annotation, categorized into over 33 sub-domains and spanning more than 10 distinct difficulty levels
Methodology
Imported from llm-stats public benchmark metadata. Modality: text. Max score: 1. Categories: math, reasoning. Language: en. Verified by llm-stats: no.