AIME
The American Invitational Mathematics Examination (AIME) benchmark evaluates the mathematical reasoning capabilities of large language models. It contains 30 challenging problems from the AIME 2024 competition, each requiring multi-step reasoning and advanced mathematical insight. Every answer is an integer from 000 to 999.
Methodology
Imported from llm-stats public benchmark metadata. Modality: text. Max score: 1. Categories: math, reasoning. Language: en. Verified by llm-stats: no.
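The metadata above does not say how a response is scored. Because every answer is an integer from 000 to 999 and the maximum score is 1, one plausible reading is exact-match accuracy over the 30 problems. The following is a minimal sketch under that assumption; the function names `normalize_aime_answer` and `accuracy` are hypothetical and are not part of llm-stats or any official AIME tooling.

```python
import re
from typing import Optional, Sequence


def normalize_aime_answer(text: str) -> Optional[int]:
    """Parse an AIME-style answer; return None if it is not 1-3 ASCII digits.

    Assumption: leading zeros are allowed ("073" means 73), and anything
    outside 0..999 is treated as an invalid answer.
    """
    stripped = text.strip()
    if not re.fullmatch(r"[0-9]{1,3}", stripped):
        return None
    return int(stripped)


def accuracy(predictions: Sequence[str], references: Sequence[int]) -> float:
    """Fraction of predictions that exactly match the reference integers.

    Assumption: the benchmark score is plain exact-match accuracy in [0, 1].
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("at least one problem is required")
    correct = sum(
        normalize_aime_answer(pred) == ref
        for pred, ref in zip(predictions, references)
    )
    return correct / len(references)
```

Under this sketch an unparseable response such as `"abc"` or `"1000"` simply counts as incorrect, and a model that answers all 30 problems correctly reaches the maximum score of 1.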