Humanity's Last Exam
Humanity's Last Exam (HLE) is a multimodal academic benchmark of 2,500 questions spanning mathematics, the humanities, and the natural sciences. It is designed to test LLM capabilities at the frontier of human knowledge, using questions with unambiguous, verifiable solutions.
Methodology
Imported from llm-stats public benchmark metadata. Modality: multimodal. Max score: 1. Categories: math, reasoning, vision. Language: en. Verified by llm-stats: no.
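As a rough illustration, the metadata fields above could be held in a small typed record. The sketch below is not llm-stats code: the class name `BenchmarkMetadata`, its field names, and the helper `normalized_score` are hypothetical. It also assumes that a score is the fraction of the 2,500 questions answered correctly, scaled to the stated maximum of 1; the source does not spell out the scoring rule.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class BenchmarkMetadata:
    """Hypothetical container for the metadata fields listed above."""
    name: str
    modality: str
    max_score: float
    categories: Tuple[str, ...]
    language: str
    verified: bool
    num_questions: int


# Values copied from the listing; the field names themselves are assumptions.
HLE = BenchmarkMetadata(
    name="Humanity's Last Exam",
    modality="multimodal",
    max_score=1.0,
    categories=("math", "reasoning", "vision"),
    language="en",
    verified=False,
    num_questions=2500,
)


def normalized_score(correct: int, meta: BenchmarkMetadata) -> float:
    """Scale a count of correct answers to the range [0, max_score].

    Assumes plain accuracy scoring, which the source does not confirm.
    """
    if not 0 <= correct <= meta.num_questions:
        raise ValueError("correct must be between 0 and num_questions")
    return meta.max_score * correct / meta.num_questions
```

Under these assumptions, `normalized_score(500, HLE)` gives one fifth of the maximum score, since 500 of 2,500 questions is 20%.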