Humanity's Last Exam

math

Humanity's Last Exam (HLE) is a multi-modal academic benchmark of 2,500 questions spanning mathematics, the humanities, and the natural sciences. It is designed to test LLM capabilities at the frontier of human knowledge, and every question has an unambiguous, verifiable solution.

Leaderboard

Showing 20 of 82 results

Claude Mythos Preview

64.7%

Claude Fable 5

64.5%

Muse Spark

58.4%

Claude Opus 4.8

57.9%

GPT-5.5 Pro

57.2%

Claude Opus 4.7

54.7%

Claude Opus 4.6

53.1%

GLM-5.1

52.3%

GPT-5.5

52.2%

Gemini 3.1 Pro

51.4%

Kimi K2-Thinking-0905

51.0%

Grok-4 Heavy

50.7%

Kimi K2.5

50.2%

Claude Sonnet 4.6

49.0%

Qwen3.5-27B

48.5%

DeepSeek-V4-Pro-Max

48.2%

Qwen3.5-122B-A10B

47.5%

Qwen3.5-35B-A3B

47.4%

Gemini 3 Pro

45.8%

DeepSeek-V4-Flash-Max

45.1%
