LiveBench

math

LiveBench is a challenging LLM benchmark that limits test set contamination by releasing new questions monthly, drawn from recently released datasets, arXiv papers, news articles, and IMDb movie synopses. It comprises tasks across math, coding, reasoning, language, instruction following, and data analysis, each with a verifiable, objective ground-truth answer.

Leaderboard

Model                     Score
o3-mini                   84.6%
Qwen3 235B A22B           77.1%
Kimi K2 Instruct          76.4%
Kimi K2-Instruct-0905     76.4%
Qwen3 32B                 74.9%
Qwen3 30B A3B             74.3%
QwQ-32B                   73.1%
o1                        67.0%
o1-preview                52.3%
Qwen2.5 72B Instruct      52.3%
Phi 4                     47.6%
Qwen2.5 7B Instruct       35.9%
Qwen2.5-Omni-7B           29.6%