LiveBench 20241125
LiveBench is a challenging LLM benchmark designed to limit test-set contamination: it releases new questions monthly, drawn from recently released datasets, arXiv papers, news articles, and IMDb movie synopses. It comprises tasks across math, coding, reasoning, language, instruction following, and data analysis, each with a verifiable, objective ground-truth answer.
Methodology
Imported from llm-stats public benchmark metadata. Modality: text. Max score: 1. Categories: general, math, reasoning. Language: en. Verified by llm-stats: no.
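To illustrate how objective ground-truth answers can produce a score with a maximum of 1, here is a minimal sketch. It assumes simple exact-match grading after whitespace and case normalization, averaged over questions. This is an illustrative assumption, not LiveBench's actual grading code, and the function names (`normalize`, `score_question`, `benchmark_score`) are hypothetical.

```python
from statistics import mean


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(answer.strip().lower().split())


def score_question(prediction: str, ground_truth: str) -> float:
    """Return 1.0 for a match with the ground truth, else 0.0 (assumed exact-match grading)."""
    return 1.0 if normalize(prediction) == normalize(ground_truth) else 0.0


def benchmark_score(predictions: list[str], ground_truths: list[str]) -> float:
    """Average per-question scores, giving a value between 0 and 1."""
    if len(predictions) != len(ground_truths):
        raise ValueError("predictions and ground_truths must have the same length")
    if not predictions:
        raise ValueError("at least one question is required")
    return mean(score_question(p, g) for p, g in zip(predictions, ground_truths))
```

Under these assumptions, a model that answers every question correctly scores 1, matching the listed maximum, and a model that answers half correctly scores 0.5.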