LMArena Text Leaderboard
LMArena Text Leaderboard is a blind human-preference benchmark that ranks models by pairwise comparison in real-world conversations. Users state which response they prefer in head-to-head model battles, and the leaderboard computes Elo ratings from those preferences, giving a single score that reflects overall model capability and style as judged by users.
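As a rough illustration of how Elo ratings can be derived from pairwise preferences, the sketch below applies the standard online Elo update to a list of battles. The starting rating (1000), K-factor (32), scale (400), and the names `expected_score`, `update_elo`, and `rate_battles` are assumptions chosen for the example, not values or functions taken from the leaderboard; its actual rating computation may differ.

```python
from collections import defaultdict


def expected_score(rating_a, rating_b, scale=400.0):
    """Probability that A is preferred over B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def update_elo(ratings, model_a, model_b, outcome, k=32.0):
    """Update ratings in place after one battle.

    outcome: 1.0 if A was preferred, 0.0 if B was preferred, 0.5 for a tie.
    k is an assumed K-factor, not a value from the leaderboard.
    """
    e_a = expected_score(ratings[model_a], ratings[model_b])
    delta = k * (outcome - e_a)
    ratings[model_a] += delta
    ratings[model_b] -= delta  # zero-sum: B loses what A gains


def rate_battles(battles, initial=1000.0, k=32.0):
    """Process (model_a, model_b, outcome) tuples in order."""
    ratings = defaultdict(lambda: initial)  # assumed starting rating
    for model_a, model_b, outcome in battles:
        update_elo(ratings, model_a, model_b, outcome, k)
    return dict(ratings)


if __name__ == "__main__":
    # Hypothetical battles between placeholder model names.
    battles = [
        ("model-x", "model-y", 1.0),
        ("model-y", "model-z", 0.5),
        ("model-x", "model-z", 0.0),
    ]
    for name, rating in sorted(rate_battles(battles).items(),
                               key=lambda item: -item[1]):
        print(f"{name}: {rating:.1f}")
```

Because each update is zero-sum, the total rating across models stays constant, and an upset win over a higher-rated model moves the ratings more than an expected win does. The online update depends on battle order, which is one reason a production leaderboard might fit ratings differently.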
Methodology
Imported from llm-stats public benchmark metadata. Modality: text. Max score: 2000. Categories: general, reasoning. Language: en. Verified by llm-stats: no.