EQ-Bench


EQ-Bench is an LLM-judged test that evaluates active emotional intelligence abilities, understanding, insight, empathy, and interpersonal skills. The test set contains 45 challenging roleplay scenarios, most of which consist of pre-written prompts spanning three turns. The benchmark assesses each model's responses against several criteria, then runs pairwise comparisons between models and reports a normalized Elo score for each model.
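To make the pairwise-comparison step concrete, here is a minimal sketch of how ratings can be derived from judged matchups. It uses the classic Elo update as a stand-in; the description above does not specify the exact rating procedure or how the scores are normalized, so the K-factor, the starting rating, the function names, and the sample results are all assumptions chosen for illustration, and the normalization step is left out.

```python
from typing import Dict, Iterable, Tuple


def expected_score(r_a: float, r_b: float) -> float:
    """Classic Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def elo_from_pairwise(
    results: Iterable[Tuple[str, str, float]],
    k: float = 16.0,        # assumed K-factor, not taken from the benchmark
    start: float = 1200.0,  # assumed starting rating, not taken from the benchmark
) -> Dict[str, float]:
    """Apply sequential Elo updates to (model_a, model_b, score_a) results.

    score_a is 1.0 if the judge preferred model_a, 0.0 if it preferred
    model_b, and 0.5 for a tie.
    """
    ratings: Dict[str, float] = {}
    for a, b, score_a in results:
        r_a = ratings.setdefault(a, start)
        r_b = ratings.setdefault(b, start)
        e_a = expected_score(r_a, r_b)
        # The two updates are equal and opposite, so total rating is conserved.
        ratings[a] = r_a + k * (score_a - e_a)
        ratings[b] = r_b - k * (score_a - e_a)
    return ratings


if __name__ == "__main__":
    # Hypothetical judged comparisons between placeholder models.
    judged = [
        ("model_a", "model_b", 1.0),
        ("model_b", "model_c", 0.5),
        ("model_a", "model_c", 1.0),
    ]
    for name, rating in sorted(elo_from_pairwise(judged).items()):
        print(f"{name}: {rating:.1f}")
```

Sequential Elo updates depend on the order of the comparisons. A benchmark that wants order-independent ratings would more likely fit all pairwise results jointly, so this sketch shows the general idea, not the benchmark's own computation.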

Methodology

Imported from llm-stats public benchmark metadata. Modality: text. Max score: 2000. Categories: creativity, general, reasoning, roleplay, writing. Language: en. Verified by llm-stats: no.

Leaderboard

  1. Grok-4.1 Thinking: 1,586 (self-reported, llm-stats)
  2. Grok-4.1: 1,585 (self-reported, llm-stats)