SQuALITY


SQuALITY (Summarization-format QUestion Answering with Long Input Texts, Yes!) is a long-document summarization dataset. It was built by hiring highly qualified contractors to read public-domain short stories (3,000 to 6,000 words each) and write original summaries from scratch. Each document has five summaries: one overview and four question-focused summaries. The dataset is designed to address limitations of existing summarization datasets by providing high-quality, faithful summaries.
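The per-document structure described above (one overview plus four question-focused summaries for a story of roughly 3,000 to 6,000 words) can be pictured as a small record type. This is only a sketch: the class name, field names, and the whitespace-based word count are assumptions made for illustration, not the dataset's actual schema or tokenization.

```python
from dataclasses import dataclass


@dataclass
class SqualityDocument:
    """Hypothetical record for one story; field names are illustrative only."""

    story_text: str
    overview_summary: str
    # Maps each question to its question-focused summary.
    question_summaries: dict[str, str]

    def summaries(self) -> list[str]:
        """Return all summaries for the document, overview first."""
        return [self.overview_summary, *self.question_summaries.values()]

    def problems(self) -> list[str]:
        """List ways this record departs from the shape described in the text."""
        issues = []
        # Assumption: words are counted by splitting on whitespace.
        n_words = len(self.story_text.split())
        if not 3000 <= n_words <= 6000:
            issues.append(f"story has {n_words} words, expected 3000-6000")
        if len(self.question_summaries) != 4:
            issues.append(
                f"found {len(self.question_summaries)} question-focused "
                "summaries, expected 4"
            )
        return issues


# Placeholder content, not real dataset text.
example = SqualityDocument(
    story_text="word " * 3500,
    overview_summary="An overview of the story.",
    question_summaries={f"Question {i}?": f"Summary {i}." for i in range(1, 5)},
)
print(len(example.summaries()), example.problems())
```

A record that passes the check yields five summaries in total, matching the description.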

Methodology

Imported from llm-stats public benchmark metadata. Modality: text. Max score: 1. Categories: language, long_context, summarization. Language: en. Verified by llm-stats: no.

Leaderboard

  1. Phi-3.5-mini-instruct: 24.3% (self-reported, llm-stats)
  2. Phi-3.5-MoE-instruct: 24.1% (self-reported, llm-stats)
  3. Nova Pro: 19.8% (self-reported, llm-stats)
  4. Nova Lite: 19.2% (self-reported, llm-stats)
  5. Nova Micro: 18.8% (self-reported, llm-stats)
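
The leaderboard reports percentages, while the methodology metadata lists a max score of 1. A minimal sketch of how the two might relate, assuming each percentage is simply the fraction of the max score multiplied by 100 (the page does not state this, nor which metric underlies the scores). The names `LEADERBOARD`, `to_fraction`, and `is_ranked_descending` are hypothetical.

```python
MAX_SCORE = 1.0  # "Max score: 1" from the methodology metadata

# (model, reported percentage) pairs copied from the leaderboard.
LEADERBOARD = [
    ("Phi-3.5-mini-instruct", 24.3),
    ("Phi-3.5-MoE-instruct", 24.1),
    ("Nova Pro", 19.8),
    ("Nova Lite", 19.2),
    ("Nova Micro", 18.8),
]


def to_fraction(percent: float, max_score: float = MAX_SCORE) -> float:
    """Convert a percentage to the 0..max_score scale (assumed linear)."""
    return percent / 100.0 * max_score


def is_ranked_descending(rows: list[tuple[str, float]]) -> bool:
    """Check that the rows are ordered from highest to lowest score."""
    scores = [score for _, score in rows]
    return all(a >= b for a, b in zip(scores, scores[1:]))


for rank, (model, pct) in enumerate(LEADERBOARD, start=1):
    print(f"{rank}. {model}: {pct:.1f}% -> {to_fraction(pct):.3f} of {MAX_SCORE:g}")
```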