Qasper


QASPER is a dataset of 5,049 information-seeking questions and answers anchored in 1,585 NLP research papers. Each question is written by an NLP practitioner who has read only the title and abstract of a paper and seeks information present in the full text. A separate set of NLP practitioners answers the questions from the full paper and provides supporting evidence for each answer. Because answers can depend on material spread across several sections, the dataset tests complex reasoning over long academic documents.

Methodology

Imported from llm-stats public benchmark metadata. Modality: text. Max score: 1. Categories: long_context, reasoning. Language: en. Verified by llm-stats: no.
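The metadata gives a maximum score of 1, while the leaderboard shows percentages, so a score such as 0.419 is displayed as 41.9%. This page does not say which metric produced the scores. QASPER answers are commonly scored with a token-level F1, so the sketch below assumes that metric purely for illustration. The function names `token_f1` and `as_percent` are hypothetical, and the whitespace tokenization is a simplification rather than the official evaluation script.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between two strings, in the range [0, 1].

    Assumption: lowercased whitespace tokens; official scorers usually
    apply more normalization (punctuation, articles) than this sketch.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Two empty strings count as a match; one empty string as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def as_percent(score: float) -> str:
    """Render a 0-to-1 score the way the leaderboard displays it."""
    return f"{score * 100:.1f}%"
```

For example, `token_f1("the model uses BERT", "BERT")` has precision 0.25 and recall 1.0, giving an F1 of 0.4, which `as_percent` would render as 40.0%.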

Leaderboard

  1. Phi-3.5-mini-instruct: 41.9% (self-reported, llm-stats)
  2. Phi-3.5-MoE-instruct: 40.0% (self-reported, llm-stats)