BoolQ
reasoning official site →
BoolQ is a reading comprehension dataset for yes/no questions containing 15,942 naturally occurring examples. Each example consists of a question, passage, and boolean answer, where questions are generated in unprompted and unconstrained settings. The dataset challenges models with complex, non-factoid information requiring entailment-like inference to solve.
Methodology
Imported from llm-stats public benchmark metadata. Modality: text. Max score: 1. Categories: language, reasoning. Language: en. Verified by llm-stats: no.