TriviaQA

reasoning

TriviaQA is a large-scale reading comprehension dataset containing over 650K question-answer-evidence triples. It includes 95K question-answer pairs authored by trivia enthusiasts, together with independently gathered evidence documents (six per question on average) that provide high-quality distant supervision for answering the questions. The questions are relatively complex and compositional, with considerable syntactic and lexical variability, and finding the answers requires cross-sentence reasoning.

Leaderboard


Model                                     Score
Kimi K2 Base                              85.1%
Gemma 2 27B                               83.7%
MiMo-V2.5-Pro                             81.3%
Mistral Small 3.1 24B Base                80.5%
Mistral Small 3.1 24B Instruct            80.5%
Mistral Small 3 24B Base                  80.3%
Granite 3.3 8B Base                       78.2%
Gemma 2 9B                                76.6%
Ministral 3 (14B Base 2512)               74.9%
Mistral Large 3                           74.9%
Mistral NeMo Instruct                     73.8%
Gemma 3n E4B                              70.2%
Gemma 3n E4B Instructed LiteRT Preview    70.2%
Ministral 3 (8B Base 2512)                68.1%
Ministral 8B Instruct                     65.5%
Gemma 3n E2B                              60.8%
Gemma 3n E2B Instructed LiteRT (Preview)  60.8%
Ministral 3 (3B Base 2512)                59.2%