ARC-C

reasoning

The AI2 Reasoning Challenge (ARC) Challenge Set, or ARC-C, is a multiple-choice question-answering benchmark of grade-school-level science questions. It contains only the questions that both a retrieval-based algorithm and a word co-occurrence algorithm answered incorrectly, which makes it the harder subset of ARC, designed to test commonsense reasoning in AI systems rather than surface-level matching.
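The selection rule described above can be illustrated with a small sketch. This is not the original ARC construction code: the record layout (`id`, `choices`, `answer`) and the two stand-in solvers are hypothetical, chosen only to show the idea of keeping a question when every baseline gets it wrong.

```python
from typing import Any, Callable, Dict, List

# Hypothetical record layout: {"id": str, "choices": List[str], "answer": int}
Question = Dict[str, Any]
# A solver returns the index of the option it picks.
Solver = Callable[[Question], int]


def select_challenge_set(questions: List[Question], baselines: List[Solver]) -> List[Question]:
    """Keep only the questions that every baseline solver answers incorrectly."""
    return [
        q for q in questions
        if all(solver(q) != q["answer"] for solver in baselines)
    ]


# Stand-in baselines (assumptions, not the retrieval or co-occurrence solvers).
def always_first(q: Question) -> int:
    return 0


def longest_choice(q: Question) -> int:
    return max(range(len(q["choices"])), key=lambda i: len(q["choices"][i]))


toy_questions = [
    {"id": "q1", "choices": ["gravity", "wind"], "answer": 0},
    {"id": "q2", "choices": ["a", "bbbb", "cc"], "answer": 1},
    {"id": "q3", "choices": ["aa", "bbbb", "c"], "answer": 2},
]

challenge = select_challenge_set(toy_questions, [always_first, longest_choice])
print([q["id"] for q in challenge])
```

In this toy data only `q3` survives, because each of the other two questions is answered correctly by at least one of the stand-in baselines.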

Leaderboard

Showing 20 of 34 results

Model                     Score
MiMo-V2.5-Pro             97.2%
Llama 3.1 405B Instruct   96.9%
Claude 3 Opus             96.4%
Nova Pro                  94.8%
Llama 3.1 70B Instruct    94.8%
Claude 3 Sonnet           93.2%
Jamba 1.5 Large           93.0%
Nova Lite                 92.4%
Mistral Small 3 24B Base  91.3%
Phi-3.5-MoE-instruct      91.0%
Nova Micro                90.2%
Claude 3 Haiku            89.2%
Jamba 1.5 Mini            85.7%
Phi-3.5-mini-instruct     84.6%
Phi 4 Mini                83.7%
Llama 3.1 8B Instruct     83.4%
Llama 3.2 3B Instruct     78.6%
Ministral 8B Instruct     71.9%
Gemma 2 27B               71.4%
Command R+                71.0%