CRUXEval-Input-CoT

reasoning

CRUXEval input prediction task with chain-of-thought (CoT) prompting. Part of the CRUXEval benchmark for code reasoning, understanding, and execution. Given a Python function and an output, the model reasons step by step and then predicts an input for which the function returns that output. The benchmark consists of 800 Python functions (3-13 lines each) designed to evaluate code comprehension and reasoning capabilities.
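As a rough illustration of what an input prediction item looks like, the sketch below pairs a short function with a target output and checks candidate inputs by running the function. The function `f`, the target `expected_output`, and the helper `is_correct` are hypothetical and not taken from the benchmark. The check assumes that any input reproducing the output is accepted, which may differ in detail from the official evaluation harness.

```python
def f(text):
    # Hypothetical sample function, written for illustration only.
    # Keeps alphabetic characters and upper-cases them.
    result = []
    for ch in text:
        if ch.isalpha():
            result.append(ch.upper())
    return "".join(result)


# The output the model is shown alongside the function.
expected_output = "ABC"


def is_correct(predicted_input):
    """Return True if running f on the predicted input gives the expected output."""
    return f(predicted_input) == expected_output


# Several different inputs can satisfy the same output, so the check
# executes the function rather than comparing against one reference input.
candidates = ["abc", "a1b2c3", "abd"]
results = {candidate: is_correct(candidate) for candidate in candidates}
print(results)
```

Under this assumption, both `"abc"` and `"a1b2c3"` count as correct predictions, while `"abd"` does not, because the function is not injective and more than one input maps to the same output.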

Leaderboard


Qwen2.5-Coder 7B Instruct: 56.5%
