GDPval-Rubrics

reasoning

GDPval-Rubrics evaluates AI model performance on economically valuable knowledge work tasks drawn from the public GDPval dataset. It uses pointwise scoring based on public rubrics, with the environment aligned to the GDPval-AA scaffolding.
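As a rough illustration of what pointwise rubric scoring can look like, the sketch below grades one task output against a list of weighted criteria and reports the fraction of points earned. The `Criterion` class, the `pointwise_score` function, the point values, and the normalization are all assumptions made for illustration; the actual GDPval rubric format and the benchmark's aggregation method are not described here and may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Criterion:
    """One rubric item (hypothetical structure, not the official schema)."""
    description: str
    points: float  # weight assigned to this criterion
    met: bool      # grader's judgment for this single criterion


def pointwise_score(criteria: List[Criterion]) -> float:
    """Return earned points divided by total available points.

    Each criterion is judged independently (pointwise), so the score does
    not depend on comparing one output against another.
    """
    total = sum(c.points for c in criteria)
    if total <= 0:
        raise ValueError("rubric must carry a positive number of points")
    earned = sum(c.points for c in criteria if c.met)
    return earned / total


# Made-up rubric for a single task, for illustration only.
rubric = [
    Criterion("States the correct final figure", 2.0, True),
    Criterion("Cites the source document", 1.0, False),
    Criterion("Uses the requested file format", 1.0, True),
]
print(f"{pointwise_score(rubric):.1%}")  # prints 75.0%
```

Under this assumption, a leaderboard percentage would be an average of such per-task scores, though the exact averaging used by the benchmark is not stated on this page.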

Leaderboard

MiniMax M3: 74.8%
