o1-preview

A research preview model focused on mathematical and logical reasoning. It shows improved performance on tasks requiring step-by-step reasoning, mathematical problem-solving, and code generation, with stronger formal reasoning alongside strong general capabilities.

MGSM: 90.8%
MMLU: 90.8%
MATH: 85.5%
GPQA: 73.3%
LiveBench: 52.3%
SimpleQA: 42.4%
AIME 2024: 42.0%
SWE-Bench Verified: 41.3%