GPT-5.1

GPT-5.1 is our flagship model for coding and agentic tasks, with configurable reasoning effort, including a non-reasoning setting.

Tau2 Telecom: 95.6%
AIME 2025: 94.0%
BrowseComp Long Context 128k: 90.0%
GPQA: 88.1%
MMMU: 85.4%
Tau2 Retail: 77.9%
SWE-Bench Verified: 76.3%
Tau2 Airline: 67.0%
FrontierMath: 26.7%