GPT-4.1 nano

GPT-4.1 nano is OpenAI's fastest and cheapest model available in the GPT-4.1 family. It delivers exceptional performance at a small size with its 1 million token context window. Ideal for tasks like classification or autocompletion.

Benchmark results

Benchmark Score Tags Source
Aider-Polyglot 9.8% self-reported llm-stats link →
Aider-Polyglot Edit 6.2% self-reported llm-stats link →
AIME 2024 29.4% self-reported llm-stats link →
CharXiv-D 73.9% self-reported llm-stats link →
CharXiv-R 40.5% self-reported llm-stats link →
COLLIE 42.5% self-reported llm-stats link →
ComplexFuncBench 5.7% self-reported llm-stats link →
GPQA 50.3% self-reported llm-stats link →
Graphwalks BFS <128k 25.0% self-reported llm-stats link →
Graphwalks BFS >128k 2.9% self-reported llm-stats link →
Graphwalks parents <128k 9.4% self-reported llm-stats link →
Graphwalks parents >128k 5.6% self-reported llm-stats link →
IFEval 74.5% self-reported llm-stats link →
Internal API instruction following (hard) 31.6% self-reported llm-stats link →
MathVista 56.2% self-reported llm-stats link →
MMLU 80.1% self-reported llm-stats link →
MMMLU 66.9% self-reported llm-stats link →
MMMU 55.4% self-reported llm-stats link →
Multi-Challenge 15.0% self-reported llm-stats link →
Multi-IF 57.2% self-reported llm-stats link →
MultiChallenge (o3-mini grader) 31.1% self-reported llm-stats link →
OpenAI-MRCR: 2 needle 128k 36.6% self-reported llm-stats link →
OpenAI-MRCR: 2 needle 1M 12.0% self-reported llm-stats link →
TAU-bench Airline 14.0% self-reported llm-stats link →
TAU-bench Retail 22.6% self-reported llm-stats link →