Nova 2 Omni

Amazon Nova 2 Omni is Amazon's first unified multimodal reasoning model. A single model processes text, documents, images, video, and audio inputs and generates both text and images, which removes the need to coordinate multiple models. It delivers strong multimodal perception, core reasoning, agentic tool use, and high-quality image generation and editing, with configurable extended thinking.

AIME 2025: 92.1%
RefCOCOg: 86.3%
ScreenSpot: 85.4%
MMLU-Pro: 80.7%
Tau2 Telecom: 80.0%
Tau2 Retail: 78.3%
Video-MME: 77.9%
QVHighlights: 76.7%
Multi-Challenge: 75.5%
MMAU: 75.3%
Tau2 Airline: 68.8%
IFBench: 68.7%
MAVERIX: 66.6%
MMMU-Pro: 61.4%
RealKIE-FCC: 59.8%
BFCL-V4: 58.3%
OCRBench_V2: 58.2%
CoVoST2: 40.7%