Llama 3.1 Nemotron Ultra 253B v1

A 253B-parameter derivative of Meta Llama 3.1 405B Instruct, developed by NVIDIA using Neural Architecture Search (NAS) and vertical compression. To enhance reasoning and instruction following, it underwent multi-phase post-training: supervised fine-tuning (SFT) for math, code, reasoning, chat, and tool calling, and reinforcement learning (RL) with GRPO.

MATH-500: 97.0%
IFEval: 89.5%
GPQA: 76.0%
BFCL v2: 74.1%
AIME 2025: 72.5%
LiveCodeBench: 66.3%