Llama 3.1 Nemotron Ultra 253B v1

A 253B-parameter derivative of Meta Llama 3.1 405B Instruct, developed by NVIDIA using Neural Architecture Search (NAS) and vertical compression. It underwent multi-phase post-training to improve reasoning and instruction following: supervised fine-tuning (SFT) for math, code, reasoning, chat, and tool calling, and reinforcement learning (RL) with GRPO. The model is optimized for the tradeoff between accuracy and efficiency on NVIDIA GPUs and supports a 128K-token context window.
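
For readers unfamiliar with GRPO (Group Relative Policy Optimization), the step that gives the method its name can be sketched in a few lines: several responses are sampled for the same prompt, and each response's reward is normalized against the mean and standard deviation of its own group. This is a minimal illustration of that one step, not NVIDIA's training code. The function name, the `eps` term, the use of the population standard deviation, and the 0/1 rewards are all assumptions made for the example.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and std of its own group.

    `rewards` holds one scalar reward per sampled response to the same prompt.
    `eps` (an assumed value) avoids division by zero when all rewards are equal.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption for this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical group: four sampled answers to one prompt,
# scored 1.0 if correct and 0.0 otherwise.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Responses that score above their group's average receive a positive advantage and the others a negative one, so no separately learned value model is needed to supply a baseline.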

Benchmark results

Benchmark       Score   Tags            Source
AIME 2025       72.5%   self-reported   llm-stats
BFCL v2         74.1%   self-reported   llm-stats
GPQA            76.0%   self-reported   llm-stats
IFEval          89.5%   self-reported   llm-stats
LiveCodeBench   66.3%   self-reported   llm-stats
MATH-500        97.0%   self-reported   llm-stats