Llama 3.1 Nemotron Ultra 253B v1
A 253B-parameter derivative of Meta Llama 3.1 405B Instruct, developed by NVIDIA using Neural Architecture Search (NAS) and vertical compression. It underwent multi-phase post-training to improve reasoning and instruction following: supervised fine-tuning (SFT) for math, code, reasoning, chat, and tool calling, and reinforcement learning (RL) with Group Relative Policy Optimization (GRPO). Optimized for the accuracy/efficiency tradeoff on NVIDIA GPUs. Supports a 128k-token context window.
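To make the GRPO step more concrete, here is a minimal sketch of the group-relative advantage that gives the method its name: several answers are sampled for one prompt, and each answer's reward is normalized against the mean and standard deviation of its group. This is an illustration, not NVIDIA's training code. The function name, the example rewards, the `eps` value, and the choice of population standard deviation are all assumptions.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the other samples for the same prompt.

    Assumption: population standard deviation is used; implementations
    differ on this detail. `eps` avoids division by zero when every
    sample in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled answers to one prompt (1.0 = correct).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)

# A group with identical rewards carries no learning signal.
flat = group_relative_advantages([1.0, 1.0, 1.0])
```

Answers that score above their group's mean get a positive advantage and those below get a negative one, so no separate learned value model is needed to provide a baseline.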
Benchmark results
| Benchmark | Score | Tags |
|---|---|---|
| AIME 2025 | 72.5% | self-reported llm-stats |
| BFCL v2 | 74.1% | self-reported llm-stats |
| GPQA | 76.0% | self-reported llm-stats |
| IFEval | 89.5% | self-reported llm-stats |
| LiveCodeBench | 66.3% | self-reported llm-stats |
| MATH-500 | 97.0% | self-reported llm-stats |