Nemotron 3 Super (120B A12B)
Nemotron 3 Super is a hybrid Mamba-Attention Mixture-of-Experts model with 120B total and 12B active parameters, optimized for agentic reasoning, coding, planning, tool calling, and long-context analysis. It introduces three techniques: LatentMoE, which projects tokens into a compressed latent space for expert routing and enables 4x more experts at the same inference cost; Multi-Token Prediction, which provides native speculative decoding (up to 3x faster generation); and native NVFP4 pretraining on Blackwell. The hybrid architecture interleaves Mamba-2 layers, which process sequences in linear time, with strategically placed Transformer attention layers that act as global anchors, and it supports a 1M-token context window. The model was pre-trained on 25 trillion tokens and post-trained with multi-environment RL across 21 configurations, using NeMo Gym/RL with 1.2 million rollouts. It achieves up to 5x higher throughput than the previous Nemotron Super and 2.2x higher throughput than GPT-OSS-120B while maintaining comparable accuracy.
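The LatentMoE idea described above can be illustrated with a toy layer. This is a minimal sketch under stated assumptions, not NVIDIA's implementation: the class `ToyLatentMoE`, the helper `expert_weight_count`, the two-layer ReLU experts, the choice to compute router scores in latent space, and all dimensions are hypothetical and chosen only to show the data flow (project down, route to the top-k experts, run them in latent space, project back up).

```python
import math
import random


def random_matrix(rows, cols, rng):
    """Return a rows x cols matrix (list of rows) with small random weights."""
    scale = 1.0 / math.sqrt(cols)
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix stored as a list of rows by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a short list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class ToyLatentMoE:
    """Hypothetical toy layer: route and run experts in a compressed latent space."""

    def __init__(self, d_model, d_latent, d_ff, n_experts, top_k, seed=0):
        rng = random.Random(seed)
        self.top_k = top_k
        self.down = random_matrix(d_latent, d_model, rng)      # d_model -> d_latent
        self.up = random_matrix(d_model, d_latent, rng)        # d_latent -> d_model
        self.router = random_matrix(n_experts, d_latent, rng)  # one score per expert
        # Assumption: each expert is a two-layer MLP living entirely in latent space.
        self.experts = [
            (random_matrix(d_ff, d_latent, rng), random_matrix(d_latent, d_ff, rng))
            for _ in range(n_experts)
        ]

    def forward(self, token):
        latent = matvec(self.down, token)
        scores = matvec(self.router, latent)
        # Keep the top_k highest-scoring experts and renormalize their gate weights.
        order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        chosen = order[: self.top_k]
        gates = softmax([scores[i] for i in chosen])
        mixed = [0.0] * len(latent)
        for gate, idx in zip(gates, chosen):
            w_in, w_out = self.experts[idx]
            hidden = [max(0.0, h) for h in matvec(w_in, latent)]  # ReLU
            expert_out = matvec(w_out, hidden)
            mixed = [m + gate * o for m, o in zip(mixed, expert_out)]
        # Project the gated mixture back to the model width.
        return matvec(self.up, mixed), chosen


def expert_weight_count(d_in, d_ff, n_experts):
    """Weights held by n_experts two-layer MLPs with input width d_in."""
    return n_experts * 2 * d_in * d_ff


layer = ToyLatentMoE(d_model=16, d_latent=4, d_ff=8, n_experts=8, top_k=2)
output, chosen = layer.forward([0.1] * 16)
print(len(output), chosen)
```

Under these toy assumptions, `expert_weight_count(64, 128, 8)` equals `expert_weight_count(16, 128, 32)`: shrinking the expert input width by 4x leaves room for 4x as many experts within the same expert weight budget. That is one plausible reading of the "4x more experts" claim; the description above does not give the actual derivation.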
Benchmark results
| Benchmark | Score | Tags |
|---|---|---|
| AA-LCR | 58.3% | self-reported, llm-stats |
| AIME 2025 | 90.2% | self-reported, llm-stats |
| Arena-Hard v2 | 73.9% | self-reported, llm-stats |
| Bird-SQL (dev) | 41.8% | self-reported, llm-stats |
| BrowseComp | 31.3% | self-reported, llm-stats |
| GPQA | 82.7% | self-reported, llm-stats |
| HMMT 2025 | 94.7% | self-reported, llm-stats |
| Humanity's Last Exam | 22.8% | self-reported, llm-stats |
| IFBench | 72.6% | self-reported, llm-stats |
| LiveCodeBench | 81.2% | self-reported, llm-stats |
| MMLU-Pro | 83.7% | self-reported, llm-stats |
| MMLU-ProX | 79.4% | self-reported, llm-stats |
| Multi-Challenge | 55.2% | self-reported, llm-stats |
| RULER | 91.8% | self-reported, llm-stats |
| SciCode | 42.0% | self-reported, llm-stats |
| SWE-bench Multilingual | 45.8% | self-reported, llm-stats |
| SWE-Bench Verified | 53.7% | self-reported, llm-stats |
| Tau2 Airline | 56.3% | self-reported, llm-stats |
| Tau2 Retail | 62.8% | self-reported, llm-stats |
| Tau2 Telecom | 64.4% | self-reported, llm-stats |
| Terminal-Bench | 25.8% | self-reported, llm-stats |
| Terminal-Bench 2.0 | 31.0% | self-reported, llm-stats |
| WMT24++ | 86.7% | self-reported, llm-stats |