GPT OSS 120B
GPT-OSS-120B is an open-weight Mixture-of-Experts (MoE) language model from OpenAI designed for reasoning-heavy, agentic, and general-purpose production use cases. Although it is called "120B" for simplicity, it has 116.8B total parameters, of which 5.1B are active per forward pass. It is optimized to run on a single H100 GPU using native MXFP4 quantization. The model supports configurable reasoning depth, full chain-of-thought access, and native tool use, including function calling, browsing, and structured output generation. It achieves near-parity with OpenAI o4-mini on core reasoning benchmarks.
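The parameter counts and the single-GPU claim can be sanity-checked with a little arithmetic. The sketch below is a rough estimate, not an official figure. It assumes that every weight is stored in MXFP4 (4-bit elements plus one shared 8-bit scale per block of 32 values, as in the OCP Microscaling format) and that the H100 in question is the 80 GB variant. Real checkpoints typically keep some tensors in higher precision, and inference also needs memory for activations and the KV cache, so actual usage will differ. The function names are illustrative only.

```python
# Back-of-the-envelope numbers for GPT-OSS-120B, using figures from the description.
TOTAL_PARAMS = 116.8e9   # total parameters
ACTIVE_PARAMS = 5.1e9    # parameters active per forward pass

# MXFP4 layout (assumption: applied uniformly to all weights for this estimate).
MXFP4_ELEMENT_BITS = 4   # each value is a 4-bit float
MXFP4_SCALE_BITS = 8     # one shared 8-bit scale per block
MXFP4_BLOCK_SIZE = 32    # values per block

H100_MEMORY_GB = 80      # assumption: 80 GB H100 variant


def active_fraction(total: float = TOTAL_PARAMS, active: float = ACTIVE_PARAMS) -> float:
    """Share of the model's parameters used in one forward pass."""
    return active / total


def mxfp4_bits_per_param() -> float:
    """Effective storage cost per value, including the amortized block scale."""
    return MXFP4_ELEMENT_BITS + MXFP4_SCALE_BITS / MXFP4_BLOCK_SIZE


def weight_gigabytes(params: float, bits_per_param: float) -> float:
    """Weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    bits = mxfp4_bits_per_param()
    size_gb = weight_gigabytes(TOTAL_PARAMS, bits)
    print(f"Active fraction:    {active_fraction():.1%}")
    print(f"Bits per parameter: {bits}")
    print(f"Estimated weights:  {size_gb:.2f} GB")
    print(f"Fits in {H100_MEMORY_GB} GB:      {size_gb < H100_MEMORY_GB}")
```

Under these assumptions, only about 4.4% of the parameters are active per forward pass, and the weights come to roughly 62 GB, which leaves some headroom on an 80 GB card.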
Benchmark results
| Benchmark | Score | Tags | Source |
|---|---|---|---|
| CodeForces | 82.1% | self-reported llm-stats | link → |
| GPQA | 80.1% | self-reported llm-stats | link → |
| HealthBench | 57.6% | self-reported llm-stats | link → |
| HealthBench Hard | 30.0% | self-reported llm-stats | link → |
| Humanity's Last Exam | 14.9% | self-reported llm-stats | link → |
| MMLU | 90.0% | self-reported llm-stats | link → |
| TAU-bench Retail | 67.8% | self-reported llm-stats | link → |