Gemini 3.1 Pro
Gemini 3.1 Pro is the latest model in the Gemini 3 series. It excels at complex tasks requiring broad world knowledge and advanced reasoning across modalities. Gemini 3.1 Pro uses dynamic thinking by default to reason through prompts, and supports a 1 million-token input context window and up to 64k output tokens.
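As a rough illustration of what the context limits mean in practice, the sketch below checks whether a request fits within the stated input and output budgets. It is not part of any official SDK: `TokenBudget` and `clamp_output_tokens` are hypothetical helpers, and the exact limits are assumptions ("1 million" is read as 1,000,000 tokens and "64k" as 64,000 tokens). Token counts would have to come from a real tokenizer or token-counting endpoint.

```python
from dataclasses import dataclass

# Limits taken from the description above. The exact values are assumptions:
# "1 million" is read as 1_000_000 and "64k" as 64_000.
MAX_INPUT_TOKENS = 1_000_000
MAX_OUTPUT_TOKENS = 64_000


@dataclass(frozen=True)
class TokenBudget:
    """Hypothetical helper describing one request's token usage."""

    input_tokens: int
    requested_output_tokens: int

    def fits(self) -> bool:
        """Return True if both counts are within the assumed limits."""
        return (
            0 <= self.input_tokens <= MAX_INPUT_TOKENS
            and 0 < self.requested_output_tokens <= MAX_OUTPUT_TOKENS
        )


def clamp_output_tokens(requested: int) -> int:
    """Clamp a requested output length into the range [1, MAX_OUTPUT_TOKENS]."""
    return max(1, min(requested, MAX_OUTPUT_TOKENS))


if __name__ == "__main__":
    budget = TokenBudget(input_tokens=900_000, requested_output_tokens=8_000)
    print(budget.fits())
    print(clamp_output_tokens(100_000))
```

Keeping the limits as named constants makes it easy to adjust them if the documented values differ from the assumptions made here.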
Benchmark results
| Benchmark | Score | Tags |
|---|---|---|
| APEX-Agents | 33.5% | self-reported, llm-stats |
| ARC-AGI v2 | 77.1% | self-reported, llm-stats |
| BrowseComp | 85.9% | self-reported, llm-stats |
| GDPval-AA | 1,317 (Elo) | self-reported, llm-stats |
| GPQA | 94.3% | self-reported, llm-stats |
| Humanity's Last Exam | 51.4% | self-reported, llm-stats |
| LiveCodeBench Pro | 2,887 (Elo) | self-reported, llm-stats |
| MCP Atlas | 69.2% | self-reported, llm-stats |
| MMMLU | 92.6% | self-reported, llm-stats |
| MMMU-Pro | 80.5% | self-reported, llm-stats |
| MRCR v2 (8-needle) | 26.3% | self-reported, llm-stats |
| SciCode | 59.0% | self-reported, llm-stats |
| SWE-Bench Pro | 54.2% | self-reported, llm-stats |
| SWE-Bench Verified | 80.6% | self-reported, llm-stats |
| t2-bench | 99.3% | self-reported, llm-stats |
| Terminal-Bench 2.0 | 68.5% | self-reported, llm-stats |