Gemini 3 Flash
Gemini 3 Flash offers frontier intelligence built for speed at a fraction of the cost. It combines Gemini 3's Pro-grade reasoning with Flash-level latency, efficiency, and cost. It features a 1 million-token input context window and is optimized for agentic workflows, coding, and complex analysis.
Benchmark results
| Benchmark | Score | Tags |
|---|---|---|
| AIME 2025 | 99.7% | self-reported llm-stats |
| ARC-AGI v2 | 33.6% | self-reported llm-stats |
| CharXiv-R | 80.3% | self-reported llm-stats |
| FACTS Grounding | 61.9% | self-reported llm-stats |
| Global PIQA | 92.8% | self-reported llm-stats |
| GPQA | 90.4% | self-reported llm-stats |
| Humanity's Last Exam | 43.5% | self-reported llm-stats |
| LiveCodeBench Pro | 2,316 | self-reported llm-stats |
| MCP Atlas | 57.4% | self-reported llm-stats |
| MMMLU | 91.8% | self-reported llm-stats |
| MMMU-Pro | 81.2% | self-reported llm-stats |
| MRCR v2 (8-needle) | 22.1% | self-reported llm-stats |
| OmniDocBench 1.5 | 12.1% | self-reported llm-stats |
| ScreenSpot Pro | 69.1% | self-reported llm-stats |
| SimpleQA | 68.7% | self-reported llm-stats |
| SWE-Bench Verified | 78.0% | self-reported llm-stats |
| t2-bench | 90.2% | self-reported llm-stats |
| Terminal-Bench 2.0 | 47.6% | self-reported llm-stats |
| Toolathlon | 49.4% | self-reported llm-stats |
| Vending-Bench 2 | 3,635 | self-reported llm-stats |
| VideoMMMU | 86.9% | self-reported llm-stats |