Gemini 3.5 Flash
Gemini 3.5 Flash is Google's strongest agentic and coding model in the Flash series, delivering frontier-level performance at up to 4x the speed of comparable frontier models and often at less than half the cost. Built to execute complex agentic workflows, it outperforms Gemini 3.1 Pro on coding and agentic benchmarks while leading in multimodal understanding. It supports a 1-million-token input context window and up to 64k output tokens, and it uses dynamic thinking by default.
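The context limits quoted above can be turned into a simple pre-flight check before sending a request. The following is a minimal sketch, not part of any official SDK: the names `MAX_INPUT_TOKENS`, `MAX_OUTPUT_TOKENS`, `BudgetCheck`, and `check_budget` are hypothetical, and reading "1 million" as 1,000,000 and "64k" as 64,000 is an assumption (the exact limits exposed by the API may differ). Token counts are taken as given; how they are measured is left to the caller.

```python
from dataclasses import dataclass

# Assumed limits, read literally from the description above.
# The real API limits may differ (e.g. a power-of-two output cap).
MAX_INPUT_TOKENS = 1_000_000
MAX_OUTPUT_TOKENS = 64_000


@dataclass(frozen=True)
class BudgetCheck:
    """Result of comparing a planned request against the assumed limits."""
    fits: bool
    input_headroom: int   # negative when the input is too large
    output_headroom: int  # negative when the requested output is too large


def check_budget(input_tokens: int, requested_output_tokens: int) -> BudgetCheck:
    """Report whether a request stays within the assumed token limits."""
    if input_tokens < 0 or requested_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_headroom = MAX_INPUT_TOKENS - input_tokens
    output_headroom = MAX_OUTPUT_TOKENS - requested_output_tokens
    return BudgetCheck(
        fits=input_headroom >= 0 and output_headroom >= 0,
        input_headroom=input_headroom,
        output_headroom=output_headroom,
    )


if __name__ == "__main__":
    # A 900k-token prompt with an 8k-token reply fits under these assumptions.
    print(check_budget(900_000, 8_000))
```

Keeping the limits as module-level constants makes it easy to swap in the exact values once they are confirmed against the provider's documentation.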
Benchmark results
| Benchmark | Score | Tags |
|---|---|---|
| ARC-AGI v2 | 72.1% | self-reported llm-stats |
| Blueprint-Bench 2 | 33.6% | self-reported llm-stats |
| CharXiv-R | 84.2% | self-reported llm-stats |
| Finance Agent | 57.9% | self-reported llm-stats |
| GDPval-AA | 1,656 | self-reported llm-stats |
| Humanity's Last Exam | 40.2% | self-reported llm-stats |
| MCP Atlas | 83.6% | self-reported llm-stats |
| MMMU-Pro | 83.6% | self-reported llm-stats |
| MRCR v2 (8-needle) | 26.6% | self-reported llm-stats |
| OSWorld-Verified | 78.4% | self-reported llm-stats |
| SWE-Bench Pro | 55.1% | self-reported llm-stats |
| Terminal-Bench 2.0 | 76.2% | self-reported llm-stats |
| Toolathlon | 56.5% | self-reported llm-stats |