Gemini 2.5 Flash
Gemini 2.5 Flash is a thinking model designed to balance price and performance. It builds on Gemini 2.0 Flash with upgraded reasoning, hybrid thinking control, multimodal input (text, image, video, audio), and a 1M-token input context window.
Benchmark results
| Benchmark | Score | Tags |
|---|---|---|
| Aider-Polyglot | 61.9% | self-reported, llm-stats |
| Aider-Polyglot Edit | 56.7% | self-reported, llm-stats |
| AIME 2024 | 88.0% | self-reported, llm-stats |
| AIME 2025 | 72.0% | self-reported, llm-stats |
| FACTS Grounding | 85.3% | self-reported, llm-stats |
| Global-MMLU-Lite | 88.4% | self-reported, llm-stats |
| GPQA | 82.8% | self-reported, llm-stats |
| Humanity's Last Exam | 11.0% | self-reported, llm-stats |
| LiveCodeBench v5 | 63.9% | self-reported, llm-stats |
| MMMU | 79.7% | self-reported, llm-stats |
| MRCR | 32.0% | self-reported, llm-stats |
| SimpleQA | 26.9% | self-reported, llm-stats |
| SWE-Bench Verified | 60.4% | self-reported, llm-stats |
| Vibe-Eval | 65.4% | self-reported, llm-stats |