Gemini 2.5 Flash

Gemini 2.5 Flash is a thinking model designed to balance price and performance. It builds on Gemini 2.0 Flash with upgraded reasoning, hybrid thinking control, multimodal input (text, image, video, and audio), and a 1M-token input context window.

Benchmark results

Benchmark             Score  Tags           Source
Aider-Polyglot        61.9%  self-reported  llm-stats
Aider-Polyglot Edit   56.7%  self-reported  llm-stats
AIME 2024             88.0%  self-reported  llm-stats
AIME 2025             72.0%  self-reported  llm-stats
FACTS Grounding       85.3%  self-reported  llm-stats
Global-MMLU-Lite      88.4%  self-reported  llm-stats
GPQA                  82.8%  self-reported  llm-stats
Humanity's Last Exam  11.0%  self-reported  llm-stats
LiveCodeBench v5      63.9%  self-reported  llm-stats
MMMU                  79.7%  self-reported  llm-stats
MRCR                  32.0%  self-reported  llm-stats
SimpleQA              26.9%  self-reported  llm-stats
SWE-Bench Verified    60.4%  self-reported  llm-stats
Vibe-Eval             65.4%  self-reported  llm-stats