Gemini 2.5 Pro
Gemini 2.5 Pro is a highly capable AI model from Google, designed for the agentic era. It offers enhanced reasoning, multimodal input (text, image, video, and audio), and a 1M-token context window, and it performs well on common benchmarks.
Benchmark results
| Benchmark | Score | Tags |
|---|---|---|
| Aider-Polyglot | 76.5% | self-reported llm-stats |
| Aider-Polyglot Edit | 72.7% | self-reported llm-stats |
| AIME 2024 | 92.0% | self-reported llm-stats |
| AIME 2025 | 83.0% | self-reported llm-stats |
| ARC-AGI v2 | 4.9% | self-reported llm-stats |
| Global-MMLU-Lite | 88.6% | self-reported llm-stats |
| GPQA | 83.0% | self-reported llm-stats |
| Humanity's Last Exam | 17.8% | self-reported llm-stats |
| LiveCodeBench v5 | 75.6% | self-reported llm-stats |
| MMMU | 79.6% | self-reported llm-stats |
| MRCR | 93.0% | self-reported llm-stats |
| MRCR 1M (pointwise) | 82.9% | self-reported llm-stats |
| SimpleQA | 50.8% | self-reported llm-stats |
| SWE-Bench Verified | 63.2% | self-reported llm-stats |
| Vibe-Eval | 65.6% | self-reported llm-stats |
| Video-MME | 84.8% | self-reported llm-stats |