Gemini 1.5 Pro
Gemini 1.5 Pro is a mid-size multimodal model optimized for a wide range of reasoning tasks. It can process large amounts of data in a single prompt: up to 2 hours of video, 19 hours of audio, codebases with 60,000 lines of code, or 2,000 pages of text.
Benchmark results
| Benchmark | Score | Tags |
|---|---|---|
| AMC_2022_23 | 46.4% | self-reported, llm-stats |
| BIG-Bench Hard | 89.2% | self-reported, llm-stats |
| DROP | 74.9% | self-reported, llm-stats |
| FLEURS (word error rate; lower is better) | 6.7% | self-reported, llm-stats |
| FunctionalMATH | 64.6% | self-reported, llm-stats |
| GPQA | 59.1% | self-reported, llm-stats |
| GSM8k | 90.8% | self-reported, llm-stats |
| HellaSwag | 93.3% | self-reported, llm-stats |
| HiddenMath | 52.0% | self-reported, llm-stats |
| HumanEval | 84.1% | self-reported, llm-stats |
| MATH | 86.5% | self-reported, llm-stats |
| MathVista | 68.1% | self-reported, llm-stats |
| MGSM | 87.5% | self-reported, llm-stats |
| MMLU | 85.9% | self-reported, llm-stats |
| MMLU-Pro | 75.8% | self-reported, llm-stats |
| MMMU | 65.9% | self-reported, llm-stats |
| MRCR | 82.6% | self-reported, llm-stats |
| Natural2Code | 85.4% | self-reported, llm-stats |
| PhysicsFinals | 63.9% | self-reported, llm-stats |
| Vibe-Eval | 53.9% | self-reported, llm-stats |
| Video-MME | 78.6% | self-reported, llm-stats |
| WMT23 | 75.1% | self-reported, llm-stats |
| XSTest | 98.8% | self-reported, llm-stats |