Gemini 1.5 Pro

Gemini 1.5 Pro is a mid-size multimodal model optimized for a wide range of reasoning tasks. It can process large amounts of data in a single prompt: up to 2 hours of video, 19 hours of audio, a codebase of 60,000 lines, or 2,000 pages of text.

Benchmark results

Benchmark        Score   Tags           Source
AMC_2022_23      46.4%   self-reported  llm-stats
BIG-Bench Hard   89.2%   self-reported  llm-stats
DROP             74.9%   self-reported  llm-stats
FLEURS           6.7%    self-reported  llm-stats
FunctionalMATH   64.6%   self-reported  llm-stats
GPQA             59.1%   self-reported  llm-stats
GSM8k            90.8%   self-reported  llm-stats
HellaSwag        93.3%   self-reported  llm-stats
HiddenMath       52.0%   self-reported  llm-stats
HumanEval        84.1%   self-reported  llm-stats
MATH             86.5%   self-reported  llm-stats
MathVista        68.1%   self-reported  llm-stats
MGSM             87.5%   self-reported  llm-stats
MMLU             85.9%   self-reported  llm-stats
MMLU-Pro         75.8%   self-reported  llm-stats
MMMU             65.9%   self-reported  llm-stats
MRCR             82.6%   self-reported  llm-stats
Natural2Code     85.4%   self-reported  llm-stats
PhysicsFinals    63.9%   self-reported  llm-stats
Vibe-Eval        53.9%   self-reported  llm-stats
Video-MME        78.6%   self-reported  llm-stats
WMT23            75.1%   self-reported  llm-stats
XSTest           98.8%   self-reported  llm-stats