Gemma 4 31B

Gemma 4 31B is Google DeepMind's flagship dense multimodal model, with 31 billion parameters and a 256K-token context window. It ranks #3 among open models on Arena AI. Built from the same research as Gemini 3, it features Per-Layer Embeddings, a Shared KV Cache, alternating sliding-window and global attention layers, and variable-aspect-ratio vision encoding. Its estimated LMArena text score is 1452.
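To make "alternating sliding-window and global attention" concrete, here is a minimal sketch of the general idea: some layers restrict each token to a recent window of positions, while others see the full causal history. The function names, the window size, and the layer pattern are hypothetical choices for illustration and are not taken from Gemma 4's actual configuration.

```python
def causal_mask(seq_len, window=None):
    """Build a seq_len x seq_len boolean mask.

    mask[q][k] is True when query position q may attend to key position k.
    window=None gives full (global) causal attention; an integer limits each
    query to the most recent `window` positions, itself included.
    """
    mask = []
    for q in range(seq_len):
        row = []
        for k in range(seq_len):
            visible = k <= q  # causal: never look ahead
            if window is not None:
                visible = visible and (q - k) < window
            row.append(visible)
        mask.append(row)
    return mask


def layer_kinds(num_layers, global_every):
    """Label layers 'local' or 'global'; every `global_every`-th layer is global."""
    return [
        "global" if (i + 1) % global_every == 0 else "local"
        for i in range(num_layers)
    ]


# Hypothetical values for illustration only, not Gemma 4's real settings.
SEQ_LEN = 6
WINDOW = 3
kinds = layer_kinds(num_layers=4, global_every=2)
masks = [
    causal_mask(SEQ_LEN, WINDOW if kind == "local" else None)
    for kind in kinds
]
```

In a design like this, the local layers only need to keep the last `WINDOW` keys and values per token stream, which is what keeps memory manageable at long context lengths, while the interleaved global layers preserve access to the whole sequence.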

Benchmark results

Benchmark              Score   Tags           Source
AIME 2026              89.2%   self-reported  llm-stats
BIG-Bench Extra Hard   74.4%   self-reported  llm-stats
GPQA                   84.3%   self-reported  llm-stats
Humanity's Last Exam   26.5%   self-reported  llm-stats
LiveCodeBench v6       80.0%   self-reported  llm-stats
MathVision             85.6%   self-reported  llm-stats
MedXpertQA             61.3%   self-reported  llm-stats
MMLU-Pro               85.2%   self-reported  llm-stats
MMMLU                  88.4%   self-reported  llm-stats
MMMU-Pro               76.9%   self-reported  llm-stats
MRCR v2                66.4%   self-reported  llm-stats
t2-bench               86.4%   self-reported  llm-stats