Gemini 3.1 Flash-Lite

Gemini 3.1 Flash-Lite is the first Flash-Lite model in the Gemini 3 series. It is optimized for high-volume, latency-sensitive tasks such as translation, content moderation, and classification. It delivers enhanced performance at a fraction of the cost of larger models, with a 2.5x faster Time to First Answer Token and 45% higher output speed than Gemini 2.5 Flash. It accepts text, image, video, audio, and PDF input and has a 1 million-token context window.
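To give a feel for what the two speed figures could mean for a single request, here is a minimal arithmetic sketch. It assumes "2.5x faster" means the time to first answer token is divided by 2.5 and "45% increased output speed" means tokens per second are multiplied by 1.45. The baseline numbers (1.0 s to first token, 100 tokens/s, 500 output tokens) are hypothetical placeholders, not measurements of any model, and the function name is made up for illustration.

```python
def estimate_latency(baseline_ttft_s, baseline_tokens_per_s, output_tokens,
                     ttft_speedup=2.5, output_speed_gain=0.45):
    """Return (baseline_total_s, scaled_total_s) for one response.

    Total time is modeled as: time to first token + output_tokens / rate.
    The scaling factors reflect one reading of the stated claims (assumption).
    """
    baseline_total = baseline_ttft_s + output_tokens / baseline_tokens_per_s

    # Assumed interpretation: first-token latency shrinks by the speedup factor.
    scaled_ttft = baseline_ttft_s / ttft_speedup
    # Assumed interpretation: generation rate grows by the stated percentage.
    scaled_rate = baseline_tokens_per_s * (1 + output_speed_gain)
    scaled_total = scaled_ttft + output_tokens / scaled_rate

    return baseline_total, scaled_total


# Hypothetical baseline values, chosen only to make the arithmetic easy to follow.
baseline, scaled = estimate_latency(
    baseline_ttft_s=1.0, baseline_tokens_per_s=100.0, output_tokens=500
)
print(f"baseline: {baseline:.2f} s, scaled: {scaled:.2f} s")
```

With these placeholder inputs the baseline works out to 6.00 s and the scaled estimate to about 3.85 s. Real end-to-end latency also depends on prompt length, load, and network conditions, which this simple model ignores.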

Benchmark results

Benchmark              Score   Tags           Source
CharXiv-R              73.2%   self-reported  llm-stats
FACTS Grounding        40.6%   self-reported  llm-stats
GPQA                   86.9%   self-reported  llm-stats
Humanity's Last Exam   16.0%   self-reported  llm-stats
MMMLU                  88.9%   self-reported  llm-stats
MMMU-Pro               76.8%   self-reported  llm-stats
MRCR v2 (8-needle)     60.1%   self-reported  llm-stats
SimpleQA               43.3%   self-reported  llm-stats
VideoMMMU              84.8%   self-reported  llm-stats