Gemini 3.1 Flash-Lite
Gemini 3.1 Flash-Lite is the first Flash-Lite model in the Gemini 3 series. It is optimized for high-volume, latency-sensitive tasks such as translation, content moderation, and classification. It delivers improved performance at a fraction of the cost of larger models, with a 2.5x faster Time to First Answer Token and 45% higher output speed than Gemini 2.5 Flash. It supports text, image, video, audio, and PDF input with a 1 million-token context window.
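To make the two speed figures concrete, the sketch below estimates end-to-end response time as time to first token plus generation time. The baseline numbers for 2.5 Flash are hypothetical placeholders, not measurements, and the function and constant names are invented for illustration. It also assumes that "2.5x faster" means the time to first token is divided by 2.5, and that a 45% speed increase multiplies tokens per second by 1.45.

```python
def estimated_latency(ttft_s: float, tokens_per_s: float, output_tokens: int) -> float:
    """Total response time: time to first token plus time to generate the output."""
    return ttft_s + output_tokens / tokens_per_s


# Hypothetical baseline figures for 2.5 Flash (illustrative only, not measured).
BASELINE_TTFT_S = 0.50
BASELINE_TOKENS_PER_S = 200.0

# Apply the relative improvements quoted in the description (assumed interpretation).
LITE_TTFT_S = BASELINE_TTFT_S / 2.5               # 2.5x faster time to first token
LITE_TOKENS_PER_S = BASELINE_TOKENS_PER_S * 1.45  # 45% higher output speed


def speedup(output_tokens: int) -> float:
    """Ratio of baseline latency to Flash-Lite latency for a given output length."""
    base = estimated_latency(BASELINE_TTFT_S, BASELINE_TOKENS_PER_S, output_tokens)
    lite = estimated_latency(LITE_TTFT_S, LITE_TOKENS_PER_S, output_tokens)
    return base / lite


if __name__ == "__main__":
    for n in (20, 200, 2000):
        print(f"{n:>5} output tokens: {speedup(n):.2f}x faster end to end")
```

Under these assumptions, short outputs such as classification labels benefit mostly from the faster first token, so the overall speedup approaches 2.5x, while long outputs are dominated by generation speed and the speedup approaches 1.45x.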
Benchmark results
| Benchmark | Score | Tags |
|---|---|---|
| CharXiv-R | 73.2% | self-reported llm-stats |
| FACTS Grounding | 40.6% | self-reported llm-stats |
| GPQA | 86.9% | self-reported llm-stats |
| Humanity's Last Exam | 16.0% | self-reported llm-stats |
| MMMLU | 88.9% | self-reported llm-stats |
| MMMU-Pro | 76.8% | self-reported llm-stats |
| MRCR v2 (8-needle) | 60.1% | self-reported llm-stats |
| SimpleQA | 43.3% | self-reported llm-stats |
| VideoMMMU | 84.8% | self-reported llm-stats |