Pixtral Large
Pixtral Large is a 124B-parameter multimodal model built on top of Mistral Large 2, with frontier-level image understanding. It excels at understanding documents, charts, and natural images while maintaining strong text-only performance. The model combines a 123B-parameter multimodal decoder with a 1B-parameter vision encoder, and its 128K context window supports up to 30 high-resolution images.
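The figures in the description can be sanity-checked with a little arithmetic. The sketch below is illustrative only: it adds the two component sizes and estimates the average token budget per image when the window is split across 30 images. It assumes that "128K" means 128 × 1024 tokens, which the description does not state, and the helper names (`total_params_b`, `avg_tokens_per_image`) are hypothetical, not part of any Mistral API.

```python
# Figures taken from the model description.
DECODER_PARAMS_B = 123        # multimodal decoder, billions of parameters
VISION_ENCODER_PARAMS_B = 1   # vision encoder, billions of parameters
MAX_IMAGES = 30               # high-resolution images per context window

# Assumption: "128K" is read as 128 * 1024 tokens; the source does not say.
CONTEXT_TOKENS = 128 * 1024


def total_params_b() -> int:
    """Total parameter count in billions (decoder plus vision encoder)."""
    return DECODER_PARAMS_B + VISION_ENCODER_PARAMS_B


def avg_tokens_per_image(num_images: int = MAX_IMAGES,
                         reserved_text_tokens: int = 0) -> int:
    """Average token budget per image after reserving room for text.

    This is a rough upper bound on the average, not a statement about how
    the model actually tokenizes images.
    """
    if num_images <= 0:
        raise ValueError("num_images must be positive")
    if not 0 <= reserved_text_tokens <= CONTEXT_TOKENS:
        raise ValueError("reserved_text_tokens must fit in the context window")
    return (CONTEXT_TOKENS - reserved_text_tokens) // num_images


if __name__ == "__main__":
    print(f"Total parameters: {total_params_b()}B")
    print(f"Average tokens per image: {avg_tokens_per_image()}")
```

Under that assumption, any text in the prompt reduces the per-image budget, so `reserved_text_tokens` is exposed as a parameter rather than fixed.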
Benchmark results
| Benchmark | Score | Tags |
|---|---|---|
| AI2D | 93.8% | self-reported, llm-stats |
| ChartQA | 88.1% | self-reported, llm-stats |
| DocVQA | 93.3% | self-reported, llm-stats |
| MathVista | 69.4% | self-reported, llm-stats |
| MM-MT-Bench | 74 | self-reported, llm-stats |
| MMMU | 64.0% | self-reported, llm-stats |
| VQAv2 | 80.9% | self-reported, llm-stats |