DeepSeek VL2 Tiny
DeepSeek VL2 Tiny belongs to DeepSeek-VL2, a series of large Mixture-of-Experts (MoE) vision-language models that improves significantly on its predecessor, DeepSeek-VL. DeepSeek-VL2 performs strongly across a range of tasks, including visual question answering, optical character recognition, document, table, and chart understanding, and visual grounding.
Benchmark results
| Benchmark | Score | Tags |
|---|---|---|
| AI2D | 71.6% | self-reported, llm-stats |
| ChartQA | 81.0% | self-reported, llm-stats |
| DocVQA | 88.9% | self-reported, llm-stats |
| InfoVQA | 66.1% | self-reported, llm-stats |
| MathVista | 53.6% | self-reported, llm-stats |
| MMBench | 69.2% | self-reported, llm-stats |
| MMBench-V1.1 | 68.3% | self-reported, llm-stats |
| MME | 19.1% | self-reported, llm-stats |
| MMMU | 40.7% | self-reported, llm-stats |
| MMStar | 45.9% | self-reported, llm-stats |
| MMT-Bench | 53.2% | self-reported, llm-stats |
| OCRBench | 80.9% | self-reported, llm-stats |
| RealWorldQA | 64.2% | self-reported, llm-stats |
| TextVQA | 80.7% | self-reported, llm-stats |