DeepSeek VL2 Tiny

DeepSeek-VL2 is a series of large Mixture-of-Experts (MoE) vision-language models that significantly improves on its predecessor, DeepSeek-VL. It demonstrates stronger capabilities across a range of tasks, including visual question answering, optical character recognition, document/table/chart understanding, and visual grounding.

Benchmark results

Benchmark      Score   Tags           Source
AI2D           71.6%   self-reported  llm-stats
ChartQA        81.0%   self-reported  llm-stats
DocVQA         88.9%   self-reported  llm-stats
InfoVQA        66.1%   self-reported  llm-stats
MathVista      53.6%   self-reported  llm-stats
MMBench        69.2%   self-reported  llm-stats
MMBench-V1.1   68.3%   self-reported  llm-stats
MME            19.1%   self-reported  llm-stats
MMMU           40.7%   self-reported  llm-stats
MMStar         45.9%   self-reported  llm-stats
MMT-Bench      53.2%   self-reported  llm-stats
OCRBench       80.9%   self-reported  llm-stats
RealWorldQA    64.2%   self-reported  llm-stats
TextVQA        80.7%   self-reported  llm-stats