DeepSeek VL2 Tiny

DeepSeek-VL2 is an advanced series of large Mixture-of-Experts (MoE) vision-language models that significantly improves upon its predecessor, DeepSeek-VL. The series demonstrates superior capabilities across a range of tasks, including visual question answering, optical character recognition, document, table, and chart understanding, and visual grounding.

DocVQA: 88.9%
ChartQA: 81.0%
OCRBench: 80.9%
TextVQA: 80.7%
AI2D: 71.6%
MMBench: 69.2%
MMBench-V1.1: 68.3%
InfoVQA: 66.1%
RealWorldQA: 64.2%
MathVista: 53.6%
MMT-Bench: 53.2%
MMStar: 45.9%
MMMU: 40.7%
MME: 19.1%