DeepSeek-V4-Pro-Max

DeepSeek-V4-Pro-Max is the maximum-reasoning-effort mode of DeepSeek-V4-Pro, a 1.6T-parameter MoE model with 49B activated parameters and a 1M-token context window. It introduces a hybrid attention architecture that combines Compressed Sparse Attention (CSA) and Heavily Compressed Attention (HCA) to improve long-context efficiency: at a 1M-token context, it requires only 27% of the single-token inference FLOPs and 10% of the KV cache of DeepSeek-V3.2.
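
To put the quoted ratios in more familiar terms, the short Python sketch below converts them into an activated-parameter share and into "times smaller" factors. It uses only the figures stated in the description; the variable and function names are illustrative, and the results are simple arithmetic on those figures, not independent measurements.

```python
# Figures quoted in the description above; nothing else is assumed.
TOTAL_PARAMS = 1.6e12    # 1.6T total parameters
ACTIVE_PARAMS = 49e9     # 49B activated parameters
FLOPS_RATIO = 0.27       # single-token inference FLOPs relative to DeepSeek-V3.2 at 1M tokens
KV_CACHE_RATIO = 0.10    # KV cache relative to DeepSeek-V3.2 at 1M tokens


def reduction_factor(ratio: float) -> float:
    """Convert a 'fraction of baseline' figure into an 'N times smaller' factor."""
    return 1.0 / ratio


active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
flops_factor = reduction_factor(FLOPS_RATIO)
kv_factor = reduction_factor(KV_CACHE_RATIO)

print(f"Activated share of parameters: {active_fraction:.2%}")
print(f"FLOPs reduction factor: {flops_factor:.1f}x")
print(f"KV cache reduction factor: {kv_factor:.1f}x")
```

Read this way, roughly 3% of the parameters are activated, and the stated ratios correspond to about 3.7x fewer single-token FLOPs and a 10x smaller KV cache than DeepSeek-V3.2 at 1M-token context.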

Benchmark                 Score
CodeForces                100.0%
HMMT Feb 26               95.2%
LiveCodeBench             93.5%
MathArena Apex            90.2%
GPQA                      90.1%
IMO-AnswerBench           89.8%
MMLU-Pro                  87.5%
CSimpleQA                 84.4%
MRCR 1M                   83.5%
BrowseComp                83.4%
SWE-Bench Verified        80.6%
SWE-bench Multilingual    76.2%
MCP Atlas                 73.6%
Terminal-Bench 2.0        67.9%
CorpusQA 1M               62.0%
SimpleQA                  57.9%
SWE-Bench Pro             55.4%
Toolathlon                51.8%
Humanity's Last Exam      48.2%
GDPval-AA                 1,554