Grok-2

Grok-2 is a frontier language model with state-of-the-art capabilities in chat, coding, and reasoning. It performs strongly on visual math reasoning (MathVista) and document-based question answering (DocVQA), and on academic benchmarks covering reasoning, reading comprehension, math, and science.

Benchmark results

Benchmark    Score    Tags             Source
DocVQA       93.6%    self-reported    llm-stats
GPQA         56.0%    self-reported    llm-stats
HumanEval    88.4%    self-reported    llm-stats
MATH         76.1%    self-reported    llm-stats
MathVista    69.0%    self-reported    llm-stats
MMLU         87.5%    self-reported    llm-stats
MMLU-Pro     75.5%    self-reported    llm-stats
MMMU         66.1%    self-reported    llm-stats