Grok-4

Grok 4, announced by xAI in summer 2025, is described as 'the smartest AI in the world.' Built on version 6 of xAI's foundation model, it uses 100x more training compute than Grok 2 and 10x more reinforcement learning compute than Grok 3. The model is said to achieve PhD-level performance across all academic disciplines simultaneously, with perfect scores on standardized tests such as the SAT and near-perfect scores on graduate exams such as the GRE. Unlike Grok 3, which relied on generalization to use tools, Grok 4 has tool use built into its training process. Trained on 200,000 GPUs, Grok 4 excels at complex reasoning, mathematical problem-solving, and coding tasks. It has acknowledged weaknesses in multimodal capabilities, which are being addressed in the next version.

Benchmark results

Benchmark              Score   Tags           Source
AIME 2025              91.7%   self-reported  llm-stats
ARC-AGI v2             15.9%   self-reported  llm-stats
GPQA                   87.5%   self-reported  llm-stats
HMMT25                 90.0%   self-reported  llm-stats
Humanity's Last Exam   40.0%   self-reported  llm-stats
LiveCodeBench          79.0%   self-reported  llm-stats
USAMO25                37.5%   self-reported  llm-stats