Qwen3 235B A22B

Qwen3 235B A22B is a large language model developed by Alibaba. It uses a Mixture-of-Experts (MoE) architecture with 235 billion total parameters, of which 22 billion are activated per token. On benchmarks covering coding, math, and general capabilities, it is competitive with other top-tier models.
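
To make the total-versus-activated distinction concrete, the short sketch below works out what share of the model's parameters is active at a time, using only the two figures quoted above. The variable names are illustrative, and the calculation says nothing about expert counts or routing, which are not given here.

```python
# Parameter counts in billions, taken from the description above.
TOTAL_PARAMS_B = 235
ACTIVE_PARAMS_B = 22

# Share of parameters that are activated, and the total-to-active ratio.
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
total_to_active = TOTAL_PARAMS_B / ACTIVE_PARAMS_B

print(f"Activated share: {active_fraction:.1%}")
print(f"Total-to-active ratio: {total_to_active:.1f}x")
```

Under these figures, a little under a tenth of the parameters are activated, which is why the compute per token is closer to that of a 22-billion-parameter model than to a 235-billion-parameter one.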

Benchmark results

Benchmark       Score   Tags            Source
Aider           61.8%   self-reported   llm-stats
AIME 2024       85.7%   self-reported   llm-stats
AIME 2025       81.5%   self-reported   llm-stats
Arena Hard      95.6%   self-reported   llm-stats
BBH             88.9%   self-reported   llm-stats
BFCL            70.8%   self-reported   llm-stats
CRUX-O          79.0%   self-reported   llm-stats
EvalPlus        77.6%   self-reported   llm-stats
GPQA            47.5%   self-reported   llm-stats
GSM8k           94.4%   self-reported   llm-stats
Include         73.5%   self-reported   llm-stats
LiveBench       77.1%   self-reported   llm-stats
LiveCodeBench   70.7%   self-reported   llm-stats
MATH            71.8%   self-reported   llm-stats
MBPP            81.4%   self-reported   llm-stats
MGSM            83.5%   self-reported   llm-stats
MMLU            87.8%   self-reported   llm-stats
MMLU-Pro        68.2%   self-reported   llm-stats
MMLU-Redux      87.4%   self-reported   llm-stats
MMMLU           86.7%   self-reported   llm-stats
MultiLF         71.9%   self-reported   llm-stats
MultiPL-E       65.9%   self-reported   llm-stats
SuperGPQA       44.1%   self-reported   llm-stats
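
For readers who want to work with these rows programmatically, here is a minimal sketch of one way to parse them, assuming each row has the shape "name, score with a percent sign, tag, source" separated by whitespace. The `parse_row` helper and the `ROW` pattern are hypothetical names introduced for illustration; the three sample rows are copied from the table above.

```python
import re

# Benchmark names may contain spaces (e.g. "AIME 2024"), so the name is
# matched lazily up to the first token that looks like a percentage.
ROW = re.compile(r"^(?P<name>.+?)\s+(?P<score>\d+(?:\.\d+)?)%\s+(?P<tag>\S+)")

def parse_row(line):
    """Return (benchmark, score, tag) for a table row, or None if it is not a row."""
    match = ROW.match(line.strip())
    if match is None:
        return None
    return match.group("name"), float(match.group("score")), match.group("tag")

sample_rows = [
    "AIME 2024 85.7% self-reported llm-stats",
    "Arena Hard 95.6% self-reported llm-stats",
    "GPQA 47.5% self-reported llm-stats",
]

parsed = [parse_row(row) for row in sample_rows]

# List the sample benchmarks from highest to lowest score.
for name, score, tag in sorted(parsed, key=lambda r: r[1], reverse=True):
    print(f"{name}: {score}% ({tag})")
```

Because the benchmarks measure different things on different scales of difficulty, sorting or filtering rows is reasonable, but averaging the scores across benchmarks would not yield a meaningful number.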