LongCat-Flash-Thinking

LongCat-Flash-Thinking is Meituan's reasoning model, built on the LongCat-Flash foundation model with 560B total parameters (Mixture-of-Experts, ~27B activated). Its training pipeline is tuned specifically for advanced reasoning and includes a Re-thinking Mode that provides parallel reasoning paths for complex decision-making. The model achieves strong performance on mathematical reasoning, agentic tool use, and formal theorem proving benchmarks.
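To put the parameter counts in perspective, the short sketch below computes the share of the model that is activated, using only the two figures quoted above. The function and variable names are illustrative, and since the description gives the activated count only approximately (~27B), the result is an estimate of about 4.8%.

```python
def activated_fraction(total_params_b: float, active_params_b: float) -> float:
    """Return the share of a Mixture-of-Experts model's parameters that is activated."""
    if total_params_b <= 0:
        raise ValueError("total_params_b must be positive")
    return active_params_b / total_params_b


TOTAL_B = 560.0   # total parameters in billions, from the description
ACTIVE_B = 27.0   # approximate activated parameters in billions ("~27B")

fraction = activated_fraction(TOTAL_B, ACTIVE_B)
print(f"Roughly {fraction:.1%} of parameters are activated")
```

This sparse activation is why a Mixture-of-Experts model can carry a large total parameter count while the compute per forward pass stays closer to that of a much smaller dense model.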

Benchmark results

Benchmark            Score   Tags            Source
AIME 2024            93.3%   self-reported   llm-stats
AIME 2025            90.6%   self-reported   llm-stats
ARC-AGI              50.3%   self-reported   llm-stats
BFCL-v3              74.4%   self-reported   llm-stats
GPQA                 81.5%   self-reported   llm-stats
LiveCodeBench        79.4%   self-reported   llm-stats
MATH-500             99.2%   self-reported   llm-stats
MMLU-Pro             82.6%   self-reported   llm-stats
MMLU-Redux           89.3%   self-reported   llm-stats
SWE-Bench Verified   59.4%   self-reported   llm-stats
Tau2 Airline         67.5%   self-reported   llm-stats
Tau2 Retail          71.5%   self-reported   llm-stats
Tau2 Telecom         83.1%   self-reported   llm-stats
ZebraLogic           95.5%   self-reported   llm-stats