LongCat-Flash-Thinking
LongCat-Flash-Thinking is Meituan's reasoning model, built on the LongCat-Flash foundation model. It uses a Mixture-of-Experts (MoE) architecture with 560B total parameters, of which roughly 27B are activated. Its training pipeline is tuned specifically for advanced reasoning, and it features a Re-thinking Mode that uses parallel reasoning paths for sophisticated decision-making. The model reports strong performance on mathematical reasoning, agentic tool use, and formal theorem proving benchmarks.
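To put the parameter counts in perspective, the short sketch below computes the share of the model's parameters that are activated, using only the two figures quoted in the description. The activated count is approximate in the source, so the result is a rough estimate; the function and constant names are illustrative and not part of any LongCat tooling.

```python
# Figures quoted in the description; the activated count is approximate (~27B).
TOTAL_PARAMS_B = 560.0
ACTIVE_PARAMS_B = 27.0


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of an MoE model's parameters that are activated."""
    if total_b <= 0:
        raise ValueError("total_b must be positive")
    return active_b / total_b


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"Activated share of parameters: {fraction:.1%}")  # prints 4.8%
```

Under these figures, roughly one parameter in twenty is activated, which is the usual motivation for MoE designs: compute scales with the activated count rather than the total.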
Benchmark results
| Benchmark | Score | Tags |
|---|---|---|
| AIME 2024 | 93.3% | self-reported llm-stats |
| AIME 2025 | 90.6% | self-reported llm-stats |
| ARC-AGI | 50.3% | self-reported llm-stats |
| BFCL-v3 | 74.4% | self-reported llm-stats |
| GPQA | 81.5% | self-reported llm-stats |
| LiveCodeBench | 79.4% | self-reported llm-stats |
| MATH-500 | 99.2% | self-reported llm-stats |
| MMLU-Pro | 82.6% | self-reported llm-stats |
| MMLU-Redux | 89.3% | self-reported llm-stats |
| SWE-Bench Verified | 59.4% | self-reported llm-stats |
| Tau2 Airline | 67.5% | self-reported llm-stats |
| Tau2 Retail | 71.5% | self-reported llm-stats |
| Tau2 Telecom | 83.1% | self-reported llm-stats |
| ZebraLogic | 95.5% | self-reported llm-stats |