LongCat-Flash-Chat

LongCat-Flash-Chat is Meituan's first open-source foundation model. It is a 560B-parameter Mixture-of-Experts (MoE) model that dynamically activates between 18.6B and 31.3B parameters (about 27B on average) depending on contextual demands. It uses Zero-Computation Experts for efficient routing and supports a 128K-token context. The model is optimized for conversational and agentic tasks and shows competitive performance across reasoning, coding, instruction-following, and domain benchmarks, with particular strengths in tool use and complex multi-step interactions. It achieves over 100 tokens per second on H800 GPUs.
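The description above does not spell out how routing works, so the following is only a toy sketch of one way a variable activated-parameter count can arise. It assumes that a zero-computation expert simply passes its input through unchanged, and that the router picks a fixed number of experts per token from a pool containing both ordinary and zero-computation experts. The function names (`top_k_indices`, `route_token`) and the sizes (`NUM_REAL`, `NUM_ZERO`, `TOP_K`) are hypothetical and are not the model's real configuration.

```python
import random


def top_k_indices(scores, k):
    """Return the indices of the k largest scores, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


def route_token(scores, num_real_experts, k):
    """Pick k experts for one token and count how many do real work.

    Assumption: indices below num_real_experts are ordinary FFN experts;
    indices at or above it are zero-computation experts (identity pass-through).
    """
    chosen = top_k_indices(scores, k)
    real = [i for i in chosen if i < num_real_experts]
    return chosen, len(real)


# Toy sizes for illustration only, NOT LongCat-Flash-Chat's configuration.
NUM_REAL, NUM_ZERO, TOP_K = 8, 4, 3

rng = random.Random(0)
counts = []
for _ in range(1000):
    # Random router scores stand in for a learned router.
    scores = [rng.random() for _ in range(NUM_REAL + NUM_ZERO)]
    _, n_real = route_token(scores, NUM_REAL, TOP_K)
    counts.append(n_real)

print("min real experts per token:", min(counts))
print("max real experts per token:", max(counts))
print("mean real experts per token:", sum(counts) / len(counts))
```

In this sketch every token selects `TOP_K` experts, but the number that perform actual computation can range from 0 to `TOP_K`. That per-token variation is the kind of effect that would produce a range of activated parameters rather than a single fixed figure.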

Benchmark results

Benchmark           Score   Tags           Source
AIME 2025           61.3%   self-reported  llm-stats
CMMLU               84.3%   self-reported  llm-stats
DROP                79.1%   self-reported  llm-stats
GPQA                73.2%   self-reported  llm-stats
HumanEval           88.4%   self-reported  llm-stats
IFEval              89.6%   self-reported  llm-stats
LiveCodeBench       48.0%   self-reported  llm-stats
MATH-500            96.4%   self-reported  llm-stats
MMLU                89.7%   self-reported  llm-stats
MMLU-Pro            82.7%   self-reported  llm-stats
SWE-Bench Verified  60.4%   self-reported  llm-stats
Tau2 Airline        58.0%   self-reported  llm-stats
Tau2 Retail         71.3%   self-reported  llm-stats
Tau2 Telecom        73.7%   self-reported  llm-stats
Terminal-Bench      39.5%   self-reported  llm-stats
ZebraLogic          89.3%   self-reported  llm-stats