Catalogue

Latest models

| Model | Provider | Released | Context | Weights |
|---|---|---|---|---|
| U2 | Unisound | Jun 5, 2026 | — | proprietary |
| MAI-Code-1-Flash | Microsoft | Jun 2, 2026 | — | proprietary |
| MAI-Thinking-1 | Microsoft | Jun 2, 2026 | — | proprietary |
| MiniMax M3 | MiniMax | Jun 1, 2026 | — | open |
| Claude Opus 4.8 | Anthropic | May 28, 2026 | — | proprietary |
| Gemini 3.5 Flash | Google | May 19, 2026 | — | proprietary |

Featured leaderboards

| # | Model | Score |
|---|---|---|
| 1 | Kimi K2.5 | 70.0% |
| 2 | Qwen3.5-397B-A17B | 68.7% |
| 3 | Qwen3.6 Plus | 68.3% |
| 4 | Qwen3.5-122B-A10B | 66.9% |
| 5 | Qwen3.5-27B | 66.1% |

| # | Model | Score |
|---|---|---|
| 1 | Mistral Small 3 24B Base | 65.8% |
| 2 | Ministral 3 (14B Base 2512) | 64.8% |
| 3 | Hermes 3 70B | 56.2% |
| 4 | Gemma 2 27B | 55.1% |
| 5 | Gemma 2 9B | 52.8% |

| # | Model | Score |
|---|---|---|
| 1 | Claude 3.5 Sonnet | 94.7% |
| 2 | Qwen3.6 Plus | 94.4% |
| 3 | GPT-4o | 94.2% |
| 4 | Pixtral Large | 93.8% |
| 5 | Qwen3.5-122B-A10B | 93.3% |

| # | Model | Score |
|---|---|---|
| 1 | GPT-5 | 88.0% |
| 2 | Gemini 2.5 Pro Preview 06-05 | 82.2% |
| 3 | o3 | 81.3% |
| 4 | Gemini 2.5 Pro | 76.5% |
| 5 | DeepSeek-V3.2-Exp | 74.5% |