IFBench

Categories: general, instruction following
Modality: text
Language: en
Multilingual: No
Max score: 1
Scoring: %, higher is better
Verified by llm-stats: No

IFBench is an instruction-following benchmark that evaluates a model's ability to follow complex instructions.

Leaderboard

Showing 20 of 24 results

Hermes 3 70B: 81.2%
Nova 2 Pro: 80.2%
Qwen3.7 Max: 79.1%
Qwen3.5-27B: 76.5%
Qwen3.5-397B-A17B: 76.5%
Qwen3.5-122B-A10B: 76.1%
MAI-Code-1-Flash: 75.0%
Qwen3.6 Plus: 74.2%
Nemotron 3 Super (120B A12B): 72.6%
Mercury 2: 71.0%
Nova 2 Lite: 70.8%
Qwen3.5-35B-A3B: 70.2%
MiniMax M2.1: 70.0%
GPT OSS 120B High: 69.5%
MAI-Thinking-1: 69.0%
Mistral Medium 3.5: 69.0%
Nova 2 Omni: 68.7%
K-EXAONE-236B-A23B: 67.3%
Qwen3.5-9B: 64.5%
Qwen3.5-4B: 59.2%

Content licensed CC BY-SA 4.0.