SIFO-Multiturn
general
SIFO-Multiturn evaluates instruction following capabilities in multi-turn conversational settings, testing how well models maintain context and follow instructions across multiple exchanges.
Methodology
Imported from llm-stats public benchmark metadata. Modality: text. Max score: 100. Categories: agents, general, structured_output. Language: en. Verified by llm-stats: no.