NoLiMa

reasoning

NoLiMa (No Literal Matching) is a long-context benchmark that extends needle-in-a-haystack tests by minimizing lexical overlap between questions and needles, so models must infer latent associations rather than rely on surface-level matching. It was published at ICML 2025.
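To make "minimal lexical overlap" concrete, the following is a minimal sketch that scores word overlap between a question and a candidate needle. The stopword list, the Jaccard-style score, the function names, and the example sentences are all illustrative assumptions; this is not NoLiMa's own tooling or data.

```python
import re

# Small illustrative stopword list (an assumption, not taken from NoLiMa).
STOPWORDS = {"a", "an", "the", "is", "has", "been", "to", "in", "of", "which", "who", "next"}


def content_words(text):
    """Lowercase the text, split it into alphanumeric tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {t for t in tokens if t not in STOPWORDS}


def lexical_overlap(question, needle):
    """Jaccard overlap of content words between a question and a needle."""
    q, n = content_words(question), content_words(needle)
    if not q or not n:
        return 0.0
    return len(q & n) / len(q | n)


# Hypothetical examples in the spirit of the benchmark.
question = "Which character has been to Dresden?"
literal_needle = "Yuki has been to Dresden."                  # shares the keyword "Dresden"
latent_needle = "Yuki lives next to the Semper Opera House."  # needs world knowledge to link to Dresden

print(lexical_overlap(question, literal_needle))  # greater than zero: keyword matching suffices
print(lexical_overlap(question, latent_needle))   # zero: no shared content words
```

In a classic needle-in-a-haystack setup the literal needle can be found by matching "Dresden" directly. With the latent needle there is nothing to match, so answering requires knowing that the Semper Opera House is in Dresden.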

Leaderboard

No results yet.