LongFact
LongFact evaluates factual precision over long-form generations containing many individual claims. Each claim is extracted and verified, and the model is scored on claim-level precision: the share of extracted claims that are supported. The score therefore drops when extended responses introduce unsupported or false statements.
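As a rough illustration of claim-level precision, the sketch below assumes that claim extraction and verification have already happened and that each claim carries a boolean verdict. The `Claim` type, the `claim_precision` function, and the choice to score an empty response as 0.0 are hypothetical and are not taken from the benchmark's own implementation.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    """One extracted claim plus a verdict (hypothetical structure)."""
    text: str
    supported: bool  # assumed to come from some upstream verifier


def claim_precision(claims):
    """Return the fraction of claims marked as supported, in [0, 1].

    Assumption: a response with no extracted claims scores 0.0.
    """
    if not claims:
        return 0.0
    supported = sum(1 for claim in claims if claim.supported)
    return supported / len(claims)


# Illustrative verdicts for a three-claim response.
claims = [
    Claim("The Eiffel Tower is in Paris.", True),
    Claim("It was completed in 1889.", True),
    Claim("It is 500 metres tall.", False),
]
score = claim_precision(claims)
print(f"claim-level precision: {score:.3f}")
```

Two of the three claims are supported here, so the response scores two thirds. A value in the range 0 to 1 is consistent with the maximum score of 1 listed in the metadata.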
Methodology
Imported from llm-stats public benchmark metadata. Modality: text. Max score: 1. Categories: factuality, general. Language: en. Verified by llm-stats: no.