TydiQA
A multilingual question answering benchmark covering 11 typologically diverse languages with 204K question-answer pairs. Questions are written by people seeking genuine information, and the data is collected directly in each language without translation, so the benchmark tests how well models generalize across diverse linguistic structures.
Methodology
Imported from llm-stats public benchmark metadata. Modality: text. Max score: 1. Categories: language, reasoning. Language: en. Verified by llm-stats: no.