DeepSearchQA
reasoning
DeepSearchQA is a benchmark that evaluates deep search and question answering. It tests a model's ability to perform multi-hop reasoning and to retrieve information across complex knowledge domains.
Methodology
Imported from llm-stats public benchmark metadata. Modality: text. Max score: 1. Categories: agents, reasoning, search. Language: en. Verified by llm-stats: no.
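The metadata fields above follow a "Key: value." pattern, so they can be read into a dictionary for filtering or comparison with other benchmark entries. The following is a minimal sketch in Python, assuming the line keeps exactly this format. The helper name `parse_metadata` is hypothetical, chosen for illustration, and is not part of any llm-stats tooling.

```python
def parse_metadata(line: str) -> dict:
    """Split 'Key: value. Key: value.' text into a dictionary of strings.

    Assumes values never contain the sequence '. ' (true for the line above).
    """
    fields = {}
    for part in line.strip().rstrip(".").split(". "):
        key, sep, value = part.partition(": ")
        if sep:  # skip fragments that are not 'Key: value' pairs
            fields[key.strip()] = value.strip()
    return fields


# The metadata line as it appears in this entry.
metadata = parse_metadata(
    "Modality: text. Max score: 1. Categories: agents, reasoning, search. "
    "Language: en. Verified by llm-stats: no."
)

# Convert the raw strings into more convenient types.
categories = [c.strip() for c in metadata["Categories"].split(",")]
max_score = float(metadata["Max score"])
verified = metadata["Verified by llm-stats"] == "yes"
```

Keeping the parser to plain string splitting makes the format assumption visible: if a value ever contained a period followed by a space, the split would need a stricter rule.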