
DeepSearchQA

Emerging
2 papers using it
16,490 HF downloads
121 HF likes
2026 first seen

A 900-prompt factuality benchmark from Google DeepMind, designed to evaluate agents on difficult multi-step information-seeking tasks across 17 different fields.

▶ Google DeepMind Release Blog Post
▶ DeepSearchQA Leaderboard on Kaggle
▶ Technical Report
▶ Evaluation Starter Code

Benchmark

DeepSearchQA is a 90

Papers using DeepSearchQA (2)
