BrowseComp
Emerging
7 papers using it
5,053 HF downloads
6 HF likes
First seen: 2025
BrowseComp is a benchmark dataset of synthesized samples for evaluating search agents, particularly their ability to perform complex, multi-hop reasoning tasks.
Hugging Face · License: apache-2.0
Papers using BrowseComp (7)
- SearchSwarm: Towards Delegation Intelligence in Agentic LLMs for Long-Horizon Deep Research
- OpenSeeker: Democratizing Frontier Search Agents by Fully Open-Sourcing Training Data
- OpenSeeker-v2: Pushing the Limits of Search Agents with Informative and High-Difficulty Trajectories
- BrowseComp-Plus: A More Fair and Transparent Evaluation Benchmark of Deep-Research Agent
- SSRL: Self-Search Reinforcement Learning
- A^2FM: An Adaptive Agent Foundation Model for Tool-Aware Hybrid Reasoning
- WebLeaper: Empowering Efficiency and Efficacy in WebAgent via Enabling Info-Rich Seeking