ScreenSpot-v-2
Emerging
6 papers using it
917 HF downloads
31 HF likes
First seen: 2025
ScreenSpot-v2 is a benchmark dataset of screenshots used to evaluate how well mobile device control agents execute complex tasks that require long-horizon planning and precise operations.
🤗 Hugging Face · License: apache-2.0
Papers using ScreenSpot-v-2 (6)
- Hi-Agent: Hierarchical Vision-Language Agents for Mobile Device Control
- GUI-Shift: Enhancing VLM-Based GUI Agents through Self-supervised Reinforcement Learning
- GUI-G$^2$: Gaussian Reward Modeling for GUI Grounding
- UI-AGILE: Advancing GUI Agents with Effective Reinforcement Learning and Precise Inference-Time Grounding
- K^2-Agent: Co-Evolving Know-What and Know-How for Hierarchical Mobile Device Control
- GuirlVG: Incentivize GUI Visual Grounding via Empirical Exploration on Reinforcement Learning