MAS-Bench

Emerging

3 papers using it

First seen: 2025

MAS-Bench is a benchmark designed to evaluate GUI-shortcut hybrid agents in the mobile domain, containing 139 complex tasks from 11 real-world applications, a knowledge base of 88 predefined shortcuts, and 9 evaluation metrics.

Papers using MAS-Bench (3)

MAS-Orchestra: Understanding and Improving Multi-Agent Reasoning Through Holistic Orchestration and Controlled Benchmarks · 2026

X-MAS: Towards Building Multi-agent Systems With Heterogeneous LLMs · 2025 · 2 cites

MAS-Bench: A Unified Benchmark For Shortcut-augmented Hybrid Mobile GUI Agents · 2025