
SaaS-Bench

Emerging
2 papers using it
41 HF downloads
0 HF likes
2026 first seen

SaaS-Bench is a benchmark of 23 deployable Software-as-a-Service systems spanning six professional domains. Its 106 tasks evaluate Computer-Using Agents in realistic work scenarios that require long-horizon execution, in both text-only and multimodal settings.

Papers using SaaS-Bench (2)
