
Multi-Mission Tool Bench

Emerging
2 papers using it
2025 first seen

The Multi-Mission Tool Bench is a benchmark of multiple interrelated missions, designed to evaluate how robustly large language model-based agents adapt to evolving demands and mission-switching patterns.

Papers using Multi-Mission Tool Bench (2)
