Multi-Mission Tool Bench
Emerging
2 papers using it
First seen: 2025
The Multi-Mission Tool Bench is a benchmark of multiple interrelated missions, designed to evaluate how robustly agents based on large language models (LLMs) adapt to evolving demands and mission-switching patterns.