โ† all datasets

StableToolBench

Emerging
12 papers using it
First seen in 2024

StableToolBench is a cost-augmented benchmark that evaluates how well tool-augmented agents solve multi-step tasks while staying within strict monetary budgets.

Papers using StableToolBench (12)
