
tau-2-bench

Emerging
15 papers using it
First seen: 2025

'Tau2 Bench' is a benchmark dataset for evaluating tool-use agents. It provides structured tasks that capture interaction dynamics and the effectiveness of different strategies for tool invocation and environment response.

Papers using tau-2-bench (15)
