
tau-2-bench

Emerging
3 papers using it
2026 first seen

tau2-Bench is a benchmark for evaluating models on agentic tasks, with a focus on reasoning and execution capabilities.

Papers using tau-2-bench (3)
