tau-2-bench
Emerging · 3 papers using it
First seen: 2026
tau2-Bench is a benchmark for evaluating models on agentic intelligence tasks, with a focus on their reasoning and execution capabilities.