τ-Bench

Emerging

9 papers using it

First seen in 2025

The τ-Bench dataset is used to evaluate how similar the tool-use behavior of different language model agents is, by modeling each agent's actions as a directed graph.

Papers using τ-Bench (9)

Goal Alignment in LLM-Based User Simulators for Conversational AI · 2025 · 14 cites

Remember When It Matters: Proactive Memory Agent for Long-Horizon Agents · 2026

Scaling Agentic Capabilities via Grounded Interaction Synthesis · 2026

Uncertainty-Aware Clarification in LLM Agents with Information Gain · 2026

When Agents Look the Same: Quantifying Distillation-Induced Similarity in Tool-Use Behaviors · 2026

LiteResearcher: A Scalable Agentic RL Training Framework for Deep Research Agent · 2026

ReThinker: Scientific Reasoning by Rethinking with Guided Reflection and Confidence Control · 2026

ScaleEnv: Scaling Environment Synthesis from Scratch for Generalist Interactive Tool-Use Agent Training · 2026

Search More, Think Less: Rethinking Long-Horizon Agentic Search for Efficiency and Generalization · 2026