
TerminalBench-2

Emerging
5 papers using it
First seen in 2026

TerminalBench-2 is a dataset used to evaluate how well meta-agents manage and manipulate agentic execution states during complex tasks.

Papers using TerminalBench-2 (5)
