T-maze

Emerging
2 papers using it
First seen in 2025

T-maze is a synthetic benchmark for evaluating how well models handle long-horizon decision-making in partially observable environments. Its corridors can extend up to one million steps.
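To make the task structure concrete, here is a minimal Python sketch of a T-maze-style environment. It assumes the common formulation in which a cue visible only at the first step determines the correct turn at the junction at the end of the corridor; the description above does not spell out these details, so treat the observation layout, reward values, and every name (`TMazeSketch`, `run_episode`, the action constants) as hypothetical illustrations rather than the benchmark's actual interface.

```python
import random
from dataclasses import dataclass, field

# Hypothetical action encoding; not taken from the benchmark itself.
FORWARD, TURN_LEFT, TURN_RIGHT = 0, 1, 2


@dataclass
class TMazeSketch:
    """Illustrative T-maze: a cue at the start, a corridor, then a junction."""
    corridor_length: int = 10
    seed: int = 0
    position: int = field(default=0, init=False)
    goal: int = field(default=TURN_LEFT, init=False)

    def __post_init__(self):
        self._rng = random.Random(self.seed)

    def _observe(self):
        # Observation is (cue, at_junction). The cue is shown only at
        # position 0; everywhere else it is 0, meaning "no cue".
        cue = self.goal if self.position == 0 else 0
        return (cue, self.position == self.corridor_length)

    def reset(self):
        self.position = 0
        self.goal = self._rng.choice((TURN_LEFT, TURN_RIGHT))
        return self._observe()

    def step(self, action):
        at_junction = self.position == self.corridor_length
        if at_junction and action in (TURN_LEFT, TURN_RIGHT):
            # The episode ends at the junction; reward 1.0 for the cued turn.
            reward = 1.0 if action == self.goal else 0.0
            return self._observe(), reward, True
        if not at_junction and action == FORWARD:
            self.position += 1
        return self._observe(), 0.0, False


def run_episode(env, remember_cue):
    """Walk the corridor, then turn; optionally remember the initial cue."""
    cue, at_junction = env.reset()
    memory = cue if remember_cue else None
    done, reward = False, 0.0
    while not done:
        if at_junction:
            # Without memory the policy can only guess (here: always left).
            action = memory if memory is not None else TURN_LEFT
        else:
            action = FORWARD
        (cue, at_junction), reward, done = env.step(action)
    return reward
```

Under these assumptions, a policy that stores the cue solves every episode regardless of `corridor_length`, while a memoryless policy can only guess at the junction. That gap is what makes corridor length a direct knob on the memory horizon being tested.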

Papers using T-maze (2)