56 tasks
Status: Emerging
Papers using it: 1
First seen: 2026
The '56 tasks' dataset is a novel simulation benchmark that includes a variety of tasks designed to evaluate multistep reasoning and linguistic variation in robotic manipulation scenarios.