Embench

Emerging
Papers using it: 3
First seen: 2025

Embench is a benchmark for interactive embodied tasks that evaluates agents' performance in both in-domain and out-of-domain scenarios.

Papers using Embench (3)