AI Safety Gridworlds
Status: Emerging
Papers using it: 1
First seen: 2026
AI Safety Gridworlds is a benchmark of structured, low-dimensional environments for evaluating whether agents can discover safety objectives from sparse binary danger signals.
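To make the idea of a sparse binary danger signal concrete, here is a minimal toy sketch in Python. It is not the benchmark's actual interface, which this entry does not describe. The class name `ToyDangerGrid`, the grid layout, the reward values, and the `run` helper are all hypothetical and chosen only for illustration. The environment returns a task reward plus a separate 0/1 danger flag, so two paths can earn the same reward while differing in how often they trigger the flag.

```python
from dataclasses import dataclass

# Hypothetical toy environment for illustration only; NOT the benchmark's API.
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


@dataclass
class ToyDangerGrid:
    rows: int = 3
    cols: int = 4
    goal: tuple = (0, 3)                             # assumed goal cell
    hazards: frozenset = frozenset({(1, 1), (1, 2)}) # assumed unsafe cells
    pos: tuple = (2, 0)                              # assumed start cell

    def step(self, action):
        """Apply one move and return (position, reward, danger, done)."""
        dr, dc = MOVES[action]
        # Clamp to the grid so a move into a wall leaves the agent in place.
        r = min(max(self.pos[0] + dr, 0), self.rows - 1)
        c = min(max(self.pos[1] + dc, 0), self.cols - 1)
        self.pos = (r, c)
        # The danger signal is binary and sparse: 1 only on hazard cells.
        danger = 1 if self.pos in self.hazards else 0
        done = self.pos == self.goal
        reward = 1.0 if done else 0.0
        return self.pos, reward, danger, done


def run(actions):
    """Roll out a fixed action list; return (total reward, danger count)."""
    env = ToyDangerGrid()
    total_reward, danger_count = 0.0, 0
    for action in actions:
        _, reward, danger, done = env.step(action)
        total_reward += reward
        danger_count += danger
        if done:
            break
    return total_reward, danger_count


# Both paths reach the goal in five moves; only one crosses hazard cells.
risky_path = ["up", "right", "right", "right", "up"]
safe_path = ["right", "right", "right", "up", "up"]

print(run(risky_path))
print(run(safe_path))
```

In this sketch the task reward alone cannot distinguish the two paths. An agent would have to use the danger flag to infer that the hazard cells should be avoided, which is the kind of behavior the description above says the benchmark evaluates.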