Replay-guided Adversarial Environment Design
2021 · Minqi Jiang, Michael Dennis, Jack Parker-Holder, et al.
Abstract
Deep reinforcement learning (RL) agents may successfully generalize to new settings if trained on an appropriately diverse set of environment and task configurations. Unsupervised Environment Design (UED) is a promising self-supervised RL paradigm, wherein the free parameters of an underspecified environment are automatically adapted during training to the agent's capabilities, leading to the emergence of diverse training environments. Here, we cast Prioritized Level Replay (PLR), an empirically successful but theoretically unmotivated method that selectively samples randomly-generated training levels, as UED. We argue that by curating completely random levels, PLR, too, can generate novel and complex levels for effective training. This insight reveals a natural class of UED methods we call Dual Curriculum Design (DCD). Crucially, DCD includes both PLR and a popular UED algorithm, PAIRED, as special cases and inherits similar theoretical guarantees. This connection allows us to develop
Related papers
- Discovering General Reinforcement Learning Algorithms With Adversarial Environment Design (2023)
- Prioritized Level Replay (2020)
- DRED: Zero-shot Transfer In Reinforcement Learning Via Data-regularised Environment Design (2024)
- Emergent Complexity And Zero-shot Transfer Via Unsupervised Environment Design (2020)
- MAESTRO: Open-ended Environment Design For Multi-agent Reinforcement Learning (2023)
- Generating Automatic Curricula Via Self-supervised Active Domain Randomization (2020)
- Learning To Design Games: Strategic Environments In Reinforcement Learning (2017)
- Discovering Minimal Reinforcement Learning Environments (2024)