DRED: Zero-shot Transfer In Reinforcement Learning Via Data-regularised Environment Design
2024 · Samuel Garcin, James Doran, Shangmin Guo, et al.
Abstract
Autonomous agents trained using deep reinforcement learning (RL) often lack the ability to successfully generalise to new environments, even when these environments share characteristics with the ones they have encountered during training. In this work, we investigate how the sampling of individual environment instances, or levels, affects the zero-shot generalisation (ZSG) ability of RL agents. We discover that, for deep actor-critic architectures sharing their base layers, prioritising levels according to their value loss minimises the mutual information between the agent's internal representation and the set of training levels in the generated training data. This provides a novel theoretical justification for the regularisation achieved by certain adaptive sampling strategies. We then turn our attention to unsupervised environment design (UED) methods, which assume control over level generation. We find that existing UED methods can significantly shift the training distribution, whi
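As an illustration of what "prioritising levels according to their value loss" can look like in practice, here is a minimal sketch. It is not the authors' implementation. The mean absolute value error used as the score, the rank-based weighting, the `temperature` parameter, and all function names are assumptions chosen for illustration.

```python
import random
from typing import Dict, List


def value_loss_scores(returns: Dict[int, List[float]],
                      values: Dict[int, List[float]]) -> Dict[int, float]:
    """Per-level score: mean absolute error between returns and value predictions.

    This is a stand-in for the value loss; the exact loss is an assumption.
    """
    scores = {}
    for level, rets in returns.items():
        preds = values[level]
        scores[level] = sum(abs(r - v) for r, v in zip(rets, preds)) / len(rets)
    return scores


def rank_prioritised_probs(scores: Dict[int, float],
                           temperature: float = 0.5) -> Dict[int, float]:
    """Turn scores into sampling probabilities via rank-based weighting.

    The level with the highest score gets rank 1; weight = (1 / rank) ** (1 / temperature).
    Lower temperatures concentrate sampling on high-loss levels.
    """
    ordered = sorted(scores, key=scores.get, reverse=True)  # highest loss first
    weights = {lvl: (1.0 / (rank + 1)) ** (1.0 / temperature)
               for rank, lvl in enumerate(ordered)}
    total = sum(weights.values())
    return {lvl: w / total for lvl, w in weights.items()}


def sample_level(probs: Dict[int, float], rng: random.Random) -> int:
    """Draw one training level according to the prioritised distribution."""
    levels = list(probs)
    return rng.choices(levels, weights=[probs[l] for l in levels], k=1)[0]


# Toy data for three hypothetical levels: observed returns and value predictions.
returns = {0: [1.0, 1.0], 1: [0.0, 1.0], 2: [0.5, 0.5]}
values = {0: [0.9, 1.0], 1: [0.8, 0.2], 2: [0.2, 0.4]}

scores = value_loss_scores(returns, values)
probs = rank_prioritised_probs(scores, temperature=0.5)
next_level = sample_level(probs, random.Random(0))
```

In this toy data, level 1 has the largest value error and so receives the highest sampling probability. A uniform sampler would instead assign each level probability 1/3.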
Related papers
- How The Level Sampling Process Impacts Zero-shot Generalisation In Deep Reinforcement Learning (2023) 0.00
- Emergent Complexity And Zero-shot Transfer Via Unsupervised Environment Design (2020) 0.00
- Replay-guided Adversarial Environment Design (2021) 0.00
- Inferring Behavior-specific Context Improves Zero-shot Generalization In Reinforcement Learning (2024) 0.95
- DARLA: Improving Zero-shot Transfer In Reinforcement Learning (2017) 0.00
- On Zero-shot Reinforcement Learning (2025) 0.00
- Zero-shot Generalization Of Vision-based RL Without Data Augmentation (2024) 0.00
- Discovering General Reinforcement Learning Algorithms With Adversarial Environment Design (2023) 0.00