How The Level Sampling Process Impacts Zero-shot Generalisation In Deep Reinforcement Learning
2023 · Samuel Garcin, James Doran, Shangmin Guo, et al.
Abstract
A key limitation preventing the wider adoption of autonomous agents trained via deep reinforcement learning (RL) is their limited ability to generalise to new environments, even when these share similar characteristics with environments encountered during training. In this work, we investigate how a non-uniform sampling strategy of individual environment instances, or levels, affects the zero-shot generalisation (ZSG) ability of RL agents, considering two failure modes: overfitting and over-generalisation. As a first step, we measure the mutual information (MI) between the agent's internal representation and the set of training levels, which we find to be well-correlated to instance overfitting. In contrast to uniform sampling, adaptive sampling strategies prioritising levels based on their value loss are more effective at maintaining lower MI, which provides a novel theoretical justification for this class of techniques. We then turn our attention to unsupervised environment design (U
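As a rough illustration of what "prioritising levels based on their value loss" can look like, the sketch below turns per-level scores into sampling probabilities using a rank-based weighting. The function names, the `temperature` parameter, and the example numbers are all hypothetical, and the rank-based rule is only one possible prioritisation scheme, not necessarily the one the paper evaluates.

```python
import random


def rank_prioritised_probs(scores, temperature=0.3):
    """Map per-level scores (e.g. mean absolute value loss) to sampling probabilities.

    Assumption: rank-based weighting, where the highest-scoring level gets
    rank 1 and weight (1 / rank) ** (1 / temperature).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Indices of levels sorted from highest score to lowest.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    weights = [0.0] * len(scores)
    for rank, idx in enumerate(order, start=1):
        weights[idx] = (1.0 / rank) ** (1.0 / temperature)
    total = sum(weights)
    return [w / total for w in weights]


def sample_level(scores, rng, temperature=0.3):
    """Draw one level index according to the prioritised distribution."""
    probs = rank_prioritised_probs(scores, temperature)
    return rng.choices(range(len(scores)), weights=probs, k=1)[0]


if __name__ == "__main__":
    # Made-up value-loss estimates for four training levels.
    value_loss = [0.05, 0.90, 0.20, 0.40]
    rng = random.Random(0)
    print(rank_prioritised_probs(value_loss))
    print([sample_level(value_loss, rng) for _ in range(10)])
```

Under this scheme, a lower `temperature` concentrates sampling on the levels with the largest value loss, while a large `temperature` approaches uniform sampling.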
Related papers
- DRED: Zero-shot Transfer In Reinforcement Learning Via Data-regularised Environment Design (2024) · 1.81
- Illuminating Generalization In Deep Reinforcement Learning Through Procedural Level Generation (2018) · 0.00
- On Zero-shot Reinforcement Learning (2025) · 0.00
- Emergent Complexity And Zero-shot Transfer Via Unsupervised Environment Design (2020) · 0.00
- Inferring Behavior-specific Context Improves Zero-shot Generalization In Reinforcement Learning (2024) · 0.95
- Prioritized Level Replay (2020) · 0.00
- Good Actions Succeed, Bad Actions Generalize: A Case Study On Why RL Generalizes Better (2025) · 0.00
- A Unified Framework For Zero-shot Reinforcement Learning (2025) · 0.00