Procedural Generalization By Planning With Self-supervised World Models
2021 · Ankesh Anand, Jacob Walker, Yazhe Li, et al.
Abstract
One of the key promises of model-based reinforcement learning is the ability to generalize using an internal model of the world to make predictions in novel environments and tasks. However, the generalization ability of model-based agents is not well understood because existing work has focused on model-free agents when benchmarking generalization. Here, we explicitly measure the generalization ability of model-based agents in comparison to their model-free counterparts. We focus our analysis on MuZero (Schrittwieser et al., 2020), a powerful model-based agent, and evaluate its performance on both procedural and task generalization. We identify three factors of procedural generalization -- planning, self-supervised representation learning, and procedural data diversity -- and show that by combining these techniques, we achieve state-of-the-art generalization performance and data efficiency on Procgen (Cobbe et al., 2019). However, we find that these factors do not always provide the sa
Related papers
- Illuminating Generalization In Deep Reinforcement Learning Through Procedural Level Generation (2018)
- Good Actions Succeed, Bad Actions Generalize: A Case Study On Why RL Generalizes Better (2025)
- Leveraging Procedural Generation To Benchmark Reinforcement Learning (2019)
- Equivariant Muzero (2023)
- Assessing Generalization In Deep Reinforcement Learning (2018)
- Measuring And Characterizing Generalization In Deep Reinforcement Learning (2018)
- Language-conditioned World Model Improves Policy Generalization By Reading Environmental Descriptions (2025)
- Improving Generalization On The Procgen Benchmark With Simple Architectural Changes And Scale (2024)