PC-MLP: Model-based Reinforcement Learning With Policy Cover Guided Exploration
2021 Β· Yuda Song, Wen Sun
Abstract
Model-based Reinforcement Learning (RL) is a popular learning paradigm due to its potential sample efficiency compared to model-free RL. However, existing empirical model-based RL approaches lack the ability to explore. This work studies a computationally and statistically efficient model-based algorithm for both Kernelized Nonlinear Regulators (KNR) and linear Markov Decision Processes (MDPs). For both models, our algorithm guarantees polynomial sample complexity and only uses access to a planning oracle. Experimentally, we first demonstrate the flexibility and efficacy of our algorithm on a set of exploration challenging control tasks where existing empirical model-based RL approaches completely fail. We then show that our approach retains excellent performance even in common dense reward control benchmarks that do not require heavy exploration. Finally, we demonstrate that our method can also perform reward-free exploration efficiently. Our code can be found at https://github.com/yu
Authors
(none)
Tags
Stats
Related papers
- PC-PG: Policy Cover Directed Exploration For Provable Policy Gradient Learning (2020)0.00
- Off-policy Reinforcement Learning With Model-based Exploration Augmentation (2025)0.00
- Learning Off-policy With Model-based Intrinsic Motivation For Active Online Exploration (2024)0.00
- On Reward-free RL With Kernel And Neural Function Approximations: Single-agent MDP And Markov Game (2021)0.00
- Improving Policy Gradient By Exploring Under-appreciated Rewards (2016)0.00
- Efficient Model-based Reinforcement Learning Through Optimistic Policy Search And Planning (2020)0.00
- Model-based Safe Deep Reinforcement Learning Via A Constrained Proximal Policy Optimization Algorithm (2022)5.24
- Simplifying Model-based RL: Learning Representations, Latent-space Models, And Policies With One Objective (2022)0.00