Sample Efficient Reinforcement Learning Via Model-ensemble Exploration And Exploitation
2021 · Yao Yao, Li Xiao, Zhicheng An, et al.
Abstract
Model-based deep reinforcement learning has achieved success in various domains that require high sample efficiency, such as Go and robotics. However, some issues remain, such as planning efficient exploration to learn more accurate dynamics models, evaluating the uncertainty of the learned models, and making more rational use of the models. To mitigate these issues, we present MEEE, a model-ensemble method that consists of optimistic exploration and weighted exploitation. During exploration, unlike prior methods that directly select the action maximizing the expected cumulative return, our agent first generates a set of action candidates and then selects the action that takes both expected return and future observation novelty into account. During exploitation, different discounted weights are assigned to imagined transition tuples according to their model uncertainty, which prevents model prediction error from propagating into agent training.
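The two mechanisms named in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the novelty measure, the weighting formula, or any function names, so everything below is an assumption made for illustration: novelty and uncertainty are approximated by the variance of the ensemble's next-observation predictions, the exploration score is a linear combination controlled by a hypothetical `novelty_coef`, and the weights use a hypothetical exponential decay with a `temperature` parameter. It is not the authors' implementation.

```python
import numpy as np


def ensemble_disagreement(predictions):
    """Variance across ensemble members, averaged over observation dimensions.

    predictions: array of shape (n_models, n_items, obs_dim).
    Returns an array of shape (n_items,).
    ASSUMPTION: disagreement stands in for both novelty and model uncertainty.
    """
    predictions = np.asarray(predictions, dtype=float)
    return predictions.var(axis=0).mean(axis=-1)


def select_action(candidates, expected_returns, next_obs_predictions, novelty_coef=1.0):
    """Pick the candidate with the best return-plus-novelty score.

    ASSUMPTION: the score is linear, return + novelty_coef * novelty.
    """
    novelty = ensemble_disagreement(next_obs_predictions)
    scores = np.asarray(expected_returns, dtype=float) + novelty_coef * novelty
    best = int(np.argmax(scores))
    return candidates[best], best


def uncertainty_weights(uncertainty, temperature=1.0):
    """Map per-transition uncertainty to weights in (0, 1].

    ASSUMPTION: exponential decay; higher uncertainty gives a smaller weight.
    """
    return np.exp(-np.asarray(uncertainty, dtype=float) / temperature)


def weighted_squared_error(errors, weights):
    """Weighted mean of squared errors over imagined transitions."""
    errors = np.asarray(errors, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return float(np.sum(weights * errors ** 2) / np.sum(weights))


if __name__ == "__main__":
    # Three candidate actions, two ensemble members, 2-D observations.
    candidates = np.array([[0.0], [1.0], [2.0]])
    expected_returns = np.array([1.0, 1.2, 0.9])
    next_obs = np.array([
        [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]],  # member 0
        [[0.0, 0.0], [0.0, 0.0], [2.0, 2.0]],  # member 1 disagrees on candidate 2
    ])
    action, index = select_action(candidates, expected_returns, next_obs)
    print("chosen candidate index:", index)

    weights = uncertainty_weights(ensemble_disagreement(next_obs))
    print("transition weights:", weights)
```

With `novelty_coef=0` the selection reduces to the greedy choice on expected return alone, which is the behavior the abstract contrasts against. The weighted error shows how imagined transitions the ensemble disagrees on contribute less to the training signal.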
Related papers
- Sample-efficient Reinforcement Learning With Maximum Entropy Mellowmax Episodic Control (2019)
- SEERL: Sample Efficient Ensemble Reinforcement Learning (2020)
- Sample-efficient Reinforcement Learning With Stochastic Ensemble Value Expansion (2018)
- Model-based Active Exploration (2018)
- Decoupled Exploration And Exploitation Policies For Sample-efficient Reinforcement Learning (2021)
- Strategically Efficient Exploration In Competitive Multi-agent Reinforcement Learning (2021)
- Off-policy Reinforcement Learning With Model-based Exploration Augmentation (2025)
- Learning Off-policy With Model-based Intrinsic Motivation For Active Online Exploration (2024)