Deep Model-based Reinforcement Learning Via Estimated Uncertainty And Conservative Policy Optimization
2019 · Qi Zhou, Houqiang Li, Jie Wang
Abstract
Model-based reinforcement learning algorithms tend to achieve higher sample efficiency than model-free methods. However, due to the inevitable errors of learned models, model-based methods struggle to achieve the same asymptotic performance as model-free methods. In this paper, we propose a Policy Optimization method with Model-Based Uncertainty (POMBU), a novel model-based approach that can effectively improve the asymptotic performance using the uncertainty in Q-values. We derive an upper bound on the uncertainty, based on which we can approximate the uncertainty accurately and efficiently for model-based methods. We further propose an uncertainty-aware policy optimization algorithm that optimizes the policy conservatively to encourage performance improvement with high probability. This can significantly alleviate the overfitting of the policy to inaccurate models. Experiments show POMBU can outperform existing state-of-the-art policy optimization algorithms in terms of sample eff
Related papers
- When To Trust Your Model: Model-based Policy Optimization (2019)
- Bayesian Policy Optimization For Model Uncertainty (2018)
- Uncertainty-aware Policy Optimization: A Robust, Adaptive Trust Region Approach (2020)
- Policy Optimization With Model-based Explorations (2018)
- Efficient Model-based Reinforcement Learning Through Optimistic Policy Search And Planning (2020)
- Plan To Predict: Learning An Uncertainty-foreseeing Model For Model-based Reinforcement Learning (2023)
- Conservative Dual Policy Optimization For Efficient Model-based Reinforcement Learning (2022)
- Deterministic Uncertainty Propagation For Improved Model-based Offline Reinforcement Learning (2024)