Deep Reinforcement Learning In A Handful Of Trials Using Probabilistic Dynamics Models
2018 · Kurtland Chua, Roberto Calandra, Rowan McAllister, et al.
Abstract
Model-based reinforcement learning (RL) algorithms can attain excellent sample efficiency, but often lag behind the best model-free algorithms in terms of asymptotic performance. This is especially true with high-capacity parametric function approximators, such as deep networks. In this paper, we study how to bridge this gap, by employing uncertainty-aware dynamics models. We propose a new algorithm called probabilistic ensembles with trajectory sampling (PETS) that combines uncertainty-aware deep network dynamics models with sampling-based uncertainty propagation. Our comparison to state-of-the-art model-based and model-free deep RL algorithms shows that our approach matches the asymptotic performance of model-free algorithms on several challenging benchmark tasks, while requiring significantly fewer samples (e.g., 8 and 125 times fewer samples than Soft Actor Critic and Proximal Policy Optimization respectively on the half-cheetah task).
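The idea of pairing an ensemble of probabilistic dynamics models with sampling-based uncertainty propagation can be illustrated with a small sketch. It is not the authors' implementation: the linear-Gaussian ensemble members stand in for deep networks, and every name and constant here (`ensemble`, `predict`, `propagate`, the particle-to-member assignment rule, the toy reward) is a hypothetical choice made only for illustration.

```python
import numpy as np

STATE_DIM, ACTION_DIM = 2, 1
N_MODELS, N_PARTICLES, HORIZON = 5, 20, 10

init_rng = np.random.default_rng(0)

# Assumption: each ensemble member is a toy linear-Gaussian model with its own
# perturbed weights, standing in for a separately trained probabilistic network.
ensemble = [
    {
        "A": np.eye(STATE_DIM) + 0.01 * init_rng.standard_normal((STATE_DIM, STATE_DIM)),
        "B": 0.1 + 0.01 * init_rng.standard_normal((STATE_DIM, ACTION_DIM)),
        "log_std": np.full(STATE_DIM, -3.0),
    }
    for _ in range(N_MODELS)
]


def predict(member, state, action):
    """Return the mean and std of one member's Gaussian over the next state."""
    mean = member["A"] @ state + member["B"] @ action
    std = np.exp(member["log_std"])
    return mean, std


def propagate(state0, actions, rng):
    """Roll particles forward; returns an array (N_PARTICLES, len(actions)+1, STATE_DIM)."""
    trajs = np.empty((N_PARTICLES, len(actions) + 1, STATE_DIM))
    trajs[:, 0] = state0
    # Assumption: each particle stays with one randomly chosen member for the
    # whole rollout, so disagreement between members shows up across particles.
    assignment = rng.integers(N_MODELS, size=N_PARTICLES)
    for p in range(N_PARTICLES):
        member = ensemble[assignment[p]]
        for t, action in enumerate(actions):
            mean, std = predict(member, trajs[p, t], action)
            # Sampling from the predicted Gaussian carries per-step noise forward.
            trajs[p, t + 1] = mean + std * rng.standard_normal(STATE_DIM)
    return trajs


def expected_return(trajs, reward_fn):
    """Average the summed reward over particles to score an action sequence."""
    rewards = np.array([[reward_fn(s) for s in traj[1:]] for traj in trajs])
    return rewards.sum(axis=1).mean()


actions = np.zeros((HORIZON, ACTION_DIM))
trajs = propagate(np.array([1.0, -1.0]), actions, np.random.default_rng(1))
# Toy reward (hypothetical): stay close to the origin.
score = expected_return(trajs, lambda s: -float(s @ s))
```

The spread of the final states across particles reflects both the noise predicted by each member and the disagreement between members. A planner could compare candidate action sequences by their `expected_return` under this propagated uncertainty.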
Related papers
- Is Model Ensemble Necessary? Model-based RL Via A Single Model With Lipschitz Regularized Value Function (2023)
- Live In The Moment: Learning Dynamics Model Adapted To Evolving Policy (2022)
- Deep Gaussian Covariance Network With Trajectory Sampling For Data-efficient Policy Search (2024)
- Model-based Offline Reinforcement Learning With Pessimism-modulated Dynamics Belief (2022)
- Robust Adversarial Policy Optimization Under Dynamics Uncertainty (2026)
- Deep Model-based Reinforcement Learning Via Estimated Uncertainty And Conservative Policy Optimization (2019)
- Planning With Exploration: Addressing Dynamics Bottleneck In Model-based Reinforcement Learning (2020)
- A Model-based Approach For Sample-efficient Multi-task Reinforcement Learning (2019)