Algorithmic Framework For Model-based Deep Reinforcement Learning With Theoretical Guarantees
2018 · Yuping Luo, Huazhe Xu, Yuanzhi Li, et al.
Abstract
Model-based reinforcement learning (RL) is considered to be a promising approach to reduce the sample complexity that hinders model-free RL. However, the theoretical understanding of such methods has been rather limited. This paper introduces a novel algorithmic framework for designing and analyzing model-based RL algorithms with theoretical guarantees. We design a meta-algorithm with a theoretical guarantee of monotone improvement to a local maximum of the expected reward. The meta-algorithm iteratively builds a lower bound of the expected reward based on the estimated dynamical model and sample trajectories, and then maximizes the lower bound jointly over the policy and the model. The framework extends the optimism-in-face-of-uncertainty principle to non-linear dynamical models in a way that requires no explicit uncertainty quantification. Instantiating our framework with simplification gives a variant of model-based RL algorithms Stochastic Lower Bounds Optimization (SLBO
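The monotone-improvement claim in the abstract can be illustrated with a short derivation. This is a sketch under assumed notation, none of which appears in the abstract itself: V^{π,M} denotes the expected reward of policy π under dynamical model M, M* the true dynamics, π_k the current policy, and D_{π_k}(M, π) a non-negative discrepancy term estimated from trajectories sampled with π_k. It also assumes that M* lies in the model class being searched. The precise conditions used by the authors may differ.

```latex
% Assumed conditions (hypothetical notation, not taken from the abstract):
\begin{align*}
\text{(lower bound)} \quad & V^{\pi, M^\star} \ge V^{\pi, M} - D_{\pi_k}(M, \pi)
    \quad \text{for all candidate } (\pi, M), \\
\text{(tightness)} \quad & D_{\pi_k}(M^\star, \pi_k) = 0, \\
\text{(update)} \quad & (\pi_{k+1}, M_{k+1}) \in \arg\max_{\pi, M}
    \; \bigl[ V^{\pi, M} - D_{\pi_k}(M, \pi) \bigr].
\end{align*}

% Chaining the three conditions gives monotone improvement of the true reward:
\begin{align*}
V^{\pi_{k+1}, M^\star}
  &\ge V^{\pi_{k+1}, M_{k+1}} - D_{\pi_k}(M_{k+1}, \pi_{k+1})
     && \text{(lower bound)} \\
  &\ge V^{\pi_k, M^\star} - D_{\pi_k}(M^\star, \pi_k)
     && \text{(the update maximizes over all pairs, including } (\pi_k, M^\star)\text{)} \\
  &= V^{\pi_k, M^\star}
     && \text{(tightness)}.
\end{align*}
```

Under these assumptions, the joint maximization over the policy and the model is what makes the second inequality hold: the maximizer does at least as well as the pair (π_k, M*). No separate estimate of model uncertainty enters the argument.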
Related papers
- When To Update Your Model: Constrained Model-based Reinforcement Learning (2022)
- Simplifying Model-based RL: Learning Representations, Latent-space Models, And Policies With One Objective (2022)
- OBLR-PO: A Theoretical Framework For Stable Reinforcement Learning (2025)
- Assured Learning-enabled Autonomy: A Metacognitive Reinforcement Learning Framework (2021)
- Conservative And Adaptive Penalty For Model-based Safe Reinforcement Learning (2021)
- Theoretically Guaranteed Policy Improvement Distilled From Model-based Planning (2023)
- Model-based Offline Reinforcement Learning With Pessimism-modulated Dynamics Belief (2022)
- Meta-model-based Meta-policy Optimization (2020)