Plan To Predict: Learning An Uncertainty-foreseeing Model For Model-based Reinforcement Learning
2023 · Zifan Wu, Chao Yu, Chen Chen, et al.
Abstract
In Model-based Reinforcement Learning (MBRL), model learning is critical since an inaccurate model can bias policy learning via generating misleading samples. However, learning an accurate model can be difficult since the policy is continually updated and the induced distribution over visited states used for model learning shifts accordingly. Prior methods alleviate this issue by quantifying the uncertainty of model-generated samples. However, these methods only quantify the uncertainty passively after the samples were generated, rather than foreseeing the uncertainty before model trajectories fall into those highly uncertain regions. The resulting low-quality samples can induce unstable learning targets and hinder the optimization of the policy. Moreover, while being learned to minimize one-step prediction errors, the model is generally used to predict for multiple steps, leading to a mismatch between the objectives of model learning and model usage. To this end, we propose *Plan To P
Related papers
- Acting Upon Imagination: When To Trust Imagined Trajectories In Model Based Reinforcement Learning (2021)
- Deep Model-based Reinforcement Learning Via Estimated Uncertainty And Conservative Policy Optimization (2019)
- How To Fine-tune The Model: Unified Model Shift And Model Bias Policy Optimization (2023)
- Smart Exploration In Reinforcement Learning Using Bounded Uncertainty Models (2025)
- Self-correcting Models For Model-based Reinforcement Learning (2016)
- Efficient Model-based Reinforcement Learning Through Optimistic Policy Search And Planning (2020)
- Policy-aware Model Learning For Policy Gradient Methods (2020)
- Online Robust Reinforcement Learning With Model Uncertainty (2021)