Learning Dynamics Model In Reinforcement Learning By Incorporating The Long Term Future
2019 · Nan Rosemary Ke, Amanpreet Singh, Ahmed Touati, et al.
Abstract
In model-based reinforcement learning, the agent interleaves between model learning and planning. These two components are inextricably intertwined. If the model is not able to provide sensible long-term prediction, the executed planner would exploit model flaws, which can yield catastrophic failures. This paper focuses on building a model that reasons about the long-term future and demonstrates how to use this for efficient planning and exploration. To this end, we build a latent-variable autoregressive model by leveraging recent ideas in variational inference. We argue that forcing latent variables to carry future information through an auxiliary task substantially improves long-term predictions. Moreover, by planning in the latent space, the planner's solution is ensured to be within regions where the model is valid. An exploration strategy can be devised by searching for unlikely trajectories under the model. Our method achieves higher reward faster compared to baselines on a varie
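The exploration idea mentioned in the abstract, searching for trajectories that are unlikely under the learned model, can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a hypothetical deterministic mean predictor `predict_mean(state, action)` with fixed isotropic Gaussian noise `sigma` in place of the latent-variable autoregressive model, and every function name here is made up for illustration.

```python
import numpy as np


def trajectory_log_likelihood(states, actions, predict_mean, sigma=1.0):
    """Sum of Gaussian log-densities of each observed next state under the model.

    states:  array of shape (T + 1, d)
    actions: array of shape (T, d_a)
    predict_mean: hypothetical stand-in for a learned dynamics model
    """
    total = 0.0
    for s, a, s_next in zip(states[:-1], actions, states[1:]):
        diff = s_next - predict_mean(s, a)
        d = diff.size
        total += -0.5 * (diff @ diff) / sigma**2
        total += -0.5 * d * np.log(2.0 * np.pi * sigma**2)
    return total


def pick_exploration_trajectory(candidates, predict_mean, sigma=1.0):
    """Return the index of the least likely candidate, plus all scores."""
    scores = [
        trajectory_log_likelihood(s, a, predict_mean, sigma)
        for s, a in candidates
    ]
    return int(np.argmin(scores)), scores


# Toy model (assumption): the next state is the current state plus the action.
toy_model = lambda s, a: s + a

actions = np.ones((3, 2))
on_model = np.cumsum(np.vstack([np.zeros((1, 2)), actions]), axis=0)
off_model = on_model.copy()
off_model[2] += 5.0  # one state the toy model does not expect

idx, scores = pick_exploration_trajectory(
    [(on_model, actions), (off_model, actions)], toy_model
)
```

Here `idx` selects the perturbed trajectory, because its transitions into and out of the shifted state have low likelihood under the toy model. A real agent would score candidates with its learned model rather than a hand-written one.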
Related papers
- Predicting Future Actions Of Reinforcement Learning Agents (2024)
- Plan To Predict: Learning An Uncertainty-foreseeing Model For Model-based Reinforcement Learning (2023)
- Latent Variable Representation For Reinforcement Learning (2022)
- Continual Visual Reinforcement Learning With A Life-long World Model (2023)
- Learning To Combat Compounding-error In Model-based Reinforcement Learning (2019)
- Learning When To Act: Interval-aware Reinforcement Learning With Predictive Temporal Structure (2026)
- Learning Off-policy With Model-based Intrinsic Motivation For Active Online Exploration (2024)
- Towards A Simple Approach To Multi-step Model-based Reinforcement Learning (2018)