Learning To Predict Without Looking Ahead: World Models Without Forward Prediction
2019 · C. Daniel Freeman, Luke Metz, David Ha
Abstract
Much of model-based reinforcement learning involves learning a model of an agent's world, and training an agent to leverage this model to perform a task more efficiently. While these models are demonstrably useful for agents, every naturally occurring model of the world of which we are aware (e.g., a brain) arose as the byproduct of competing evolutionary pressures for survival, not minimization of a supervised forward-predictive loss via gradient descent. That useful models can arise out of the messy and slow optimization process of evolution suggests that forward-predictive modeling can arise as a side-effect of optimization under the right circumstances. Crucially, this optimization process need not explicitly minimize a forward-predictive loss. In this work, we introduce a modification to traditional reinforcement learning which we call observational dropout, whereby we limit the agent's ability to observe the real environment at each timestep. In doing so, we can coerce an agent into l
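The abstract describes observational dropout only in words, so a small illustration of the idea may help. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name, the `peek_prob` parameter, and the callables `env_step`, `world_model`, and `policy` are all hypothetical. It assumes that at each timestep the agent sees the real observation with probability `peek_prob` and otherwise sees whatever its own internal model produces from the last thing it saw.

```python
import random


def rollout_with_observational_dropout(env_step, world_model, policy,
                                       obs0, peek_prob, steps, rng):
    """Run one rollout in which real observations are only sometimes visible.

    All names here are illustrative assumptions, not taken from the paper.
    env_step(state, action)  -> next real observation
    world_model(seen, action) -> the agent's internally generated observation
    policy(seen)             -> action chosen from what the agent currently sees
    """
    seen = obs0   # what the agent believes it is observing
    real = obs0   # what the environment actually produced
    peeks = 0     # number of timesteps on which the real observation was shown
    for _ in range(steps):
        action = policy(seen)
        real = env_step(real, action)
        if rng.random() < peek_prob:
            # The agent is allowed to look at the real environment.
            seen = real
            peeks += 1
        else:
            # The observation is withheld; the internal model fills the gap.
            seen = world_model(seen, action)
    return seen, real, peeks
```

In this sketch the only training signal would be task reward; nothing compares `world_model` output against `real`. With `peek_prob=1.0` the loop reduces to an ordinary rollout, and with a low `peek_prob` the policy can only act well if the internal model stays useful between peeks.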
Related papers
- Forward-backward Reinforcement Learning (2018)
- Partial Models For Building Adaptive Model-based Reinforcement Learning Agents (2024)
- Plan To Predict: Learning An Uncertainty-foreseeing Model For Model-based Reinforcement Learning (2023)
- Discovering Latent States For Model Learning: Applying Sensorimotor Contingencies Theory And Predictive Processing To Model Context (2016)
- The Effectiveness Of World Models For Continual Reinforcement Learning (2022)
- Continual Learning Using World Models For Pseudo-rehearsal (2019)
- Recurrent World Models Facilitate Policy Evolution (2018)
- Reset-free Reinforcement Learning With World Models (2024)