Recurrent World Models Facilitate Policy Evolution
2018 · David Ha, Jürgen Schmidhuber
Abstract
A generative recurrent neural network is quickly trained in an unsupervised manner to model popular reinforcement learning environments through compressed spatio-temporal representations. The world model's extracted features are fed into compact and simple policies trained by evolution, achieving state-of-the-art results in various environments. We also train our agent entirely inside of an environment generated by its own internal world model, and transfer this policy back into the actual environment. An interactive version of the paper is available at https://worldmodels.github.io
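The idea of feeding world-model features into a compact policy trained by evolution can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration: the abstract does not name the evolutionary algorithm or the policy form, so the sketch uses a linear policy and a simple elitist evolution strategy as stand-ins, and `N_FEATURES`, `policy_action`, `toy_fitness`, and `evolve` are hypothetical names. Random vectors take the place of the features a trained world model would extract, and a synthetic score replaces a real environment return.

```python
import math
import random

N_FEATURES = 4  # hypothetical size of the world-model feature vector


def policy_action(params, features):
    """Compact linear policy: tanh(w . features + b), one action in [-1, 1]."""
    *weights, bias = params
    return math.tanh(sum(w * f for w, f in zip(weights, features)) + bias)


def toy_fitness(params, steps=50, seed=123):
    """Hypothetical stand-in for the return of one environment rollout.

    Random vectors play the role of world-model features; the score is the
    negative squared error against a hidden target action.
    """
    rng = random.Random(seed)  # fixed seed: every candidate sees the same episode
    total = 0.0
    for _ in range(steps):
        features = [rng.uniform(-1.0, 1.0) for _ in range(N_FEATURES)]
        target = math.tanh(features[0] - 0.5 * features[1])
        total -= (policy_action(params, features) - target) ** 2
    return total


def evolve(generations=30, pop_size=32, n_elite=8, sigma=0.3, seed=0):
    """Simple elitist evolution strategy over the policy parameters."""
    rng = random.Random(seed)
    mean = [0.0] * (N_FEATURES + 1)
    best_params, best_fit = list(mean), toy_fitness(mean)
    for _ in range(generations):
        # Sample candidates around the current mean with Gaussian noise.
        population = [[m + rng.gauss(0.0, sigma) for m in mean]
                      for _ in range(pop_size)]
        scored = sorted(((toy_fitness(p), p) for p in population),
                        key=lambda fp: fp[0], reverse=True)
        elites = [p for _, p in scored[:n_elite]]
        # Move the mean to the average of the best candidates.
        mean = [sum(vals) / n_elite for vals in zip(*elites)]
        # Remember the best candidate seen so far.
        if scored[0][0] > best_fit:
            best_fit, best_params = scored[0]
    return best_params, best_fit


if __name__ == "__main__":
    params, fit = evolve()
    print("baseline fitness:", toy_fitness([0.0] * (N_FEATURES + 1)))
    print("evolved fitness: ", fit)
```

Because the policy has only `N_FEATURES + 1` parameters, a gradient-free search over them stays cheap, which is the practical appeal of pairing a large learned model with a small evolved controller.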
Related papers
- Do Transformer World Models Give Better Policy Gradients? (2024)
- World Models Via Policy-guided Trajectory Diffusion (2023)
- Smaller World Models For Reinforcement Learning (2020)
- The Effectiveness Of World Models For Continual Reinforcement Learning (2022)
- PWM: Policy Learning With Multi-task World Models (2024)
- Object-centric World Models For Causality-aware Reinforcement Learning (2025)
- STORM: Efficient Stochastic Transformer Based World Models For Reinforcement Learning (2023)
- Reinforcement Learning With World Model (2019)