Hieros: Hierarchical Imagination On Structured State Space Sequence World Models
2023 · Paul Mattes, Rainer Schlosser, Ralf Herbrich
Abstract
One of the biggest challenges for modern deep reinforcement learning (DRL) algorithms is sample efficiency. Many approaches learn a world model in order to train an agent entirely in imagination, eliminating the need for direct environment interaction during training. However, these methods often lack imagination accuracy, exploration capabilities, or runtime efficiency. We propose Hieros, a hierarchical policy that learns time-abstracted world representations and imagines trajectories at multiple time scales in latent space. Hieros uses an S5 layer-based world model, which predicts next world states in parallel during training and iteratively during environment interaction. Due to the special properties of S5 layers, our method can train in parallel and predict next world states iteratively during imagination. This allows for more efficient training than RNN-based world models and more efficient imagination than Transformer-based world models. We show that our
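The abstract's claim that an S5-based model can train in parallel yet predict iteratively rests on the fact that a linear state-space recurrence can be evaluated either step by step or with an associative scan. The following is a minimal sketch of that equivalence, not the authors' implementation: it assumes a scalar recurrence x_k = a * x_{k-1} + b * u_k with x_0 = 0, whereas S5 layers use a multi-dimensional diagonalized state matrix. All function names here are hypothetical.

```python
from typing import List, Tuple

# (a, b) stands for the affine map x -> a * x + b (toy scalar case, an assumption).
Elem = Tuple[float, float]


def combine(left: Elem, right: Elem) -> Elem:
    """Compose two affine maps: apply `left` first, then `right`."""
    a1, b1 = left
    a2, b2 = right
    return (a1 * a2, a2 * b1 + b2)


def sequential_states(a: float, b: float, inputs: List[float]) -> List[float]:
    """Iterative mode: one state update per input, starting from x_0 = 0."""
    x, states = 0.0, []
    for u in inputs:
        x = a * x + b * u
        states.append(x)
    return states


def scan_states(a: float, b: float, inputs: List[float]) -> List[float]:
    """Parallel mode: the same states via an inclusive Hillis-Steele scan."""
    elems = [(a, b * u) for u in inputs]
    shift = 1
    while shift < len(elems):
        # Positions within a round do not depend on each other, so parallel
        # hardware could compute a whole round at once; here it is a plain loop.
        elems = [
            elems[i] if i < shift else combine(elems[i - shift], elems[i])
            for i in range(len(elems))
        ]
        shift *= 2
    # With x_0 = 0, the offset of each composed map is the state itself.
    return [offset for _, offset in elems]


if __name__ == "__main__":
    us = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(sequential_states(0.5, 1.0, us))
    print(scan_states(0.5, 1.0, us))
```

The scan needs only about log2(T) rounds for a sequence of length T, which is what makes parallel training over whole sequences attractive, while the sequential form needs only the previous state, which suits step-by-step imagination.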