Hierarchy Through Composition With Linearly Solvable Markov Decision Processes
2016 · Andrew M. Saxe, Adam Earle, Benjamin Rosman
Abstract
Hierarchical architectures are critical to the scalability of reinforcement learning methods. Current hierarchical frameworks execute actions serially, with macro-actions comprising sequences of primitive actions. We propose a novel alternative to these control hierarchies based on concurrent execution of many actions in parallel. Our scheme uses the concurrent compositionality provided by the linearly solvable Markov decision process (LMDP) framework, which naturally enables a learning agent to draw on several macro-actions simultaneously to solve new tasks. We introduce the Multitask LMDP module, which maintains a parallel distributed representation of tasks and may be stacked to form deep hierarchies abstracted in space and time.
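The concurrent compositionality mentioned in the abstract can be illustrated with a small sketch. It assumes the standard cost-based first-exit LMDP formulation with unit temperature, in which the desirability function z = exp(-v) satisfies a linear equation, so a task whose boundary desirability is a weighted blend of known tasks is solved by the same blend of their solutions. The 7-state chain, the cost of 0.1, the weights, and all function and variable names are hypothetical choices for illustration, not taken from the paper, whose conventions may differ.

```python
import numpy as np

def solve_first_exit_lmdp(P, q_interior, z_boundary, interior, boundary):
    """Solve z_I = G (P_II z_I + P_IB z_B) for the interior desirability.

    G = diag(exp(-q_I)); z_boundary may be a vector (one task)
    or a matrix whose columns are separate tasks.
    """
    G = np.diag(np.exp(-q_interior))
    P_ii = P[np.ix_(interior, interior)]
    P_ib = P[np.ix_(interior, boundary)]
    A = np.eye(len(interior)) - G @ P_ii
    return np.linalg.solve(A, G @ P_ib @ z_boundary)

# Hypothetical toy problem: a 7-state chain whose two end states are exits.
n = 7
interior = np.arange(1, n - 1)
boundary = np.array([0, n - 1])

# Passive dynamics: unbiased random walk on interior states, absorbing ends.
P = np.zeros((n, n))
for i in interior:
    P[i, i - 1] = P[i, i + 1] = 0.5
P[0, 0] = P[n - 1, n - 1] = 1.0

q_interior = np.full(len(interior), 0.1)  # assumed per-step state cost

# Two base tasks, one per column: "exit left" and "exit right".
# Boundary desirability 1 marks the goal exit, 0 a forbidden exit.
Z_B_tasks = np.eye(2)
Z_I_tasks = solve_first_exit_lmdp(P, q_interior, Z_B_tasks, interior, boundary)

# New task: a weighted blend of the base boundary desirabilities.
w = np.array([0.3, 0.7])
z_new_direct = solve_first_exit_lmdp(P, q_interior, Z_B_tasks @ w, interior, boundary)
z_new_composed = Z_I_tasks @ w  # no new solve: reuse the base solutions

# Controlled transitions: u*(x'|x) is proportional to p(x'|x) * z(x').
z_full = np.zeros(n)
z_full[interior] = z_new_composed
z_full[boundary] = Z_B_tasks @ w
policy = P[interior] * z_full
policy /= policy.sum(axis=1, keepdims=True)

print(np.allclose(z_new_direct, z_new_composed))
```

Because the blended solution comes from a single matrix-vector product over stored base solutions, many base tasks can contribute to a new task at once rather than being executed one after another.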
Related papers
- Globally Optimal Hierarchical Reinforcement Learning For Linearly-solvable Markov Decision Processes (2021)
- Deep Hierarchical Reinforcement Learning Algorithm In Partially Observable Markov Decision Processes (2018)
- Himac: Hierarchical Macro-micro Learning For Long-horizon LLM Agents (2026)
- Guiding Multi-agent Multi-task Reinforcement Learning By A Hierarchical Framework With Logical Reward Shaping (2024)
- Intrinsically Motivated Hierarchical Policy Learning In Multi-objective Markov Decision Processes (2023)
- Hierarchical Decision Making Based On Structural Information Principles (2024)
- Multi-horizon Representations With Hierarchical Forward Models For Reinforcement Learning (2022)
- Exploring The Limits Of Hierarchical World Models In Reinforcement Learning (2024)