A Provably Efficient Option-based Algorithm For Both High-level And Low-level Learning
2024 · Gianluca Drappo, Alberto Maria Metelli, Marcello Restelli
Abstract
Hierarchical Reinforcement Learning (HRL) approaches have shown successful results in solving a large variety of complex, structured, long-horizon problems. Nevertheless, a full theoretical understanding of this empirical evidence is currently missing. In the context of the *option* framework, prior research has devised efficient algorithms for scenarios where options are fixed and only the high-level policy selecting among options has to be learned. However, the fully realistic scenario in which both the high-level and the low-level policies are learned has been surprisingly disregarded from a theoretical perspective. This work takes a step towards the understanding of this latter scenario. Focusing on the finite-horizon problem, we present a meta-algorithm alternating between regret minimization algorithms instantiated at different (high and low) temporal abstractions. At the higher level, we treat the problem as a Semi-Markov Decision Process (SMDP), with fixed low-level policies, while at
Related papers
- Learning And Exploiting Multiple Subgoals For Fast Exploration In Hierarchical Reinforcement Learning (2019) 0.00
- Hierarchical Reinforcement Learning With Advantage-based Auxiliary Rewards (2019) 0.00
- Reusable Options Through Gradient-based Meta Learning (2022) 0.00
- Hierarchical Reinforcement Learning Via Advantage-weighted Information Maximization (2019) 0.00
- Autonomous Option Invention For Continual Hierarchical Reinforcement Learning And Planning (2024) 2.26
- Bidirectional-reachable Hierarchical Reinforcement Learning With Mutually Responsive Policies (2024) 0.00
- Enhancing Hierarchical Reinforcement Learning Through Change Point Detection In Time Series (2025) 0.00
- HTMRL: Biologically Plausible Reinforcement Learning With Hierarchical Temporal Memory (2020) 0.00