Bidirectional-reachable Hierarchical Reinforcement Learning With Mutually Responsive Policies
2024 · Yu Luo, Fuchun Sun, Tianying Ji, et al.
Abstract
Hierarchical reinforcement learning (HRL) addresses complex long-horizon tasks by skillfully decomposing them into subgoals. Therefore, the effectiveness of HRL is greatly influenced by subgoal reachability. Typical HRL methods only consider subgoal reachability from the unilateral level, where a dominant level enforces compliance to the subordinate level. However, we observe that when the dominant level becomes trapped in local exploration or generates unattainable subgoals, the subordinate level is negatively affected and cannot follow the dominant level's actions. This can potentially make both levels stuck in local optima, ultimately hindering subsequent subgoal reachability. Allowing real-time bilateral information sharing and error correction would be a natural cure for this issue, which motivates us to propose a mutual response mechanism. Based on this, we propose the Bidirectional-reachable Hierarchical Policy Optimization (BrHPO)--a simple yet effective algorithm that also enj
Related papers
- Hierarchical Reinforcement Learning With Advantage-based Auxiliary Rewards (2019)
- MENTOR: Guiding Hierarchical Reinforcement Learning With Human Feedback And Dynamic Distance Constraint (2024)
- Boosting Hierarchical Reinforcement Learning With Meta-learning For Complex Task Adaptation (2024)
- Subgoal-based Hierarchical Reinforcement Learning For Multi-agent Collaboration (2024)
- Guided Cooperation In Hierarchical Reinforcement Learning Via Model-based Rollout (2023)
- Generating Adjacency-constrained Subgoals In Hierarchical Reinforcement Learning (2020)
- Learning And Exploiting Multiple Subgoals For Fast Exploration In Hierarchical Reinforcement Learning (2019)
- Developing Cooperative Policies For Multi-stage Reinforcement Learning Tasks (2022)