MENTOR: Guiding Hierarchical Reinforcement Learning With Human Feedback And Dynamic Distance Constraint
2024 · Xinglin Zhou, Yifu Yuan, Shaofu Yang, et al.
Abstract
Hierarchical reinforcement learning (HRL) offers a promising approach for intelligent agents facing complex tasks with sparse rewards: it uses a hierarchical framework that divides a task into subgoals and completes them sequentially. However, current methods struggle to find suitable subgoals that ensure a stable learning process. Without additional guidance, it is impractical to rely solely on exploration or heuristic methods to determine subgoals in a large goal space. To address this issue, we propose a general hierarchical reinforcement learning framework incorporating human feedback and dynamic distance constraints (MENTOR). MENTOR acts as a "mentor", incorporating human feedback into high-level policy learning to find better subgoals. For the low-level policy, MENTOR designs a dual policy that decouples exploration from exploitation to stabilize training. Furthermore, although humans can simply break down tasks into subgoals to guide the right learning direction, su
Related papers
- Boosting Hierarchical Reinforcement Learning With Meta-learning For Complex Task Adaptation (2024) · 0.00
- Bidirectional-reachable Hierarchical Reinforcement Learning With Mutually Responsive Policies (2024) · 0.00
- Generating Adjacency-constrained Subgoals In Hierarchical Reinforcement Learning (2020) · 0.00
- Learning And Exploiting Multiple Subgoals For Fast Exploration In Hierarchical Reinforcement Learning (2019) · 0.00
- Hierarchical Reinforcement Learning With Advantage-based Auxiliary Rewards (2019) · 0.00
- Subgoal-based Hierarchical Reinforcement Learning For Multi-agent Collaboration (2024) · 0.00
- Learning Representations In Model-free Hierarchical Reinforcement Learning (2018) · 11.49
- Deep Reinforcement Learning From Hierarchical Preference Design (2023) · 2.00