Toward Robust Long Range Policy Transfer
2021 · Wei-Cheng Tseng, Jin-Siang Lin, Yao-Min Feng, et al.
Abstract
Humans can master a new task within a few trials by drawing upon skills acquired through prior experience. To mimic this capability, hierarchical models that combine primitive policies learned from prior tasks have been proposed. However, these methods fall short of the human range of transferability. We propose a method that leverages the hierarchical structure to alternately train the combination function and adapt the set of diverse primitive policies, efficiently producing a range of complex behaviors on challenging new tasks. We also design two regularization terms to improve the diversity and utilization rate of the primitives in the pre-training phase. We demonstrate that our method outperforms other recent policy transfer methods by combining and adapting these reusable primitives in tasks with continuous action spaces. The experimental results further show that our approach provides a broader transfer range. The ablation study also shows the regularization terms
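The alternating scheme described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the scalar linear "primitives", the softmax combination, the squared-error objective, and every name below (`alternate_train`, `gains`, `logits`, `target_gain`) are assumptions made for illustration, and the two regularization terms are omitted. Each round first updates the combination weights with the primitives frozen, then adapts the primitives with the combination frozen.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combined_gain(logits, gains):
    """Gain of the combined policy a(s) = sum_i w_i * k_i * s."""
    weights = softmax(logits)
    return sum(w * k for w, k in zip(weights, gains))


def loss(logits, gains, target_gain, states):
    """Mean squared error between combined and target actions."""
    c = combined_gain(logits, gains)
    return sum(((c - target_gain) * s) ** 2 for s in states) / len(states)


def alternate_train(logits, gains, target_gain, states,
                    rounds=50, steps=5, lr=0.1):
    """Toy alternating optimization (hypothetical, for illustration only)."""
    logits, gains = list(logits), list(gains)
    # Common factor of the gradient: 2 * mean(s^2).
    scale = 2.0 * sum(s * s for s in states) / len(states)
    for _ in range(rounds):
        # Phase A: primitives frozen, update the combination logits.
        for _ in range(steps):
            w = softmax(logits)
            c = sum(wi * ki for wi, ki in zip(w, gains))
            err = scale * (c - target_gain)
            logits = [l - lr * err * wi * (ki - c)
                      for l, wi, ki in zip(logits, w, gains)]
        # Phase B: combination frozen, adapt the primitive gains.
        w = softmax(logits)
        for _ in range(steps):
            c = sum(wi * ki for wi, ki in zip(w, gains))
            err = scale * (c - target_gain)
            gains = [ki - lr * err * wi for ki, wi in zip(gains, w)]
    return logits, gains


if __name__ == "__main__":
    states = [-1.0, -0.5, 0.5, 1.0]
    init_logits = [0.0, 0.0, 0.0]
    init_gains = [-1.0, 0.5, 2.0]  # pre-trained primitives (made-up values)
    target = 3.0                   # lies outside the range of the primitives
    print("loss before:", loss(init_logits, init_gains, target, states))
    logits, gains = alternate_train(init_logits, init_gains, target, states)
    print("loss after: ", loss(logits, gains, target, states))
```

In this toy setup the target gain lies outside the range spanned by the initial primitives, so reweighting alone cannot reach it and the adaptation phase is needed to close the gap.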
Related papers
- IOB: Integrating Optimization Transfer And Behavior Transfer For Multi-policy Reuse (2023)
- An Advantage Based Policy Transfer Algorithm For Reinforcement Learning With Measures Of Transferability (2023)
- Diversity For Contingency: Learning Diverse Behaviors For Efficient Adaptation And Transfer (2023)
- MULTIPOLAR: Multi-source Policy Aggregation For Transfer Reinforcement Learning Between Diverse Environmental Dynamics (2019)
- Post-convergence Sim-to-real Policy Transfer: A Principled Alternative To Cherry-picking (2025)
- On The Benefits Of Pixel-based Hierarchical Policies For Task Generalization (2024)
- Exploiting Hierarchy For Learning And Transfer In KL-regularized RL (2019)
- Open-ended Diverse Solution Discovery With Regulated Behavior Patterns For Cross-domain Adaptation (2022)