Hierarchical Prompt Decision Transformer: Improving Few-shot Policy Generalization With Global And Adaptive Guidance
2024 · Zhe Wang, Haozhu Wang, Yanjun Qi
Abstract
Decision transformers recast reinforcement learning as a conditional sequence generation problem, offering a simple but effective alternative to traditional value or policy-based methods. A recent key development in this area is the integration of prompting in decision transformers to facilitate few-shot policy generalization. However, current methods mainly use static prompt segments to guide rollouts, limiting their ability to provide context-specific guidance. Addressing this, we introduce a hierarchical prompting approach enabled by retrieval augmentation. Our method learns two layers of soft tokens as guiding prompts: (1) global tokens encapsulating task-level information about trajectories, and (2) adaptive tokens that deliver focused, timestep-specific instructions. The adaptive tokens are dynamically retrieved from a curated set of demonstration segments, ensuring context-aware guidance. Experiments across seven benchmark tasks in the MuJoCo and MetaWorld environments demonstra
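The two prompt layers described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the paper learns soft tokens, whereas this toy version uses fixed mean pooling for the global token and cosine-similarity nearest-neighbour retrieval for the adaptive token. All function names (`global_token`, `retrieve_adaptive_token`, `build_prompt`) and the parameter `k` are hypothetical and chosen only for illustration.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def cosine(a, b):
    """Cosine similarity, with a small epsilon to avoid division by zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-8)


def global_token(demo_segments):
    """Assumed stand-in for a learned global token: pool every step of
    every demonstration segment into one task-level summary vector."""
    return mean_vector([step for seg in demo_segments for step in seg])


def retrieve_adaptive_token(query, demo_segments, k=1):
    """Assumed stand-in for adaptive-token retrieval: summarize each
    segment by its mean, rank segments by cosine similarity to the
    current-timestep query, and average the top-k summaries."""
    keys = [mean_vector(seg) for seg in demo_segments]
    ranked = sorted(range(len(keys)),
                    key=lambda i: cosine(query, keys[i]),
                    reverse=True)
    top = ranked[:k]
    return mean_vector([keys[i] for i in top]), top


def build_prompt(query, demo_segments, k=1):
    """Return [global token, adaptive token] plus the retrieved indices."""
    adaptive, top = retrieve_adaptive_token(query, demo_segments, k)
    return [global_token(demo_segments), adaptive], top


if __name__ == "__main__":
    # Three toy demonstration segments, two steps each, 2-D embeddings.
    demos = [
        [[1.0, 0.0], [1.0, 0.0]],
        [[0.0, 1.0], [0.0, 1.0]],
        [[-1.0, 0.0], [-1.0, 0.0]],
    ]
    prompt, idx = build_prompt([0.9, 0.1], demos, k=1)
    print(prompt, idx)
```

In this sketch the global token is computed once per task, while the adaptive token would be recomputed at every timestep as the query changes, which is the sense in which the guidance is context-specific rather than static.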
Related papers
- P2DT: Mitigating Forgetting In Task-incremental Learning With Progressive Prompt Decision Transformer (2024)
- Q-learning Decision Transformer: Leveraging Dynamic Programming For Conditional Sequence Modelling In Offline RL (2022)
- DODT: Enhanced Online Decision Transformer Learning Through Dreamer's Actor-critic Trajectory Forecasting (2024)
- UPDeT: Universal Multi-agent Reinforcement Learning Via Policy Decoupling With Transformers (2021)
- The Quality-diversity Transformer: Generating Behavior-conditioned Trajectories With Decision Transformers (2023)
- Generalized Decision Transformer For Offline Hindsight Information Matching (2021)
- Decision Mamba: Reinforcement Learning Via Sequence Modeling With Selective State Spaces (2024)
- Waypoint Transformer: Reinforcement Learning Via Supervised Learning With Intermediate Targets (2023)