Solving Continual Offline Reinforcement Learning With Decision Transformer
2024 · Kaixin Huang, Li Shen, Chen Zhao, et al.
Abstract
Continual offline reinforcement learning (CORL) combines continual and offline reinforcement learning, enabling agents to learn multiple tasks from static datasets without forgetting prior tasks. However, CORL faces challenges in balancing stability and plasticity. Existing methods, which employ actor-critic (AC) structures and experience replay (ER), suffer from distribution shift, low efficiency, and weak knowledge sharing. We investigate whether Decision Transformer (DT), another offline RL paradigm, can serve as a more suitable offline continual learner to address these issues. We first compare AC-based offline algorithms with DT in the CORL framework. DT offers advantages in learning efficiency, distribution shift mitigation, and zero-shot generalization but exacerbates the forgetting problem during supervised parameter updates. We introduce multi-head DT (MH-DT) and low-rank adaptation DT (LoRA-DT) to mitigate DT's forgetting problem. MH-DT stores task-specific knowledge using
Related papers
- When Should We Prefer Decision Transformers For Offline Reinforcement Learning? (2023) 0.00
- Tsn-affinity: Similarity-driven Parameter Reuse For Continual Offline Reinforcement Learning (2026) 0.00
- OER: Offline Experience Replay For Continual Offline Reinforcement Learning (2023) 3.58
- Belief-based Offline Reinforcement Learning For Delay-robust Policy Optimization (2025) 0.00
- Offline Pre-trained Multi-agent Decision Transformer: One Big Sequence Model Tackles All SMAC Tasks (2021) 0.00
- HarmoDT: Harmony Multi-task Decision Transformer For Offline Reinforcement Learning (2024) 0.00
- Self-confirming Transformer For Belief-conditioned Adaptation In Offline Multi-agent Reinforcement Learning (2023) 0.00
- Q-value Regularized Decision ConvFormer For Offline Reinforcement Learning (2024) 0.00