Adversarially Robust Decision Transformer
2024 · Xiaohang Tang, Afonso Marques, Parameswaran Kamalaruban, et al.
Abstract
Decision Transformer (DT), one of the representative Reinforcement Learning via Supervised Learning (RvS) methods, has achieved strong performance in offline learning tasks by leveraging the powerful Transformer architecture for sequential decision-making. However, in adversarial environments, these methods can be non-robust, since the return depends on the strategies of both the decision-maker and the adversary. Training a probabilistic model conditioned on observed return to predict actions can fail to generalize, because trajectories that achieve a given return in the dataset might have done so because the behavior adversary was suboptimal. To address this, we propose a worst-case-aware RvS algorithm, the Adversarially Robust Decision Transformer (ARDT), which learns and conditions the policy on in-sample minimax returns-to-go. ARDT aligns the target return with the worst-case return learned through minimax expectile regression, thereby enhancing robustness against powerful test-time adversari
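The abstract names expectile regression as the tool for learning worst-case returns. The sketch below illustrates only the generic asymmetric squared loss behind expectile regression and shows, on a toy set of returns, that a small expectile level approaches the in-sample minimum while a large one approaches the in-sample maximum. It is not the authors' implementation: the function names, the toy returns, and the levels 0.01 and 0.99 are assumptions made for illustration, and how ARDT alternates the two directions is not specified in the text above.

```python
import numpy as np


def expectile_loss(pred, target, tau):
    """Asymmetric squared loss: errors where target > pred are weighted by tau,
    the others by (1 - tau). tau = 0.5 recovers half the mean squared error."""
    diff = np.asarray(target, dtype=float) - pred
    weight = np.where(diff > 0, tau, 1.0 - tau)
    return float(np.mean(weight * diff ** 2))


def expectile(samples, tau, iters=200):
    """Scalar minimizer of expectile_loss, found by iteratively reweighted averaging."""
    samples = np.asarray(samples, dtype=float)
    m = samples.mean()
    for _ in range(iters):
        w = np.where(samples > m, tau, 1.0 - tau)
        m = np.sum(w * samples) / np.sum(w)
    return float(m)


# Hypothetical returns observed after one decision-maker action,
# each produced by a different adversary response in the dataset.
returns = np.array([0.0, 1.0, 4.0, 10.0])

low = expectile(returns, 0.01)   # pessimistic: close to the in-sample minimum
mid = expectile(returns, 0.5)    # the ordinary mean
high = expectile(returns, 0.99)  # optimistic: close to the in-sample maximum
print(low, mid, high)
```

Under these assumptions, a low level plays the role of a soft minimum over adversary responses and a high level the role of a soft maximum over the decision-maker's actions, and both stay within the range of returns actually present in the data.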
Related papers
- Return-aligned Decision Transformer (2024)
- Return Augmented Decision Transformer For Off-dynamics Reinforcement Learning (2024)
- When Should We Prefer Decision Transformers For Offline Reinforcement Learning? (2023)
- Q-learning Decision Transformer: Leveraging Dynamic Programming For Conditional Sequence Modelling In Offline RL (2022)
- Waypoint Transformer: Reinforcement Learning Via Supervised Learning With Intermediate Targets (2023)
- DODT: Enhanced Online Decision Transformer Learning Through Dreamer's Actor-critic Trajectory Forecasting (2024)
- Double Check My Desired Return: Transformer With Target Alignment For Offline Reinforcement Learning (2025)
- You Can't Count On Luck: Why Decision Transformers And RvS Fail In Stochastic Environments (2022)