DARA: Dynamics-aware Reward Augmentation In Offline Reinforcement Learning
2022 · Jinxin Liu, Hongyin Zhang, Donglin Wang
Abstract
Offline reinforcement learning algorithms promise to be applicable in settings where a fixed dataset is available and no new experience can be acquired. However, such a formulation is inevitably offline-data-hungry and, in practice, collecting a large offline dataset for one specific task over one specific environment is also costly and laborious. In this paper, we thus 1) formulate offline dynamics adaptation, which uses (source) offline data collected under another dynamics to relax the requirement for extensive (target) offline data, 2) characterize the dynamics shift problem, in which prior offline methods do not scale well, and 3) derive a simple dynamics-aware reward augmentation (DARA) framework for both model-free and model-based offline settings. Specifically, DARA emphasizes learning from those source transition pairs that are adaptive for the target environment and mitigates the offline dynamics shift by characterizing state-action-next-state pairs instead of the typical s
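To make the idea of reward augmentation over state-action-next-state pairs concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract above does not give the formula, so the sketch assumes a classifier-based estimate of the dynamics gap. The function `augment_rewards`, its parameters `p_target_sas` and `p_target_sa` (hypothetical classifier outputs giving the probability that a transition, or a state-action pair, came from the target data), and the weight `eta` are all names introduced here for illustration.

```python
import numpy as np


def augment_rewards(rewards, p_target_sas, p_target_sa, eta=1.0, eps=1e-6):
    """Add a dynamics-gap term to source-data rewards (illustrative sketch).

    Assumption: two domain classifiers (not defined here) give
      p_target_sas = P(target | s, a, s')  and  p_target_sa = P(target | s, a).
    By Bayes' rule, the difference of their logits estimates
      log P_target(s' | s, a) - log P_source(s' | s, a).
    """
    # Clip so the logarithms below stay finite.
    p_sas = np.clip(np.asarray(p_target_sas, dtype=float), eps, 1.0 - eps)
    p_sa = np.clip(np.asarray(p_target_sa, dtype=float), eps, 1.0 - eps)

    # logit(p) = log(p) - log(1 - p)
    logit_sas = np.log(p_sas) - np.log1p(-p_sas)
    logit_sa = np.log(p_sa) - np.log1p(-p_sa)

    # Negative for transitions that look unlikely under the target dynamics.
    log_ratio = logit_sas - logit_sa

    return np.asarray(rewards, dtype=float) + eta * log_ratio


# Hypothetical classifier outputs for three source transitions.
rewards = np.array([1.0, 1.0, 1.0])
p_sas = np.array([0.5, 0.2, 0.8])
p_sa = np.array([0.5, 0.5, 0.5])
augmented = augment_rewards(rewards, p_sas, p_sa, eta=1.0)
```

In this sketch the first transition keeps its reward because both classifiers are uninformative, the second is penalized because its next state looks more like source dynamics, and the third is rewarded. Any offline RL algorithm could then be trained on the source data with `augmented` in place of `rewards`.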
Related papers
- Debiased Offline Representation Learning For Fast Online Adaptation In Non-stationary Dynamics (2024)
- MOBODY: Model Based Off-dynamics Offline Reinforcement Learning (2025)
- Reward-consistent Dynamics Models Are Strongly Generalizable For Offline Reinforcement Learning (2023)
- Behavioral Priors And Dynamics Models: Improving Performance And Domain Transfer In Offline RL (2021)
- AWAC: Accelerating Online Reinforcement Learning With Offline Datasets (2020)
- Understanding When Dynamics-invariant Data Augmentations Benefit Model-free Reinforcement Learning Updates (2023)
- Data Valuation For Offline Reinforcement Learning (2022)
- Revisiting Design Choices In Offline Model-based Reinforcement Learning (2021)