Provable Domain Adaptation For Offline Reinforcement Learning With Limited Samples
2024 · Weiqin Chen, Xinjie Zhang, Sandipan Mishra, et al.
Abstract
Offline reinforcement learning (RL) learns effective policies from a static target dataset. Despite the strong performance of state-of-the-art offline RL algorithms, that performance depends on the size of the target dataset and degrades when only limited target samples are available, which is often the case in real-world applications. To address this issue, domain adaptation that leverages auxiliary samples from related source datasets (such as simulators) can be beneficial. However, establishing the optimal way to trade off the limited target dataset against the large-but-biased source dataset while ensuring provable theoretical guarantees remains an open challenge. To the best of our knowledge, this paper proposes the first framework that theoretically explores the impact of the weights assigned to each dataset on the performance of offline RL. In particular, we establish performance bounds and the existence of the optimal weight, which can be computed in closed form under simplifying ass
Related papers
- Offline Meta-reinforcement Learning With Advantage Weighting (2020)
- Active Advantage-aligned Online Reinforcement Learning With Offline Data (2025)
- Beyond Uniform Sampling: Offline Reinforcement Learning With Imbalanced Datasets (2023)
- AWAC: Accelerating Online Reinforcement Learning With Offline Datasets (2020)
- Bridging Offline Reinforcement Learning And Imitation Learning: A Tale Of Pessimism (2021)
- Leveraging Offline Data In Online Reinforcement Learning (2022)
- Policy-driven World Model Adaptation For Robust Offline Model-based Reinforcement Learning (2025)
- Optimality Inductive Biases And Agnostic Guidelines For Offline Reinforcement Learning (2021)