Domain Generalization For Robust Model-based Offline Reinforcement Learning
2022 · Alan Clark, Shoaib Ahmed Siddiqui, Robert Kirk, et al.
Abstract
Existing offline reinforcement learning (RL) algorithms typically assume that training data is either: 1) generated by a known policy, or 2) of entirely unknown origin. We consider multi-demonstrator offline RL, a middle ground where we know which demonstrators generated each dataset, but make no assumptions about the underlying policies of the demonstrators. This is the most natural setting when collecting data from multiple human operators, yet remains unexplored. Since different demonstrators induce different data distributions, we show that this can be naturally framed as a domain generalization problem, with each demonstrator corresponding to a different domain. Specifically, we propose Domain-Invariant Model-based Offline RL (DIMORL), where we apply Risk Extrapolation (REx) (Krueger et al., 2020) to the process of learning dynamics and rewards models. Our results show that models trained with REx exhibit improved domain generalization performance when compared with the natural ba
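As a rough illustration of how REx could enter model learning, the sketch below computes the variance-penalized form of the REx objective (V-REx, from Krueger et al., 2020): the sum of per-domain risks plus a weight times their variance across domains. Here each domain is one demonstrator, and each risk is assumed to be the prediction error of a dynamics or reward model on that demonstrator's data. The function names, the `beta` default, the use of mean squared error, and the use of population variance are assumptions made for this example, not details taken from the paper.

```python
from statistics import pvariance


def mse(predictions, targets):
    """Mean squared error between two equal-length sequences of floats."""
    if len(predictions) != len(targets) or not targets:
        raise ValueError("predictions and targets must be non-empty and equal length")
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def vrex_objective(per_domain_risks, beta=1.0):
    """Variance-penalized REx objective: sum of risks + beta * variance of risks.

    per_domain_risks: one scalar risk per demonstrator (domain), e.g. the
        model's MSE on that demonstrator's transitions (an assumption here).
    beta: hypothetical penalty weight; beta = 0 recovers the plain sum of
        per-domain risks, larger values push the risks towards equality.
    """
    risks = list(per_domain_risks)
    if len(risks) < 2:
        raise ValueError("need risks from at least two domains")
    # Population variance is used for simplicity; this choice is an assumption.
    return sum(risks) + beta * pvariance(risks)


# Hypothetical per-demonstrator risks for a learned dynamics model.
risks = [
    mse([0.9, 2.1], [1.0, 2.0]),  # demonstrator A
    mse([1.5, 2.5], [1.0, 2.0]),  # demonstrator B
]
objective = vrex_objective(risks, beta=10.0)
```

With `beta=0` the objective reduces to the summed per-demonstrator risk, which gives a simple reference point for checking how strongly the variance term affects training.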
Related papers
- DOMAIN: Mildly Conservative Model-based Offline Reinforcement Learning (2023)
- An Optimistic Perspective On Offline Reinforcement Learning (2019)
- Model-based Offline Reinforcement Learning With Adversarial Data Augmentation (2025)
- Bridging Distributionally Robust Learning And Offline RL: An Approach To Mitigate Distribution Shift And Partial Data Coverage (2023)
- MOReL: Model-based Offline Reinforcement Learning (2020)
- Overcoming Model Bias For Robust Offline Deep Reinforcement Learning (2020)
- The Generalization Gap In Offline Reinforcement Learning (2023)
- Statistical Guarantees For Offline Domain Randomization (2025)