Exploiting Generalization In Offline Reinforcement Learning Via Unseen State Augmentations
2023 · Nirbhay Modhe, Qiaozi Gao, Ashwin Kalyan, et al.
Abstract
Offline reinforcement learning (RL) methods strike a balance between exploration and exploitation by conservative value estimation -- penalizing values of unseen states and actions. Model-free methods penalize values at all unseen actions, while model-based methods are able to further exploit unseen states via model rollouts. However, such methods are handicapped in their ability to find unseen states far away from the available offline data due to two factors -- (a) very short rollout horizons in models due to cascading model errors, and (b) model rollouts originating solely from states observed in offline data. We relax the second assumption and present a novel unseen state augmentation strategy to allow exploitation of unseen states where the learned model and value estimates generalize. Our strategy finds unseen states by value-informed perturbations of seen states followed by filtering out states with epistemic uncertainty estimates too high (high error) or too low (too similar to
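The perturb-then-filter idea described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' implementation: the perturbation is a single finite-difference gradient step on a value function, epistemic uncertainty is approximated by disagreement across a model ensemble, and the names `value_gradient`, `disagreement`, `augment_unseen_states`, `step_size`, `low`, and `high` are hypothetical.

```python
from statistics import pstdev


def value_gradient(value_fn, state, eps=1e-4):
    """Central finite-difference gradient of value_fn at one state (list of floats)."""
    grad = []
    for d in range(len(state)):
        up = list(state)
        down = list(state)
        up[d] += eps
        down[d] -= eps
        grad.append((value_fn(up) - value_fn(down)) / (2 * eps))
    return grad


def disagreement(ensemble, state):
    """Assumed uncertainty proxy: std. dev. across ensemble predictions,
    averaged over output dimensions."""
    preds = [model(state) for model in ensemble]
    per_dim = [pstdev([p[d] for p in preds]) for d in range(len(preds[0]))]
    return sum(per_dim) / len(per_dim)


def augment_unseen_states(states, value_fn, ensemble, step_size, low, high):
    """Perturb each seen state along the value gradient, then keep only
    candidates whose uncertainty lies in [low, high]."""
    kept = []
    for s in states:
        g = value_gradient(value_fn, s)
        candidate = [x + step_size * gx for x, gx in zip(s, g)]
        if low <= disagreement(ensemble, candidate) <= high:
            kept.append(candidate)
    return kept


# Toy setup (hypothetical): a linear value function and three linear "models"
# whose predictions diverge more the farther a state is from the origin.
states = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
toy_value = lambda s: sum(s)
toy_ensemble = [lambda s, k=k: [(1 + 0.1 * k) * x for x in s] for k in range(3)]

kept = augment_unseen_states(
    states, toy_value, toy_ensemble, step_size=0.5, low=0.05, high=0.3
)
```

In this toy setup only the candidate derived from the middle state survives: the first has too little ensemble disagreement (below `low`) and the third too much (above `high`). The thresholds are arbitrary and would need tuning in any real setting.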
Related papers
- Equivariant Data Augmentation For Generalization In Offline Reinforcement Learning (2023)
- Conservative Bayesian Model-based Value Expansion For Offline Policy Optimization (2022)
- A Policy-guided Imitation Approach For Offline Reinforcement Learning (2022)
- Model-based Offline Reinforcement Learning With Adversarial Data Augmentation (2025)
- Expert-supervised Reinforcement Learning For Offline Policy Learning And Evaluation (2020)
- Offline Meta Learning Of Exploration (2020)
- Boosting Offline Reinforcement Learning With Residual Generative Modeling (2021)
- Improving Zero-shot Generalization In Offline Reinforcement Learning Using Generalized Similarity Functions (2021)