A Simple Unified Uncertainty-guided Framework For Offline-to-online Reinforcement Learning
2023 · Siyuan Guo, Yanchao Sun, Jifeng Hu, et al.
Abstract
Offline reinforcement learning (RL) offers a promising way to train an agent purely from data. However, constrained by the limited quality of the offline dataset, its performance is often sub-optimal. It is therefore desirable to fine-tune the agent further via additional online interactions before deployment. Unfortunately, offline-to-online RL is difficult for two main reasons: constrained exploratory behavior and state-action distribution shift. In view of this, we propose a Simple Unified uNcertainty-Guided (SUNG) framework, which addresses both challenges in a unified way using uncertainty. Specifically, SUNG quantifies uncertainty via a VAE-based state-action visitation density estimator. To facilitate efficient exploration, SUNG presents a practical optimistic exploration strategy that selects informative actions with both high value and high uncertainty. Moreover, SUNG develops an adaptive exploitation method by applying cons
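The abstract says the exploration strategy favors actions with both high value and high uncertainty, but does not spell out how the two signals are combined. A minimal sketch, assuming a two-stage filter (shortlist by value, then pick the most uncertain): every name below is hypothetical, and the `uncertainty` callable merely stands in for the VAE-based density estimator, which is not implemented here.

```python
from typing import Callable, List, Tuple

Action = Tuple[float, ...]


def select_optimistic_action(
    candidates: List[Action],
    q_value: Callable[[Action], float],
    uncertainty: Callable[[Action], float],
    top_k: int,
) -> Action:
    """Return the most uncertain action among the top_k highest-value candidates.

    Assumed combination rule for illustration only; not taken from the paper.
    """
    if not candidates:
        raise ValueError("candidates must be non-empty")
    k = max(1, min(top_k, len(candidates)))
    # Stage 1: keep the k candidates with the highest estimated value.
    shortlist = sorted(candidates, key=q_value, reverse=True)[:k]
    # Stage 2: among those, prefer the most uncertain (least-visited) one.
    return max(shortlist, key=uncertainty)


# Toy stand-ins (hypothetical): the value estimate peaks at a = 0.5, and
# uncertainty grows with distance from 0, as if the offline data sat near 0.
def toy_q(action: Action) -> float:
    return -((action[0] - 0.5) ** 2)


def toy_uncertainty(action: Action) -> float:
    return abs(action[0])


candidates = [(-1.0,), (0.0,), (0.4,), (0.7,), (2.0,)]
chosen = select_optimistic_action(candidates, toy_q, toy_uncertainty, top_k=2)
print(chosen)
```

With `top_k=1` the rule reduces to greedy value maximization, and with `top_k=len(candidates)` it reduces to pure uncertainty seeking, so `top_k` acts as the knob trading value against novelty in this sketch.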
Related papers
- Expert-supervised Reinforcement Learning For Offline Policy Learning And Evaluation (2020) · 0.00
- Towards Robust Offline-to-online Reinforcement Learning Via Uncertainty And Smoothness (2023) · 5.24
- Selective Uncertainty Propagation In Offline RL (2023) · 0.00
- One Risk To Rule Them All: A Risk-sensitive Perspective On Model-based Offline Reinforcement Learning (2022) · 3.58
- Exploiting Generalization In Offline Reinforcement Learning Via Unseen State Augmentations (2023) · 0.00
- Revisiting Design Choices In Offline Model-based Reinforcement Learning (2021) · 6.34
- Pessimistic Bootstrapping For Uncertainty-driven Offline Reinforcement Learning (2022) · 0.00
- Uni-o4: Unifying Online And Offline Deep Reinforcement Learning With Multi-step On-policy Optimization (2023) · 0.00