Accelerating Residual Reinforcement Learning With Uncertainty Estimation
2025 · Lakshita Dodeja, Karl Schmeckpeper, Shivam Vats, et al.
Abstract
Residual Reinforcement Learning (RL) is a popular approach for adapting pretrained policies by learning a lightweight residual policy that provides corrective actions. While Residual RL is more sample-efficient than finetuning the entire base policy, existing methods struggle with sparse rewards and are designed for deterministic base policies. We propose two improvements to Residual RL that further enhance its sample efficiency and make it suitable for stochastic base policies. First, we leverage uncertainty estimates of the base policy to focus exploration on regions in which the base policy is not confident. Second, we propose a simple modification to off-policy residual learning that allows it to observe base actions and better handle stochastic base policies. We evaluate our method with both Gaussian-based and Diffusion-based stochastic base policies on tasks from Robosuite and D4RL, and compare against state-of-the-art finetuning methods, demo-augmented RL methods, and other resi
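The two ideas in the abstract, gating exploration on base-policy uncertainty and letting the residual policy observe the base action, can be illustrated with a minimal sketch. It is not the authors' implementation. The uncertainty measure (disagreement across an ensemble of base-policy action samples), the hard threshold, the residual scale, and every function and parameter name below are assumptions made purely for illustration.

```python
import statistics


def base_uncertainty(ensemble_actions):
    """Hypothetical uncertainty estimate: mean per-dimension standard
    deviation across several base-policy action samples (assumption)."""
    dims = list(zip(*ensemble_actions))
    return sum(statistics.pstdev(d) for d in dims) / len(dims)


def combined_action(state, base_action, residual_fn, uncertainty,
                    threshold=0.1, scale=0.5):
    """Apply a residual correction only where the base policy is uncertain.

    residual_fn receives both the state and the sampled base action, so the
    residual can react to a stochastic base policy. The threshold and scale
    values are illustrative, not taken from the paper.
    """
    if uncertainty <= threshold:
        # Base policy is confident: execute its action unchanged.
        return list(base_action)
    residual = residual_fn(state, base_action)
    # Add the scaled correction and clip to an assumed [-1, 1] action range.
    return [max(-1.0, min(1.0, a + scale * r))
            for a, r in zip(base_action, residual)]
```

In this sketch the residual is skipped entirely in confident regions, which is one simple way to concentrate exploration where the base policy disagrees with itself; a soft weighting by uncertainty would be an equally plausible variant.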
Related papers
- vMFER: Von Mises-Fisher Experience Resampling Based On Uncertainty Of Gradient Directions For Policy Improvement (2024)
- Online Robust Reinforcement Learning With Model Uncertainty (2021)
- How To Enable Uncertainty Estimation In Proximal Policy Optimization (2022)
- Uncertainty Quantification And Exploration For Reinforcement Learning (2019)
- Towards Robust Offline-to-Online Reinforcement Learning Via Uncertainty And Smoothness (2023)
- Smart Exploration In Reinforcement Learning Using Bounded Uncertainty Models (2025)
- Deep Model-Based Reinforcement Learning Via Estimated Uncertainty And Conservative Policy Optimization (2019)
- Expert-Supervised Reinforcement Learning For Offline Policy Learning And Evaluation (2020)