Online Robust Reinforcement Learning With Model Uncertainty
2021 · Yue Wang, Shaofeng Zou
Abstract
Robust reinforcement learning (RL) seeks a policy that optimizes the worst-case performance over an uncertainty set of MDPs. In this paper, we focus on model-free robust RL, where the uncertainty set is centered at a misspecified MDP that is assumed to be unknown and that sequentially generates a single sample trajectory. We develop a sample-based approach to estimate the unknown uncertainty set, and we design a robust Q-learning algorithm (tabular case) and a robust TDC algorithm (function approximation setting), both of which can be implemented in an online and incremental fashion. We prove that the robust Q-learning algorithm converges to the optimal robust Q function and that the robust TDC algorithm converges asymptotically to some stationary points. Unlike the results in [Roy et al., 2017], our algorithms do not need any additional conditions on the discount factor to guarantee convergence. We further characterize the finite-time error bounds of the
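To make the idea of an online, incremental robust Q-learning update concrete, here is a minimal sketch. It is not taken from the abstract: the abstract does not say how the uncertainty set is parameterized, so the sketch assumes an R-contamination set (the nominal transition kernel mixed with an arbitrary kernel, with weight `R` on the arbitrary part) and a reward-maximizing agent. The function names, the toy MDP, the uniform behavior policy, and all hyperparameters are hypothetical.

```python
import numpy as np


def robust_q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9, R=0.2):
    """One online robust Q-learning step (sketch).

    Assumption: R-contamination uncertainty set, so the adversary may move
    probability mass R from the sampled next state to the worst state.
    """
    V = Q.max(axis=1)   # greedy value of every state under the current Q
    worst = V.min()     # value of the state an adversary would pick
    # Worst-case bootstrap target: mix the sampled next state with the worst state.
    target = r + gamma * ((1.0 - R) * V[s_next] + R * worst)
    Q[s, a] += alpha * (target - Q[s, a])
    return Q


def run(n_steps=20000, seed=0):
    """Run the update along a single trajectory of a hypothetical toy MDP."""
    rng = np.random.default_rng(seed)
    n_states, n_actions = 3, 2
    # Hypothetical nominal kernel P[s, a] and rewards in [0, 1].
    P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
    rewards = rng.uniform(0.0, 1.0, size=(n_states, n_actions))
    Q = np.zeros((n_states, n_actions))
    s = 0
    for _ in range(n_steps):
        a = rng.integers(n_actions)               # uniform behavior policy
        s_next = rng.choice(n_states, p=P[s, a])  # one sample from the nominal MDP
        Q = robust_q_update(Q, s, a, rewards[s, a], s_next)
        s = s_next
    return Q
```

With `R=0` the target reduces to the standard Q-learning target, so `R` acts as a knob for how pessimistic the learned values are. Each step uses only the current transition, which is what makes the update online and incremental in the sense the abstract describes.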
Related papers
- On Practical Robust Reinforcement Learning: Practical Uncertainty Set And Double-agent Algorithm (2023) · 3.58
- Distributionally Robust Model-based Offline Reinforcement Learning With Near-optimal Sample Complexity (2022) · 0.00
- Sample Complexity Of Robust Reinforcement Learning With A Generative Model (2021) · 0.00
- Reinforcement Learning Under Model Mismatch (2017) · 0.00
- The Curious Price Of Distributional Robustness In Reinforcement Learning With A Generative Model (2023) · 0.00
- Smart Exploration In Reinforcement Learning Using Bounded Uncertainty Models (2025) · 0.00
- Towards Robust Offline-to-online Reinforcement Learning Via Uncertainty And Smoothness (2023) · 5.24
- Combining Pessimism With Optimism For Robust And Efficient Model-based Deep Reinforcement Learning (2021) · 0.00