Sample Complexity Of Variance-reduced Distributionally Robust Q-learning
2023 Β· Shengbo Wang, Nian Si, Jose Blanchet, et al.
Abstract
Dynamic decision-making under distributional shifts is of fundamental interest in theory and applications of reinforcement learning: The distribution of the environment in which the data is collected can differ from that of the environment in which the model is deployed. This paper presents two novel model-free algorithms, namely the distributionally robust Q-learning and its variance-reduced counterpart, that can effectively learn a robust policy despite distributional shifts. These algorithms are designed to efficiently approximate the \(q\)-function of an infinite-horizon \(\gamma\)-discounted robust Markov decision process with Kullback-Leibler ambiguity set to an entry-wise \(\epsilon\)-degree of precision. Further, the variance-reduced distributionally robust Q-learning combines the synchronous Q-learning with variance-reduction techniques to enhance its performance. Consequently, we establish that it attains a minimax sample complexity upper bound of \(\tilde O(|\mathbf\{S\}||\m
Authors
(none)
Tags
Stats
Related papers
- The Curious Price Of Distributional Robustness In Reinforcement Learning With A Generative Model (2023)0.00
- Distributionally Robust Model-based Reinforcement Learning With Large State Spaces (2023)0.00
- Q-learning With Uniformly Bounded Variance: Large Discounting Is Not A Barrier To Fast Learning (2020)0.00
- Sample Complexity Of Offline Distributionally Robust Linear Markov Decision Processes (2024)0.00
- Sample Complexity Of Asynchronous Q-learning: Sharper Analysis And Variance Reduction (2020)11.19
- Sample Complexity Of Distributionally Robust Average-reward Reinforcement Learning (2025)0.00
- Sample Complexity Of Robust Reinforcement Learning With A Generative Model (2021)0.00
- Variance-reduced Cascade Q-learning: Algorithms And Sample Complexity (2024)0.00