Distributionally Robust Offline Reinforcement Learning With Linear Function Approximation
2022 · Xiaoteng Ma, Zhipeng Liang, Jose Blanchet, et al.
Abstract
Among the reasons hindering reinforcement learning (RL) applications to real-world problems, two factors are critical: limited data and the mismatch between the testing environment (real environment in which the policy is deployed) and the training environment (e.g., a simulator). This paper attempts to address these issues simultaneously with distributionally robust offline RL, where we learn a distributionally robust policy using historical data obtained from the source environment by optimizing against a worst-case perturbation thereof. In particular, we move beyond tabular settings and consider linear function approximation. More specifically, we consider two settings, one where the dataset is well-explored and the other where the dataset has sufficient coverage of the optimal policy. We propose two algorithms, one for each of the two settings, that achieve error bounds \(\tilde{O}(d^{1/2}/N^{1/2})\) and \(\tilde{O}(d^{3/2}/N^{1/2})\) respectively, where \(d\) is th
Related papers
- Minimax Optimal And Computationally Efficient Algorithms For Distributionally Robust Offline Reinforcement Learning (2024)
- What Are The Statistical Limits Of Offline RL With Linear Function Approximation? (2020)
- Distributionally Robust Off-dynamics Reinforcement Learning: Provable Efficiency With Linear Function Approximation (2024)
- Distributionally Robust Online Markov Game With Linear Function Approximation (2025)
- Bridging Distributionally Robust Learning And Offline RL: An Approach To Mitigate Distribution Shift And Partial Data Coverage (2023)
- Distributionally Robust Model-based Offline Reinforcement Learning With Near-optimal Sample Complexity (2022)
- Optimal Conservative Offline RL With General Function Approximation Via Augmented Lagrangian (2022)
- Nearly Minimax Optimal Offline Reinforcement Learning With Linear Function Approximation: Single-agent MDP And Markov Game (2022)