Neural Network Approximation For Pessimistic Offline Reinforcement Learning
2023 · Di Wu, Yuling Jiao, Li Shen, et al.
Abstract
Deep reinforcement learning (RL) has shown remarkable success in specific offline decision-making scenarios, yet its theoretical guarantees are still under development. Existing works on offline RL theory primarily emphasize a few trivial settings, such as linear MDP or general function approximation with strong assumptions and independent data, which lack guidance for practical use. The coupling of deep learning and Bellman residuals makes this problem challenging, in addition to the difficulty of data dependence. In this paper, we establish a non-asymptotic estimation error of pessimistic offline RL using general neural network approximation with \(\mathcal{C}\)-mixing data regarding the structure of networks, the dimension of datasets, and the concentrability of data coverage, under mild assumptions. Our result shows that the estimation error consists of two parts: the first converges to zero at a desired rate on the sample size with partially controllable concentrability, and the
Related papers
- Is Pessimism Provably Efficient For Offline RL? (2020)
- Optimal Conservative Offline RL With General Function Approximation Via Augmented Lagrangian (2022)
- Viper: Provably Efficient Algorithm For Offline RL With Neural Function Approximation (2023)
- Bellman-consistent Pessimism For Offline Reinforcement Learning (2021)
- Sample Complexity Of Offline Reinforcement Learning With Deep Relu Networks (2021)
- Nearly Minimax Optimal Offline Reinforcement Learning With Linear Function Approximation: Single-agent MDP And Markov Game (2022)
- Pessimistic Nonlinear Least-squares Value Iteration For Offline Reinforcement Learning (2023)
- Bridging Offline Reinforcement Learning And Imitation Learning: A Tale Of Pessimism (2021)