What Are The Statistical Limits Of Offline RL With Linear Function Approximation?
2020 · Ruosong Wang, Dean P. Foster, Sham M. Kakade
Abstract
Offline reinforcement learning seeks to utilize offline (observational) data to guide the learning of (causal) sequential decision making strategies. The hope is that offline reinforcement learning, coupled with function approximation methods (to deal with the curse of dimensionality), can help alleviate the excessive sample complexity burden in modern sequential decision making problems. However, the extent to which this broader approach can be effective is not well understood: the literature largely consists of sufficient conditions. This work focuses on the basic question of which representational and distributional conditions are necessary to permit provable sample-efficient offline reinforcement learning. Perhaps surprisingly, our main result shows that even if: i) we have realizability in that the true value function of *every* policy is linear in a given set of features and ii) our off-policy data has good coverage over all features (under a strong spect
Related papers
- Distributionally Robust Offline Reinforcement Learning With Linear Function Approximation (2022)
- Offline Reinforcement Learning: Fundamental Barriers For Value Function Approximation (2021)
- Optimal Conservative Offline RL With General Function Approximation Via Augmented Lagrangian (2022)
- Infinite-horizon Offline Reinforcement Learning With Linear Function Approximation: Curse Of Dimensionality And Algorithm (2021)
- Minimax Optimal And Computationally Efficient Algorithms For Distributionally Robust Offline Reinforcement Learning (2024)
- Sample Complexity Of Offline Reinforcement Learning With Deep ReLU Networks (2021)
- A Complete Characterization Of Linear Estimators For Offline Policy Evaluation (2022)
- Nearly Minimax Optimal Offline Reinforcement Learning With Linear Function Approximation: Single-agent MDP And Markov Game (2022)