The Optimal Approximation Factors In Misspecified Off-policy Value Function Estimation
2023 · Philip Amortila, Nan Jiang, Csaba Szepesvári
Abstract
Theoretical guarantees in reinforcement learning (RL) are known to suffer multiplicative blow-up factors with respect to the misspecification error of function approximation. Yet, the nature of such *approximation factors* -- especially their optimal form in a given learning problem -- is poorly understood. In this paper we study this question in linear off-policy value function estimation, where many open questions remain. We study the approximation factor in a broad spectrum of settings, such as with the weighted \(L_2\)-norm (where the weighting is the offline state distribution), the \(L_\infty\) norm, the presence vs. absence of state aliasing, and full vs. partial coverage of the state space. We establish the optimal asymptotic approximation factors (up to constants) for all of these settings. In particular, our bounds identify two instance-dependent factors for the \(L_2(\mu)\) norm and only one for the \(L_\infty\) norm, which are shown to dictate the hardness of off-policy eva
Related papers
- On The Model-misspecification In Reinforcement Learning (2023)
- Minimax-optimal Off-policy Evaluation With Linear Function Approximation (2020)
- Minimax Optimal And Computationally Efficient Algorithms For Distributionally Robust Offline Reinforcement Learning (2024)
- Offline Reinforcement Learning: Fundamental Barriers For Value Function Approximation (2021)
- Distributionally Robust Offline Reinforcement Learning With Linear Function Approximation (2022)
- The Role Of Lookahead And Approximate Policy Evaluation In Reinforcement Learning With Linear Value Function Approximation (2021)
- Provably Efficient Reinforcement Learning With Linear Function Approximation (2019)
- Variance-aware Off-policy Evaluation With Linear Function Approximation (2021)