An \(L^2\) Analysis Of Reinforcement Learning In High Dimensions With Kernel And Neural Network Approximation
2021 · Jihao Long, Jiequn Han, Weinan E
Abstract
Reinforcement learning (RL) algorithms based on high-dimensional function approximation have achieved tremendous empirical success in large-scale problems with an enormous number of states. However, most analysis of such algorithms gives rise to error bounds that involve either the number of states or the number of features. This paper considers the situation where the function approximation is made either using the kernel method or the two-layer neural network model, in the context of a fitted Q-iteration algorithm with explicit regularization. We establish an \(\tilde{O}(H^3|\mathcal{A}|^{\frac14}n^{-\frac14})\) bound for the optimal policy with \(Hn\) samples, where \(H\) is the length of each episode and \(|\mathcal{A}|\) is the size of action space. Our analysis hinges on analyzing the \(L^2\) error of the approximated Q-function using \(n\) data points. Even though this result still requires a finite-sized action space, the error bound is independent of the dimensiona
Related papers
- Reinforcement Learning In Feature Space: Matrix Bandit, Kernels, And Regret Bound (2019)
- A Finite-time Analysis Of Q-learning With Neural Network Function Approximation (2019)
- Infinite-horizon Offline Reinforcement Learning With Linear Function Approximation: Curse Of Dimensionality And Algorithm (2021)
- Sample Complexity Of Offline Reinforcement Learning With Deep ReLU Networks (2021)
- Prior-dependent Analysis Of Posterior Sampling Reinforcement Learning With Function Approximation (2024)
- Value Function Approximations Via Kernel Embeddings For No-regret Reinforcement Learning (2020)
- Provably Efficient Reinforcement Learning With Linear Function Approximation (2019)
- Leveraging Unlabeled Data Sharing Through Kernel Function Approximation In Offline Reinforcement Learning (2024)