Risk Bounds And Rademacher Complexity In Batch Reinforcement Learning
2021 Β· Yaqi Duan, Chi Jin, Zhiyuan Li
Abstract
This paper considers batch Reinforcement Learning (RL) with general value function approximation. Our study investigates the minimal assumptions to reliably estimate/minimize Bellman error, and characterizes the generalization performance by (local) Rademacher complexities of general function classes, which makes initial steps in bridging the gap between statistical learning theory and batch RL. Concretely, we view the Bellman error as a surrogate loss for the optimality gap, and prove the followings: (1) In double sampling regime, the excess risk of Empirical Risk Minimizer (ERM) is bounded by the Rademacher complexity of the function class. (2) In the single sampling regime, sample-efficient risk minimization is not possible without further assumptions, regardless of algorithms. However, with completeness assumptions, the excess risk of FQI and a minimax style algorithm can be again bounded by the Rademacher complexity of the corresponding function classes. (3) Fast statistical rates
Authors
(none)
Tags
Stats
Related papers
- Exponential Bellman Equation And Improved Regret Bounds For Risk-sensitive Reinforcement Learning (2021)0.00
- Exponential Lower Bounds For Batch Reinforcement Learning: Batch RL Can Be Exponentially Harder Than Online RL (2020)0.00
- Q* Approximation Schemes For Batch Reinforcement Learning: A Theoretical Comparison (2020)0.00
- Uniform-pac Bounds For Reinforcement Learning With Linear Function Approximation (2021)0.00
- Prior-dependent Analysis Of Posterior Sampling Reinforcement Learning With Function Approximation (2024)0.00
- Online Bayesian Risk-averse Reinforcement Learning (2025)0.00
- A Nearly Optimal And Low-switching Algorithm For Reinforcement Learning With General Function Approximation (2023)0.00
- Strategically Robust Multi-agent Reinforcement Learning With Linear Function Approximation (2026)0.00