Breaking The Sample Size Barrier In Model-based Reinforcement Learning With A Generative Model
2020 Β· Gen Li, Yuting Wei, Yuejie Chi, et al.
Abstract
This paper is concerned with the sample efficiency of reinforcement learning, assuming access to a generative model (or simulator). We first consider \(\gamma\)-discounted infinite-horizon Markov decision processes (MDPs) with state space \(\mathcal\{S\}\) and action space \(\mathcal\{A\}\). Despite a number of prior works tackling this problem, a complete picture of the trade-offs between sample complexity and statistical accuracy is yet to be determined. In particular, all prior results suffer from a severe sample size barrier, in the sense that their claimed statistical guarantees hold only when the sample size exceeds at least \(\frac\{|\mathcal\{S\}||\mathcal\{A\}|\}\{(1-\gamma)^2\}\). The current paper overcomes this barrier by certifying the minimax optimality of two algorithms -- a perturbed model-based algorithm and a conservative model-based algorithm -- as soon as the sample size exceeds the order of \(\frac\{|\mathcal\{S\}||\mathcal\{A\}|\}\{1-\gamma\}\) (modulo some log fa
Authors
(none)
Tags
Stats
Related papers
- Sample-efficient Reinforcement Learning For Linearly-parameterized Mdps With A Generative Model (2021)0.00
- Model-based Reinforcement Learning With A Generative Model Is Minimax Optimal (2019)0.00
- Model-free Reinforcement Learning: From Clipped Pseudo-regret To Sample Complexity (2020)0.00
- The Curious Price Of Distributional Robustness In Reinforcement Learning With A Generative Model (2023)0.00
- How Does An Approximate Model Help In Reinforcement Learning? (2019)0.00
- Entropic Risk Optimization In Discounted Mdps: Sample Complexity Bounds With A Generative Model (2025)0.00
- Quantum Algorithms For Reinforcement Learning With A Generative Model (2021)0.00
- Sample Complexity Of Robust Reinforcement Learning With A Generative Model (2021)0.00