Improving Sample Complexity Bounds For (natural) Actor-critic Algorithms
2020 Β· Tengyu Xu, Zhe Wang, Yingbin Liang
Abstract
The actor-critic (AC) algorithm is a popular method to find an optimal policy in reinforcement learning. In the infinite horizon scenario, the finite-sample convergence rate for the AC and natural actor-critic (NAC) algorithms has been established recently, but under independent and identically distributed (i.i.d.) sampling and single-sample update at each iteration. In contrast, this paper characterizes the convergence rate and sample complexity of AC and NAC under Markovian sampling, with mini-batch data for each iteration, and with actor having general policy class approximation. We show that the overall sample complexity for a mini-batch AC to attain an \(\epsilon\)-accurate stationary point improves the best known sample complexity of AC by an order of \(\mathcal\{O\}(\epsilon^\{-1\}log(1/\epsilon))\), and the overall sample complexity for a mini-batch NAC to attain an \(\epsilon\)-accurate globally optimal point improves the existing sample complexity of NAC by an order of \(\mat
Authors
(none)
Tags
Stats
Related papers
- Non-asymptotic Analysis For Single-loop (natural) Actor-critic With Compatible Function Approximation (2024)0.00
- Non-asymptotic Convergence Analysis Of Two Time-scale (natural) Actor-critic Algorithms (2020)0.00
- A Finite Time Analysis Of Two Time-scale Actor Critic Methods (2020)0.00
- Finite Sample Analysis Of Two-time-scale Natural Actor-critic Algorithm (2021)7.50
- Finite-time Analysis Of Fully Decentralized Single-timescale Actor-critic (2022)0.00
- Finite-sample Analysis Of Off-policy Natural Actor-critic With Linear Function Approximation (2021)0.00
- Finite-time Analysis Of Single-timescale Actor-critic (2022)0.00
- On The Sample Complexity Of Actor-critic Method For Reinforcement Learning With Function Approximation (2019)11.49