Abstract

Actor-critic methods have achieved significant success in many challenging applications. However, their finite-time convergence is still poorly understood in the most practical single-timescale form. Existing analyses of single-timescale actor-critic have been limited to i.i.d. sampling or the tabular setting for simplicity. We investigate the more practical online single-timescale actor-critic algorithm on a continuous state space, where the critic uses linear function approximation and updates with a single Markovian sample per actor step. Previous analyses have been unable to establish convergence in such a challenging scenario. We demonstrate that the online single-timescale actor-critic method provably finds an \(\epsilon\)-approximate stationary point with \(\widetilde{\mathcal{O}}(\epsilon^{-2})\) sample complexity under standard assumptions, which can be further improved to \(\mathcal{O}(\epsilon^{-2})\) under i.i.d. sampling. Our novel framework systematica

Authors

(none)

Tags

  • Uncategorized

Stats

  • citations: 0
  • S2 citations: n/a
  • github stars: 0
  • HF likes: 0
  • heat score: 0.00
  • arxiv key: chen2022finite

Related papers