Finite Sample Analysis Of Two-timescale Stochastic Approximation With Applications To Reinforcement Learning
2017 Β· Gal Dalal, Balazs Szorenyi, Gugan Thoppe, et al.
Abstract
Two-timescale Stochastic Approximation (SA) algorithms are widely used in Reinforcement Learning (RL). Their iterates have two parts that are updated using distinct stepsizes. In this work, we develop a novel recipe for their finite sample analysis. Using this, we provide a concentration bound, which is the first such result for a two-timescale SA. The type of bound we obtain is known as `lock-in probability'. We also introduce a new projection scheme, in which the time between successive projections increases exponentially. This scheme allows one to elegantly transform a lock-in probability into a convergence rate result for projected two-timescale SA. From this latter result, we then extract key insights on stepsize selection. As an application, we finally obtain convergence rates for the projected two-timescale RL algorithms GTD(0), GTD2, and TDC.
Authors
(none)
Tags
Stats
Related papers
- Finite Time Analysis Of Linear Two-timescale Stochastic Approximation With Markovian Noise (2020)0.00
- Finite-sample Analysis Of Nonlinear Stochastic Approximation With Applications In Reinforcement Learning (2019)10.35
- Sample Complexity Bounds For Two Timescale Value-based Reinforcement Learning Algorithms (2020)0.00
- Finite-time Performance Bounds And Adaptive Learning Rate Selection For Two Time-scale Reinforcement Learning (2019)0.00
- Central Limit Theorem For Two-timescale Stochastic Approximation With Markovian Noise: Theory And Applications (2024)0.00
- Stochastic Approximation With Delayed Updates: Finite-time Rates Under Markovian Sampling (2024)0.00
- Finite-sample Analysis Of Stochastic Approximation Using Smooth Convex Envelopes (2020)0.00
- A Tale Of Two-timescale Reinforcement Learning With The Tightest Finite-time Bound (2019)0.00