Fast Two-time-scale Stochastic Gradient Method With Applications In Reinforcement Learning
2024 · Sihan Zeng, Thinh T. Doan
Abstract
Two-time-scale optimization is a framework introduced in Zeng et al. (2024) that abstracts a range of policy evaluation and policy optimization problems in reinforcement learning (RL). Akin to bi-level optimization under a particular type of stochastic oracle, the two-time-scale optimization framework has an upper level objective whose gradient evaluation depends on the solution of a lower level problem, which is to find the root of a strongly monotone operator. In this work, we propose a new method for solving two-time-scale optimization that achieves significantly faster convergence than the prior art. The key idea of our approach is to leverage an averaging step to improve the estimates of the operators in both lower and upper levels before using them to update the decision variables. These additional averaging steps eliminate the direct coupling between the main variables, enabling the accelerated performance of our algorithm. We characterize the finite-time convergence rates of t
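The averaging idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm or step-size schedule: the scalar operators `noisy_F` and `noisy_G`, the constant step sizes `alpha`, `beta`, `lam`, and the function name `averaged_two_time_scale` are all assumptions chosen for a toy problem. The lower-level operator G(x, y) = y - x is strongly monotone in y with root y*(x) = x, and the upper-level operator F(x, y) = x + y - c then has its root at x = c / 2.

```python
import random


def noisy_F(x, y, c, rng, sigma):
    # Toy upper-level operator sample (assumption): F(x, y) = x + y - c plus Gaussian noise.
    return x + y - c + rng.gauss(0.0, sigma)


def noisy_G(x, y, rng, sigma):
    # Toy lower-level operator sample (assumption): G(x, y) = y - x plus Gaussian noise.
    return y - x + rng.gauss(0.0, sigma)


def averaged_two_time_scale(c=2.0, steps=5000, alpha=0.01, beta=0.05,
                            lam=0.1, sigma=0.1, seed=0):
    """Two-time-scale updates driven by running averages of the operator samples.

    alpha (slow, upper level), beta (fast, lower level) and lam (averaging
    weight) are illustrative constants, not values taken from the paper.
    """
    rng = random.Random(seed)
    x, y = 0.0, 0.0          # decision variables
    f_hat, g_hat = 0.0, 0.0  # averaged operator estimates
    for _ in range(steps):
        # Averaging step: refine the operator estimates from fresh noisy samples.
        f_hat = (1.0 - lam) * f_hat + lam * noisy_F(x, y, c, rng, sigma)
        g_hat = (1.0 - lam) * g_hat + lam * noisy_G(x, y, rng, sigma)
        # Variable updates use the averaged estimates, not the raw samples.
        x -= alpha * f_hat
        y -= beta * g_hat
    return x, y


if __name__ == "__main__":
    x, y = averaged_two_time_scale()
    print(f"x = {x:.3f}, y = {y:.3f}")  # both should settle near c / 2
```

In this sketch the decision variables are moved only by `f_hat` and `g_hat`, so the noise in each raw sample is smoothed before it reaches `x` or `y`. Decaying step sizes, which finite-time analyses typically rely on, are omitted here for brevity.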
Related papers
- Sample Complexity Bounds For Two Timescale Value-based Reinforcement Learning Algorithms (2020)
- A Tale Of Two-timescale Reinforcement Learning With The Tightest Finite-time Bound (2019)
- Single-timescale Stochastic Nonconvex-concave Optimization For Smooth Nonlinear TD Learning (2020)
- Quantile-based Deep Reinforcement Learning Using Two-timescale Policy Gradient Algorithms (2023)
- Finite Sample Analysis Of Two-timescale Stochastic Approximation With Applications To Reinforcement Learning (2017)
- Finite-time Performance Bounds And Adaptive Learning Rate Selection For Two Time-scale Reinforcement Learning (2019)
- Policy Optimization For Continuous Reinforcement Learning (2023)
- Central Limit Theorem For Two-timescale Stochastic Approximation With Markovian Noise: Theory And Applications (2024)