New Versions Of Gradient Temporal Difference Learning
2021 · Donghwan Lee, Han-Dong Lim, Jihoon Park, et al.
Abstract
Sutton, Szepesvári and Maei introduced the first gradient temporal-difference (GTD) learning algorithms compatible with both linear function approximation and off-policy training. This paper (a) proposes several GTD variants, with an extensive comparative analysis, and (b) establishes new theoretical analysis frameworks for GTD algorithms. The variants are based on convex-concave saddle-point interpretations of GTD, which unify the GTD algorithms in a single framework and yield a simple stability analysis based on recent results on primal-dual gradient dynamics. Finally, a numerical comparison evaluates these approaches.
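To make the saddle-point view concrete, the sketch below implements the classic GTD2 update from the original Sutton, Szepesvári and Maei line of work, not any of the new variants proposed in this paper. It assumes the commonly cited formulation in which GTD2 performs stochastic gradient descent in the primal variable `theta` and ascent in the dual variable `w` on L(theta, w) = w^T (b - A theta) - (1/2) w^T C w, with A = E[phi (phi - gamma phi')^T], b = E[r phi] and C = E[phi phi^T]. Importance-sampling weights are omitted for brevity, and the two-state cycle, the step sizes, and the function names are hypothetical choices made only for illustration.

```python
import numpy as np


def gtd2_step(theta, w, phi, reward, phi_next, gamma, alpha, beta):
    """One GTD2 update from a single transition (phi, reward, phi_next).

    theta is the primal variable (value-function weights) and w is the
    dual variable. Both updates are computed from the old (theta, w).
    """
    # TD error under the current linear value estimate.
    delta = reward + gamma * (phi_next @ theta) - phi @ theta
    phi_w = phi @ w
    # Primal step: sample of A^T w, i.e. descent on L with respect to theta.
    theta_new = theta + alpha * (phi - gamma * phi_next) * phi_w
    # Dual step: sample of b - A theta - C w, i.e. ascent on L with respect to w.
    w_new = w + beta * (delta - phi_w) * phi
    return theta_new, w_new


def run_two_state_cycle(num_steps=40_000, gamma=0.5, alpha=0.01, beta=0.01):
    """Run GTD2 on a hypothetical deterministic cycle 0 -> 1 -> 0.

    Features are one-hot (tabular); leaving state 0 yields reward 1 and
    leaving state 1 yields reward 0.
    """
    features = np.eye(2)
    rewards = np.array([1.0, 0.0])
    theta = np.zeros(2)
    w = np.zeros(2)
    state = 0
    for _ in range(num_steps):
        next_state = 1 - state
        theta, w = gtd2_step(
            theta, w,
            features[state], rewards[state], features[next_state],
            gamma, alpha, beta,
        )
        state = next_state
    return theta, w


if __name__ == "__main__":
    theta, w = run_two_state_cycle()
    print("theta:", theta)
    print("w:", w)
```

For this toy chain the Bellman equations V(0) = 1 + 0.5 V(1) and V(1) = 0.5 V(0) give V(0) = 4/3 and V(1) = 2/3, so `theta` should approach those values while the dual variable `w` shrinks toward zero, provided the step sizes are small enough.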
Related papers
- Regularized Gradient Temporal-difference Learning (2026)
- Proximal Gradient Temporal Difference Learning: Stable Reinforcement Learning With Polynomial Sample Complexity (2020)
- Finite-sample Analysis Of Proximal Gradient TD Algorithms (2020)
- Revisiting A Design Choice In Gradient Temporal Difference Learning (2023)
- Nonlinear Distributional Gradient Temporal-difference Learning (2018)
- Backstepping Temporal Difference Learning (2023)
- Gradient Iterated Temporal-difference Learning (2026)
- O²TD: (near)-optimal Off-policy TD Learning (2017)