Characterizing The Exact Behaviors Of Temporal Difference Learning Algorithms Using Markov Jump Linear System Theory
2019 · Bin Hu, Usman Ahmed Syed
Abstract
In this paper, we provide a unified analysis of temporal difference learning algorithms with linear function approximators by exploiting their connections to Markov jump linear systems (MJLS). We tailor the MJLS theory developed in the control community to characterize the exact behaviors of the first and second order moments of a large family of temporal difference learning algorithms. For both the IID and Markov noise cases, we show that the evolution of some augmented versions of the mean and covariance matrix of the TD estimation error exactly follows the trajectory of a deterministic linear time-invariant (LTI) dynamical system. Applying the well-known LTI system theory, we obtain closed-form expressions for the mean and covariance matrix of the TD estimation error at any time step. We provide a tight matrix spectral radius condition to guarantee the convergence of the covariance matrix of the TD estimation error, and perform a perturbation analysis to characterize the dependence
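As a rough illustration of the IID case described above, the following Python sketch checks numerically that the mean of the estimation error follows a deterministic LTI recursion. It is a toy setup under stated assumptions, not the paper's own code. The update rule theta_{k+1} = theta_k + alpha * (A_i theta_k + b_i), the randomly generated mode matrices A_i and vectors b_i, the uniform probabilities p_i, and all function names are hypothetical choices made for illustration. The sketch also computes the spectral radius of sum_i p_i (I + alpha A_i) ⊗ (I + alpha A_i), the matrix that drives the homogeneous part of the second-moment recursion in this toy model. The Markov noise case needs the augmented states mentioned in the abstract and is not covered here.

```python
import itertools
import numpy as np


def make_modes(seed=0, n_modes=3, dim=2):
    """Build a hypothetical IID jump model (A_i, b_i, p_i) with a known fixed point."""
    rng = np.random.default_rng(seed)
    A = [-np.eye(dim) + 0.3 * rng.standard_normal((dim, dim)) for _ in range(n_modes)]
    p = np.full(n_modes, 1.0 / n_modes)
    theta_star = rng.standard_normal(dim)
    # Choose b_i so that sum_i p_i (A_i theta_star + b_i) = 0,
    # i.e. theta_star is the fixed point of the averaged update.
    c = [rng.standard_normal(dim) for _ in range(n_modes)]
    c_bar = sum(pi * ci for pi, ci in zip(p, c))
    b = [ci - c_bar - Ai @ theta_star for Ai, ci in zip(A, c)]
    return A, b, p, theta_star


def mean_error_lti(A, p, alpha, e0, k):
    """Closed-form mean error: E[e_k] = (I + alpha * A_bar)^k e_0."""
    H_bar = np.eye(len(e0)) + alpha * sum(pi * Ai for pi, Ai in zip(p, A))
    return np.linalg.matrix_power(H_bar, k) @ e0


def mean_error_enumerated(A, b, p, theta_star, alpha, e0, k):
    """Exact mean error obtained by enumerating every length-k mode sequence."""
    total = np.zeros(len(e0))
    for path in itertools.product(range(len(A)), repeat=k):
        e, prob = e0.copy(), 1.0
        for i in path:
            # Error form of theta <- theta + alpha * (A_i theta + b_i)
            e = e + alpha * (A[i] @ (e + theta_star) + b[i])
            prob *= p[i]
        total += prob * e
    return total


def second_moment_radius(A, p, alpha):
    """Spectral radius of sum_i p_i (I + alpha A_i) kron (I + alpha A_i)."""
    eye = np.eye(A[0].shape[0])
    M = sum(pi * np.kron(eye + alpha * Ai, eye + alpha * Ai) for pi, Ai in zip(p, A))
    return float(np.max(np.abs(np.linalg.eigvals(M))))


if __name__ == "__main__":
    A, b, p, theta_star = make_modes()
    e0 = np.array([1.0, -2.0])
    print("LTI mean error:       ", mean_error_lti(A, p, 0.1, e0, 5))
    print("Enumerated mean error:", mean_error_enumerated(A, b, p, theta_star, 0.1, e0, 5))
    print("Second-moment spectral radius:", second_moment_radius(A, p, 0.1))
```

The enumeration visits all n_modes^k paths, so it is only practical for very short horizons. Its purpose here is to confirm that the closed-form matrix power gives the same mean without any sampling error.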
Related papers
- Exact Formulas For Finite-time Estimation Errors Of Decentralized Temporal Difference Learning With Linear Function Approximation (2022) 0.00
- Control Theoretic Analysis Of Temporal Difference Learning (2021) 0.00
- A Finite Time Analysis Of Temporal Difference Learning With Linear Function Approximation (2018) 0.00
- Uncertainty Quantification For Markov Chain Induced Martingales With Application To Temporal Difference Learning (2025) 0.00
- Differential Temporal Difference Learning (2018) 5.24
- Finite-time Performance Of Distributed Temporal Difference Learning With Linear Function Approximation (2019) 9.59
- Temporal-difference Learning With Nonlinear Function Approximation: Lazy Training And Mean Field Regimes (2019) 0.00
- Central Limit Theorem For Two-timescale Stochastic Approximation With Markovian Noise: Theory And Applications (2024) 0.00