Finite Sample Analysis Of Linear Temporal Difference Learning With Arbitrary Features
2025 · Zixuan Xie, Xinyu Liu, Rohan Chandra, et al.
Abstract
Linear TD(\(\lambda\)) is one of the most fundamental reinforcement learning algorithms for policy evaluation. Previous convergence rates have typically been established under the assumption of linearly independent features, which does not hold in many practical scenarios. This paper instead establishes the first \(L^2\) convergence rates for linear TD(\(\lambda\)) operating under arbitrary features, without any algorithmic modification or additional assumptions. Our results apply to both the discounted and average-reward settings. To address the potential non-uniqueness of solutions resulting from arbitrary features, we develop a novel stochastic approximation result featuring convergence rates to the solution set instead of a single point.
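To make the setting concrete, here is a minimal sketch of discounted linear TD(\(\lambda\)) with accumulating eligibility traces, run on linearly dependent features. It is an illustration under stated assumptions, not the paper's experimental setup: the three-state chain, the rewards, the feature matrix (whose third column is the sum of the first two), the step-size schedule `1 / (t + 100)`, and the names `linear_td_lambda`, `FEATURES`, `TRANSITIONS`, and `REWARDS` are all hypothetical.

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def linear_td_lambda(features, transitions, rewards, gamma=0.9, lam=0.5,
                     steps=5000, seed=0):
    """Discounted linear TD(lambda) with accumulating traces.

    features[s] is the feature vector of state s; the vectors may be
    linearly dependent. transitions[s] is the next-state distribution and
    rewards[s] the reward received on leaving s (all illustrative).
    """
    rng = random.Random(seed)
    n_states = len(features)
    d = len(features[0])
    w = [0.0] * d  # weight vector, started at zero
    z = [0.0] * d  # eligibility trace
    s = 0
    for t in range(steps):
        s_next = rng.choices(range(n_states), weights=transitions[s])[0]
        x, x_next = features[s], features[s_next]
        # TD error under the current weights.
        delta = rewards[s] + gamma * dot(w, x_next) - dot(w, x)
        # Accumulating trace: decay by gamma * lambda, then add features.
        z = [gamma * lam * zi + xi for zi, xi in zip(z, x)]
        # Decaying step size; this particular schedule is an assumption.
        alpha = 1.0 / (t + 100)
        w = [wi + alpha * delta * zi for wi, zi in zip(w, z)]
        s = s_next
    return w


# Hypothetical 3-state chain. The third feature equals the sum of the
# first two, so the feature matrix has rank 2, not 3.
FEATURES = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0], [1.0, 1.0, 2.0]]
TRANSITIONS = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.5, 0.0, 0.5]]
REWARDS = [1.0, 0.0, -1.0]

w = linear_td_lambda(FEATURES, TRANSITIONS, REWARDS)
values = [dot(w, x) for x in FEATURES]
print("weights:", w)
print("estimated values:", values)
```

Because the features are dependent, many weight vectors produce the same value estimates, which is the non-uniqueness the abstract refers to. In this sketch the weights start at zero and every update adds a multiple of the trace, itself a combination of feature vectors, so the iterate stays in the span of the feature vectors and satisfies `w[2] == w[0] + w[1]` up to rounding.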
Related papers
- Adaptive Temporal Difference Learning With Linear Function Approximation (2020)
- A Finite Time Analysis Of Temporal Difference Learning With Linear Function Approximation (2018)
- A Finite Sample Analysis Of Distributional TD Learning With Linear Function Approximation (2025)
- Accelerated Distributional Temporal Difference Learning With Linear Function Approximation (2025)
- Finite-time Performance Of Distributed Temporal Difference Learning With Linear Function Approximation (2019)
- Adaptive Lambda Least-squares Temporal Difference Learning (2016)
- Analysis Of Off-policy \(n\)-step TD-learning With Linear Function Approximation (2025)
- Geometric Insights Into The Convergence Of Nonlinear TD Learning (2019)