A Finite Sample Analysis Of Distributional TD Learning With Linear Function Approximation
2025 · Yang Peng, Kaicheng Jin, Liangyu Zhang, et al.
Abstract
In this paper, we study the finite-sample statistical rates of distributional temporal difference (TD) learning with linear function approximation. The aim of distributional TD learning is to estimate the return distribution of a discounted Markov decision process for a given policy $\pi$. Previous works on statistical analysis of distributional TD learning mainly focus on the tabular case. In contrast, we first consider the linear function approximation setting and derive sharp finite-sample rates. Our theoretical results demonstrate that the sample complexity of linear distributional TD learning matches that of classic linear TD learning. This implies that, with linear function approximation, learning the full distribution of the return from streaming data is no more difficult than learning its expectation (value function). To derive tight sample complexity bounds, we conduct a fine-grained analysis of the linear-categorical Bellman equation and employ the exponential stability arg
Related papers
- Accelerated Distributional Temporal Difference Learning With Linear Function Approximation (2025)
- A Finite Time Analysis Of Temporal Difference Learning With Linear Function Approximation (2018)
- Finite-time Performance Of Distributed Temporal Difference Learning With Linear Function Approximation (2019)
- Finite Sample Analysis Of Linear Temporal Difference Learning With Arbitrary Features (2025)
- High-probability Sample Complexities For Policy Evaluation With Linear Function Approximation (2023)
- Adaptive Temporal Difference Learning With Linear Function Approximation (2020)
- Finite-sample Analysis Of Decentralized Temporal-difference Learning With Linear Function Approximation (2019)
- Stability And Sensitivity Analysis Of Relative Temporal-difference Learning: Extended Version (2026)