Predicting Periodicity With Temporal Difference Learning
2018 · Kristopher de Asis, Brendan Bennett, Richard S. Sutton
Abstract
Temporal difference (TD) learning is an important approach in reinforcement learning, as it combines ideas from dynamic programming and Monte Carlo methods in a way that allows for online and incremental model-free learning. A key idea of TD learning is that it is learning predictive knowledge about the environment in the form of value functions, from which it can derive its behavior to address long-term sequential decision making problems. The agent's horizon of interest, that is, how immediate or long-term a TD learning agent predicts into the future, is adjusted through a discount rate parameter. In this paper, we introduce an alternative view on the discount rate, with insight from digital signal processing, to include complex-valued discounting. Our results show that setting the discount rate to appropriately chosen complex numbers allows for online and incremental estimation of the Discrete Fourier Transform (DFT) of a signal of interest with TD learning. We thereby extend the ty
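The link between complex discounting and the DFT described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that do not come from the text above: the signal is exactly periodic with a known period `n`, each phase of the period is treated as a tabular state, the "reward" on leaving state `s` is the sample `signal[s]`, and the discount is `gamma = magnitude * exp(-2*pi*i*k/n)` with `magnitude < 1` chosen here only so that the return converges. The names `td_complex_discount` and `discounted_dft_bin` are hypothetical. Under these assumptions, the learned value at phase 0, rescaled by `1 - magnitude**n`, should match a DFT bin whose samples are weighted by `magnitude**t`, and this weighted sum approaches the ordinary DFT bin as `magnitude` approaches 1.

```python
import cmath
import math


def td_complex_discount(signal, k, magnitude=0.9, step_size=0.5, sweeps=500):
    """Tabular TD(0) on a periodic signal with a complex discount rate.

    Assumption: state s is the phase within one period, and the reward
    received on the transition s -> s+1 is signal[s].
    """
    n = len(signal)
    # Complex discount: shrink by `magnitude` and rotate by one DFT step of bin k.
    gamma = magnitude * cmath.exp(-2j * math.pi * k / n)
    values = [0j] * n  # one complex value estimate per phase
    for _ in range(sweeps):
        for s in range(n):
            s_next = (s + 1) % n
            # Standard TD(0) error, with a complex-valued bootstrap term.
            td_error = signal[s] + gamma * values[s_next] - values[s]
            values[s] += step_size * td_error
    return values


def discounted_dft_bin(signal, k, magnitude):
    """Direct sum over one period: sum_t (magnitude * e^{-2 pi i k / n})^t * x[t]."""
    n = len(signal)
    gamma = magnitude * cmath.exp(-2j * math.pi * k / n)
    return sum(x * gamma ** t for t, x in enumerate(signal))


if __name__ == "__main__":
    n, k, magnitude = 8, 2, 0.9
    # Illustrative signal: an offset cosine at frequency bin 2.
    signal = [0.5 + math.cos(2 * math.pi * 2 * t / n) for t in range(n)]

    values = td_complex_discount(signal, k, magnitude)
    # Because gamma**n == magnitude**n, the infinite return from phase 0 is a
    # geometric series over periods; rescaling recovers the one-period sum.
    td_estimate = (1 - magnitude ** n) * values[0]
    direct = discounted_dft_bin(signal, k, magnitude)
    print("TD estimate :", td_estimate)
    print("Direct sum  :", direct)
```

The updates are purely incremental: each step touches one table entry and needs only the current sample and the next state's estimate, which is the sense in which the estimate is built online. Choosing a `magnitude` closer to 1 brings the weighted sum closer to the unweighted DFT bin, at the cost of slower convergence in this sketch.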
Related papers
- Discerning Temporal Difference Learning (2023)
- Preferential Temporal Difference Learning (2021)
- On The Statistical Benefits Of Temporal Difference Learning (2023)
- Control Theoretic Analysis Of Temporal Difference Learning (2021)
- Prediction And Control In Continual Reinforcement Learning (2023)
- Differential Temporal Difference Learning (2018)
- Stability And Sensitivity Analysis Of Relative Temporal-difference Learning: Extended Version (2026)
- Gradient Iterated Temporal-difference Learning (2026)