Temporal Difference Flows
2025 · Jesse Farebrother, Matteo Pirotta, Andrea Tirinzoni, et al.
Abstract
Predictive models of the future are fundamental for an agent's ability to reason and plan. A common strategy learns a world model and unrolls it step-by-step at inference, where small errors can rapidly compound. Geometric Horizon Models (GHMs) offer a compelling alternative by directly making predictions of future states, avoiding cumulative inference errors. While GHMs can be conveniently learned by a generative analog to temporal difference (TD) learning, existing methods are negatively affected by bootstrapping predictions at train time and struggle to generate high-quality predictions at long horizons. This paper introduces Temporal Difference Flows (TD-Flow), which leverages the structure of a novel Bellman equation on probability paths alongside flow-matching techniques to learn accurate GHMs at over 5x the horizon length of prior methods. Theoretically, we establish a new convergence result and primarily attribute TD-Flow's efficacy to reduced gradient variance during training.
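For readers new to GHMs, the following sketch writes out the object being learned and the TD-style relation that makes bootstrapping possible. The notation is an assumption made for illustration and does not come from the abstract: a policy \pi, a discount \gamma \in [0, 1), a transition kernel P, and a GHM written m^\pi. This is the standard recursion for geometric horizon models, not the novel Bellman equation on probability paths that the paper introduces.

```latex
% GHM: discounted distribution over future states, starting from (s, a) and following \pi
\[
  m^\pi(X \mid s, a) = (1 - \gamma) \sum_{k=0}^{\infty} \gamma^{k}\,
    \Pr\left(S_{k+1} \in X \mid S_0 = s,\ A_0 = a,\ \pi\right)
\]

% Bellman (TD) relation: the k = 0 term is the one-step kernel,
% and the remaining terms are the GHM evaluated at the next state-action pair
\[
  m^\pi(X \mid s, a) = (1 - \gamma)\, P(X \mid s, a)
    + \gamma\, \mathbb{E}_{S' \sim P(\cdot \mid s, a),\ A' \sim \pi(\cdot \mid S')}
      \left[ m^\pi(X \mid S', A') \right]
\]
```

A sample from m^\pi is a state reached after a geometrically distributed number of steps, so one draw replaces a step-by-step rollout. The second term is where bootstrapped predictions enter training. The effective horizon grows roughly as 1 / (1 - \gamma).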
Related papers
- Generative Temporal Difference Learning For Infinite-horizon Prediction (2020)
- Gradient Iterated Temporal-difference Learning (2026)
- Learning GFlowNets From Partial Episodes For Improved Convergence And Stability (2022)
- On The Statistical Benefits Of Temporal Difference Learning (2023)
- Preferential Temporal Difference Learning (2021)
- Discerning Temporal Difference Learning (2023)
- Differential Temporal Difference Learning (2018)
- Predicting Periodicity With Temporal Difference Learning (2018)