Tractable Representations For Convergent Approximation Of Distributional HJB Equations
2025 · Julie Alhosh, Harley Wiltzer, David Meger
Abstract
In reinforcement learning (RL), the long-term behavior of decision-making policies is evaluated based on their average returns. Distributional RL has emerged, presenting techniques for learning return distributions, which provide additional statistics for evaluating policies, incorporating risk-sensitive considerations. When the passage of time cannot naturally be divided into discrete time increments, researchers have studied the continuous-time RL (CTRL) problem, where agent states and decisions evolve continuously. In this setting, the Hamilton-Jacobi-Bellman (HJB) equation is well established as the characterization of the expected return, and many solution methods exist. However, the study of distributional RL in the continuous-time setting is in its infancy. Recent work has established a distributional HJB (DHJB) equation, providing the first characterization of return distributions in CTRL. These equations and their solutions are intractable to solve and represent exactly, requi
Related papers
- Distributional Hamilton-Jacobi-Bellman Equations For Continuous-Time Reinforcement Learning (2022) 0.00
- Conjugated Discrete Distributions For Distributional Reinforcement Learning (2021) 0.00
- Action Gaps And Advantages In Continuous-Time Distributional Reinforcement Learning (2024) 0.00
- Distributional Reinforcement Learning With Linear Function Approximation (2019) 0.00
- A Comparative Analysis Of Expected And Distributional Reinforcement Learning (2019) 9.76
- Distributional Reinforcement Learning For Multi-Dimensional Reward Functions (2021) 0.00
- A Differential Perspective On Distributional Reinforcement Learning (2025) 0.00
- Distributionally Robust Offline Reinforcement Learning With Linear Function Approximation (2022) 0.00