Continuous-time Value Iteration For Multi-agent Reinforcement Learning
2025 · Xuefeng Wang, Lei Zhang, Henglin Pu, et al.
Abstract
Existing reinforcement learning (RL) methods struggle with complex dynamical systems that demand interactions at high frequencies or irregular time intervals. Continuous-time RL (CTRL) has emerged as a promising alternative by replacing discrete-time Bellman recursion with differential value functions defined as viscosity solutions of the Hamilton-Jacobi-Bellman (HJB) equation. While CTRL has shown promise, its applications have been largely limited to the single-agent domain. This limitation stems from two key challenges: (i) conventional solution methods for HJB equations suffer from the curse of dimensionality (CoD), making them intractable in high-dimensional systems; and (ii) even with HJB-based learning approaches, accurately approximating centralized value functions in multi-agent settings remains difficult, which in turn destabilizes policy training. In this paper, we propose a CT-MARL framework that uses physics-informed neural networks (PINNs) to approximate HJB-based value
Related papers
- Continuous-time Value Function Approximation In Reproducing Kernel Hilbert Spaces (2018)
- Continuous Time Continuous Space Homeostatic Reinforcement Learning (CTCS-HRRL): Towards Biological Self-autonomous Agent (2024)
- Managing Temporal Resolution In Continuous Value Estimation: A Fundamental Trade-off (2022)
- Deep RL With Information Constrained Policies: Generalization In Continuous Control (2020)
- Demystifying Reinforcement Learning In Time-varying Systems (2022)
- Tractable Representations For Convergent Approximation Of Distributional HJB Equations (2025)
- Centralized Cooperative Exploration Policy For Continuous Control Tasks (2023)
- Sample And Computationally Efficient Continuous-time Reinforcement Learning With General Function Approximation (2025)