Nonparametric Bellman Mappings For Value Iteration In Distributed Reinforcement Learning
2025 · Yuki Akiyama, Konstantinos Slavakis
Abstract
This paper introduces novel Bellman mappings (B-Maps) for value iteration (VI) in distributed reinforcement learning (DRL), where agents are deployed over an undirected, connected graph/network with arbitrary topology -- but without a centralized node, that is, a node capable of aggregating all data and performing computations. Each agent constructs a nonparametric B-Map from its private data, operating on Q-functions represented in a reproducing kernel Hilbert space, with flexibility in choosing the basis for their representation. Agents exchange their Q-function estimates only with direct neighbors, and unlike existing DRL approaches that restrict communication to Q-functions, the proposed framework also enables the transmission of basis information in the form of covariance matrices, thereby conveying additional structural details. Linear convergence rates are established for both Q-function and covariance-matrix estimates toward their consensus values, regardless of the network top
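The neighbor-only exchange described in the abstract can be illustrated with a generic consensus-averaging sketch. This is not the paper's B-Map or its VI update. It is plain linear averaging with Metropolis weights, a standard technique chosen here as an assumption, and it shows only how estimates held at the nodes of an undirected, connected graph can reach a common value at a linear rate without a central node. The names `metropolis_weights`, `consensus`, and `local`, the 4-node path graph, and the random vectors standing in for Q-function coefficients are all hypothetical.

```python
import numpy as np


def metropolis_weights(adjacency):
    """Build a symmetric, doubly stochastic mixing matrix for an undirected graph.

    Assumption: Metropolis weights are used for illustration only; the paper's
    actual mixing scheme is not stated in the abstract.
    """
    A = np.asarray(adjacency, dtype=float)
    n = A.shape[0]
    deg = A.sum(axis=1)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and A[i, j] > 0:
                # Each edge weight depends only on the degrees of its endpoints.
                W[i, j] = 1.0 / (1.0 + max(deg[i], deg[j]))
        # The self-weight makes every row sum to one.
        W[i, i] = 1.0 - W[i].sum()
    return W


def consensus(local_estimates, W, num_rounds):
    """Run neighbor-only averaging and record the distance to the network average."""
    X = np.asarray(local_estimates, dtype=float).copy()
    target = X.mean(axis=0)  # the consensus value for this linear scheme
    errors = []
    for _ in range(num_rounds):
        # Row i of W @ X mixes only node i's estimate with its neighbors' estimates.
        X = W @ X
        errors.append(np.linalg.norm(X - target))
    return X, errors


# Hypothetical 4-node path graph: 0 - 1 - 2 - 3 (undirected and connected).
adjacency = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
W = metropolis_weights(adjacency)

# One row per agent; each row stands in for a Q-function coefficient vector.
rng = np.random.default_rng(0)
local = rng.normal(size=(4, 3))

final, errors = consensus(local, W, num_rounds=50)
```

Because `W` is symmetric and doubly stochastic, the recorded error shrinks by at least a fixed factor each round. That factor is the second-largest eigenvalue magnitude of `W`. The same averaging step could in principle be applied to flattened matrices, such as the covariance matrices the abstract mentions, but that extension is likewise an assumption.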
Related papers
- Nonparametric Bellman Mappings For Reinforcement Learning: Application To Robust Adaptive Filtering (2024)
- Proximal Bellman Mappings For Reinforcement Learning And Their Application To Robust Adaptive Filtering (2023)
- Orchestrated Value Mapping For Reinforcement Learning (2022)
- Spectral Bellman Method: Unifying Representation And Exploration In RL (2025)
- Distributed Value Function Approximation For Collaborative Multi-agent Reinforcement Learning (2020)
- Iterated Q-network: Beyond One-step Bellman Updates In Deep Reinforcement Learning (2024)
- Factored Value Functions For Graph-based Multi-agent Reinforcement Learning (2026)
- Distributional Reinforcement Learning For Multi-dimensional Reward Functions (2021)