Scalable Spectral Representations For Multi-agent Reinforcement Learning In Network MDPs
2024 · Zhaolin Ren, Runyu Zhang, Bo Dai, et al.
Abstract
Network Markov Decision Processes (MDPs), a popular model for multi-agent control, pose a significant challenge to efficient learning due to the exponential growth of the global state-action space with the number of agents. In this work, utilizing the exponential decay property of network dynamics, we first derive scalable spectral local representations for network MDPs, which induce a network linear subspace for the local \(Q\)-function of each agent. Building on these local spectral representations, we design a scalable algorithmic framework for continuous state-action network MDPs, and provide end-to-end guarantees for the convergence of our algorithm. Empirically, we validate the effectiveness of our scalable representation-based approach on two benchmark problems, and demonstrate the advantages of our approach over generic function approximation approaches to representing the local \(Q\)-functions.
Related papers
- Multi-timescale Ensemble Q-learning For Markov Decision Process Policy Optimization (2024)
- Multi-agent Reinforcement Learning In Stochastic Networked Systems (2020)
- Decentralised Q-learning For Multi-agent Markov Decision Processes With A Satisfiability Criterion (2023)
- Scalable Multi-agent Reinforcement Learning For Networked Systems With Average Reward (2020)
- Continuous-time Distributed Dynamic Programming For Networked Multi-agent Markov Decision Processes (2023)
- Scalable Reinforcement Learning For Multi-agent Networked Systems (2019)
- Scalable Planning In Multi-agent MDPs (2021)
- Scalable And Sample Efficient Distributed Policy Gradient Algorithms In Multi-agent Networked Systems (2022)