Decomposing Communication Gain And Delay Cost Under Cross-timestep Delays In Cooperative Multi-agent Reinforcement Learning
2026 · Zihong Gao, Hongjian Liang, Lei Hao, et al.
Abstract
Communication is essential for coordination in *cooperative* multi-agent reinforcement learning under partial observability, yet *cross-timestep* delays cause messages to arrive multiple timesteps after generation, inducing temporal misalignment and making information stale when consumed. We formalize this setting as a delayed-communication partially observable Markov game (DeComm-POMG) and decompose a message's effect into *communication gain* and *delay cost*, yielding the Communication Gain and Delay Cost (CGDC) metric. We further establish a value-loss bound showing that the degradation induced by delayed messages is upper-bounded by a discounted accumulation of an information gap between the action distributions induced by timely versus delayed messages. Guided by CGDC, we propose **CDCMA**, an actor-critic framework that requests messages only when predicted CGDC is positive, predicts future observations to reduce misalignment at consumption, and fuses delayed messa
Related papers
- DACOM: Learning Delay-aware Communication For Multi-agent Reinforcement Learning (2022)
- Delay-aware Multi-agent Reinforcement Learning For Cooperative And Competitive Environments (2020)
- RGMComm: Return Gap Minimization Via Discrete Communications In Multi-agent Reinforcement Learning (2023)
- Multi-agent Reinforcement Learning With Communication-constrained Priors (2025)
- Effective Communications: A Joint Learning And Communication Framework For Multi-agent Reinforcement Learning Over Noisy Channels (2021)
- PAC Guarantees For Cooperative Multi-agent Reinforcement Learning With Restricted Communication (2019)
- Provably Efficient Multi-agent Reinforcement Learning With Fully Decentralized Communication (2021)
- Communicating Plans, Not Percepts: Scalable Multi-agent Coordination With Embodied World Models (2025)