RGMComm: Return Gap Minimization via Discrete Communications in Multi-Agent Reinforcement Learning
2023 · Jingdi Chen, Tian Lan, Carlee Joe-Wong
Abstract
Communication is crucial for solving cooperative Multi-Agent Reinforcement Learning tasks in partially observable Markov Decision Processes. Existing works often rely on black-box methods to encode local information/features into messages shared with other agents, leading to the generation of continuous messages with high communication overhead and poor interpretability. Prior attempts at discrete communication methods generate one-hot vectors trained as part of agents' actions and use the Gumbel softmax operation for calculating message gradients, which are all heuristic designs that do not provide any quantitative guarantees on the expected return. This paper establishes an upper bound on the return gap between an ideal policy with full observability and an optimal partially observable policy with discrete communication. This result enables us to recast multi-agent communication into a novel online clustering problem over the local observations at each agent, with messages as cluster
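The idea of treating communication as online clustering over each agent's local observations can be illustrated with a minimal sketch. This is not the paper's algorithm: the class name `OnlineKMeansMessenger`, the squared Euclidean distance, and the running-mean centroid update are all assumptions chosen for simplicity, and the paper's own clustering objective is not shown in this excerpt. The sketch only shows the general shape, in which an agent maps an observation to the index of its nearest cluster and sends that index as a discrete message.

```python
import random


class OnlineKMeansMessenger:
    """Hypothetical helper: turns a local observation into a discrete message.

    The message is the index of the nearest centroid. Centroids are updated
    one observation at a time (online k-means with a running-mean step).
    """

    def __init__(self, num_messages, obs_dim, seed=0):
        rng = random.Random(seed)
        # Random initial centroids in [-1, 1]^obs_dim (an arbitrary choice).
        self.centroids = [
            [rng.uniform(-1.0, 1.0) for _ in range(obs_dim)]
            for _ in range(num_messages)
        ]
        self.counts = [0] * num_messages

    def message(self, obs):
        """Return the index of the centroid closest to obs (no update)."""
        dists = [
            sum((o - c) ** 2 for o, c in zip(obs, centroid))
            for centroid in self.centroids
        ]
        return min(range(len(dists)), key=dists.__getitem__)

    def update(self, obs):
        """Assign obs to its nearest centroid, move that centroid, return the index."""
        k = self.message(obs)
        self.counts[k] += 1
        step = 1.0 / self.counts[k]  # running mean of the points assigned to k
        self.centroids[k] = [
            c + step * (o - c) for o, c in zip(obs, self.centroids[k])
        ]
        return k


if __name__ == "__main__":
    messenger = OnlineKMeansMessenger(num_messages=4, obs_dim=2)
    stream = [[0.9, 0.8], [-0.7, -0.9], [0.85, 0.75], [-0.6, -0.8]]
    for obs in stream:
        print(obs, "-> message", messenger.update(obs))
```

With `num_messages` clusters, each message is a single integer in `range(num_messages)`, so it can be sent as a short discrete symbol rather than a continuous vector.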
Related papers
- An Analysis Of Discretization Methods For Communication Learning With Multi-agent Reinforcement Learning (2022) 0.00
- Minimizing Communication While Maximizing Performance In Multi-agent Reinforcement Learning (2021) 0.00
- An In-depth Analysis Of Discretization Methods For Communication Learning Using Backpropagation With Multi-agent Reinforcement Learning (2023) 0.00
- Multi-agent Reinforcement Learning With Communication-constrained Priors (2025) 0.00
- Learning Emergent Discrete Message Communication For Cooperative Reinforcement Learning (2021) 5.24
- Learning What To Say And How Precisely: Efficient Communication Via Differentiable Discrete Communication Learning (2025) 0.00
- Decomposing Communication Gain And Delay Cost Under Cross-timestep Delays In Cooperative Multi-agent Reinforcement Learning (2026) 0.00
- Provably Efficient Multi-agent Reinforcement Learning With Fully Decentralized Communication (2021) 0.00