Context-aware Bayesian Network Actor-critic Methods For Cooperative Multi-agent Reinforcement Learning
2023 · Dingyang Chen, Qi Zhang
Abstract
Executing actions in a correlated manner is a common strategy for human coordination that often leads to better cooperation, and it is potentially beneficial for cooperative multi-agent reinforcement learning (MARL) as well. However, the recent success of MARL relies heavily on the convenient paradigm of purely decentralized execution, which, for scalability reasons, allows no action correlation among agents. In this work, we introduce a Bayesian network to establish correlations between agents' action selections in their joint policy. We provide a theoretical justification for why action dependencies are beneficial by deriving the multi-agent policy gradient formula under such a Bayesian network joint policy and proving its global convergence to Nash equilibria under tabular softmax policy parameterization in cooperative Markov games. Further, by equipping existing MARL algorithms with a recent method of differentiable directed acyclic graphs (DAGs), we develop pr
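The Bayesian network joint policy mentioned in the abstract can be illustrated with a minimal sketch. It assumes the joint policy factorizes along a DAG, so that each agent's action distribution is conditioned on the state and on the actions already chosen by its parent agents, with tabular softmax logits. The function name `sample_joint_action`, the `parents` list, and the layout of `logits` are hypothetical choices for illustration, not taken from the paper.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array of logits.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def sample_joint_action(logits, parents, state, rng):
    """Sample a joint action agent by agent along the DAG.

    Assumptions (illustrative only):
    - agents are indexed in topological order, so every parent index is < i;
    - logits[i] has one axis for the state, one axis per parent action,
      and a final axis for agent i's own action.
    """
    actions = []
    log_prob = 0.0
    for i, pa in enumerate(parents):
        # Select the logits row for this state and the parents' sampled actions.
        idx = (state,) + tuple(actions[j] for j in pa)
        probs = softmax(logits[i][idx])
        a = int(rng.choice(len(probs), p=probs))
        log_prob += float(np.log(probs[a]))  # log of the factorized joint probability
        actions.append(a)
    return actions, log_prob

# Example: three agents; agent 1 depends on agent 0, agent 2 on agents 0 and 1.
rng = np.random.default_rng(0)
n_states, n_actions = 2, 3
parents = [[], [0], [0, 1]]
logits = [rng.normal(size=(n_states,) + (n_actions,) * (len(pa) + 1))
          for pa in parents]
actions, log_prob = sample_joint_action(logits, parents, state=1, rng=rng)
```

With every entry of `parents` set to an empty list, the sketch reduces to a product of independent per-agent policies, which corresponds to the purely decentralized execution the abstract contrasts against.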
Related papers
- Learning To Coordinate In Multi-agent Systems: A Coordinated Actor-critic Algorithm And Finite-time Guarantees (2021)
- Fully Decentralized Multi-agent Reinforcement Learning With Networked Agents (2018)
- Bi-level Actor-critic For Multi-agent Coordination (2019)
- Multi-agent Reinforcement Learning In Stochastic Networked Systems (2020)
- Multi-agent Actor-critic For Mixed Cooperative-competitive Environments (2017)
- Actor-critic Algorithms For Constrained Multi-agent Reinforcement Learning (2019)
- Scalable Multi-agent Reinforcement Learning For Networked Systems With Average Reward (2020)
- MARL With General Utilities Via Decentralized Shadow Reward Actor-critic (2021)