Communication-efficient Actor-critic Methods For Homogeneous Markov Games
2022 · Dingyang Chen, Yile Li, Qi Zhang
Abstract
Recent success in cooperative multi-agent reinforcement learning (MARL) relies on centralized training and policy sharing. Centralized training eliminates the issue of non-stationarity in MARL yet induces large communication costs, and policy sharing is empirically crucial to efficient learning in certain tasks yet lacks theoretical justification. In this paper, we formally characterize a subclass of cooperative Markov games where agents exhibit a certain form of homogeneity such that policy sharing provably incurs no suboptimality. This enables us to develop the first consensus-based decentralized actor-critic method where the consensus update is applied to both the actors and the critics while ensuring convergence. We also develop practical algorithms based on our decentralized actor-critic method to reduce the communication cost during training, while still yielding policies comparable with centralized training.
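To make the idea of a consensus update concrete, here is a minimal sketch of generic parameter averaging over a communication graph. It is not the paper's algorithm: the abstract does not give the update rule, the mixing weights, or how gradient steps are interleaved, so the ring topology, the uniform weights, and the names `ring_weights` and `consensus_step` are all assumptions made for illustration. The sketch only shows that when every agent holds a parameter vector of the same shape (as policy sharing among homogeneous agents would allow), repeated neighbor-only averaging with a doubly stochastic matrix drives all agents toward the network-wide mean without a central server.

```python
def ring_weights(n):
    """Hypothetical mixing matrix for a ring of n agents.

    Each agent puts weight 1/3 on itself and on each of its two ring
    neighbors. The matrix is symmetric with rows summing to 1, so it is
    doubly stochastic and preserves the average of the parameters.
    """
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            w[i][j % n] += 1.0 / 3.0
    return w


def consensus_step(params, weights):
    """One consensus round: agent i replaces its vector with a weighted
    average of the vectors held by the agents it communicates with.

    params[i] is agent i's parameter vector (actor or critic weights,
    flattened); all vectors are assumed to have the same length.
    """
    n = len(params)
    dim = len(params[0])
    return [
        [sum(weights[i][j] * params[j][k] for j in range(n)) for k in range(dim)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Four agents start with different one-dimensional parameters.
    params = [[0.0], [4.0], [8.0], [12.0]]
    weights = ring_weights(len(params))
    for _ in range(50):
        params = consensus_step(params, weights)
    print(params)  # every agent ends up close to the initial mean
```

In a full actor-critic method each agent would also take local gradient steps between consensus rounds; that part is omitted here because the abstract does not specify it.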
Related papers
- Convergence Of Decentralized Actor-critic Algorithm In General-sum Markov Games (2024)
- F2A2: Flexible Fully-decentralized Approximate Actor-critic For Cooperative Multi-agent Reinforcement Learning (2020)
- On Centralized Critics In Multi-agent Reinforcement Learning (2024)
- Multi-agent Actor-critic For Mixed Cooperative-competitive Environments (2017)
- Context-aware Bayesian Network Actor-critic Methods For Cooperative Multi-agent Reinforcement Learning (2023)
- Bi-level Actor-critic For Multi-agent Coordination (2019)
- Contrasting Centralized And Decentralized Critics In Multi-agent Reinforcement Learning (2021)
- Convergence Rates For Localized Actor-critic In Networked Markov Potential Games (2023)