TAPE: Leveraging Agent Topology For Cooperative Multi-agent Policy Gradient
2023 · Xingzhou Lou, Junge Zhang, Timothy J. Norman, et al.
Abstract
Multi-Agent Policy Gradient (MAPG) has made significant progress in recent years. However, centralized critics in state-of-the-art MAPG methods still face the centralized-decentralized mismatch (CDM) issue, which means that sub-optimal actions by some agents affect other agents' policy learning. While using individual critics for policy updates can avoid this issue, they severely limit cooperation among agents. To address this issue, we propose an agent topology framework, which decides whether other agents should be considered in the policy gradient and achieves a compromise between facilitating cooperation and alleviating the CDM issue. The agent topology allows agents to use coalition utility as the learning objective instead of the global utility given by centralized critics or the local utility given by individual critics. To constitute the agent topology, various models are studied. We propose Topology-based multi-Agent Policy gradiEnt (TAPE) for both stochastic and deterministic MAPG methods. We prove the po
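The contrast between local, coalition, and global utility can be illustrated with a toy sketch. It assumes that an agent's coalition utility is the plain sum of the utilities of the agents it is connected to; the function name, the 0/1 adjacency-matrix representation of the topology, and the numbers are all illustrative assumptions, not the paper's implementation.

```python
def coalition_utilities(topology, utilities):
    """Return one coalition utility per agent.

    topology[i][j] is 1 if agent i takes agent j into account, else 0
    (hypothetical representation). Summation is an assumed aggregation.
    """
    n = len(utilities)
    if len(topology) != n or any(len(row) != n for row in topology):
        raise ValueError("topology must be an n x n matrix")
    return [
        sum(u for connected, u in zip(row, utilities) if connected)
        for row in topology
    ]


# Made-up per-agent utilities for three agents.
utilities = [1.0, 2.0, 3.0]

# Only self-connections: each agent sees its own local utility.
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
# Fully connected: every agent sees the global (summed) utility.
full = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
# Something in between: agent 0 considers agent 1, agent 2 considers agent 0.
partial = [[1, 1, 0], [0, 1, 0], [1, 0, 1]]

local_u = coalition_utilities(identity, utilities)
global_u = coalition_utilities(full, utilities)
coalition_u = coalition_utilities(partial, utilities)
```

Under these assumptions, the self-only topology recovers the individual-critic case and the fully connected topology recovers the centralized-critic case, while a sparser topology sits between the two.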
Related papers
- Asynchronous, Option-based Multi-agent Policy Gradient: A Conditional Reasoning Approach (2022)
- Optimistic Multi-agent Policy Gradient (2023)
- Learning Explicit Credit Assignment For Cooperative Multi-agent Reinforcement Learning Via Polarization Policy Gradient (2022)
- Settling The Variance Of Multi-agent Policy Gradients (2021)
- Counterfactual Multi-agent Policy Gradients (2017)
- A Policy Gradient Algorithm For Learning To Learn In Multiagent Reinforcement Learning (2020)
- Multi-agent Guided Policy Optimization (2025)
- Descent-guided Policy Gradient For Scalable Cooperative Multi-agent Learning (2026)