Contrastive Identity-aware Learning For Multi-agent Value Decomposition
2022 · Shunyu Liu, Yihe Zhou, Jie Song, et al.
Abstract
Value Decomposition (VD) aims to deduce the contributions of agents for decentralized policies in the presence of only global rewards, and has recently emerged as a powerful credit assignment paradigm for tackling cooperative Multi-Agent Reinforcement Learning (MARL) problems. One of the main challenges in VD is to promote diverse behaviors among agents, while existing methods directly encourage the diversity of learned agent networks with various strategies. However, we argue that these dedicated designs for agent networks are still limited by the indistinguishable VD network, leading to homogeneous agent behaviors and thus downgrading the cooperation capability. In this paper, we propose a novel Contrastive Identity-Aware learning (CIA) method, explicitly boosting the credit-level distinguishability of the VD network to break the bottleneck of multi-agent diversity. Specifically, our approach leverages contrastive learning to maximize the mutual information between the temporal credi
Related papers
- Adaptive Value Decomposition With Greedy Marginal Contribution Computation For Cooperative Multi-agent Reinforcement Learning (2023)
- Dual Self-awareness Value Decomposition Framework Without Individual Global Max For Cooperative Multi-agent Reinforcement Learning (2023)
- SVDE: Scalable Value-decomposition Exploration For Cooperative Multi-agent Reinforcement Learning (2023)
- Heterogeneous Value Decomposition Policy Fusion For Multi-agent Cooperation (2025)
- VDFD: Multi-agent Value Decomposition Framework With Disentangled World Model (2023)
- Privacy-engineered Value Decomposition Networks For Cooperative Multi-agent Reinforcement Learning (2023)
- MMD-MIX: Value Function Factorisation With Maximum Mean Discrepancy For Cooperative Multi-agent Reinforcement Learning (2021)
- Modeling The Interaction Between Agents In Cooperative Multi-agent Reinforcement Learning (2021)