Heterogeneous Value Decomposition Policy Fusion For Multi-agent Cooperation
2025 · Siying Wang, Yang Zhou, Zhitong Zhao, et al.
Abstract
Value decomposition (VD) has become one of the most prominent solutions in cooperative multi-agent reinforcement learning. Most existing methods explore how to factorize the joint value and minimize the discrepancies between agent observations and characteristics of environmental states. However, direct decomposition may result in limited representation or difficulty in optimization. Orthogonal to designing a new factorization scheme, in this paper we propose Heterogeneous Policy Fusion (HPF) to integrate the strengths of various VD methods. We construct a composite policy set from which policies are adaptively selected for interaction. This adaptive mechanism allows agents' trajectories to benefit from diverse policy transitions while incorporating the advantages of each factorization method. Additionally, HPF introduces a constraint between these heterogeneous policies to rectify misleading updates caused by unexpected exploratory or suboptimal non-cooperation. Expe
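As background for the phrase "factorize the joint value", the following is a minimal sketch of a generic additive (VDN-style) factorization, not of HPF itself. The utility table `agent_q` and all function names are hypothetical and chosen only for illustration. It shows that when the joint value is a sum of per-agent utilities, each agent's independent greedy choice matches the greedy joint action.

```python
from itertools import product

# Hypothetical per-agent utilities Q_i(a_i): 2 agents, 3 actions each.
agent_q = [
    [1.0, 3.0, 2.0],  # agent 0
    [0.5, 0.2, 4.0],  # agent 1
]


def q_tot_additive(agent_q, joint_action):
    """Additive joint value: sum of each agent's utility for its chosen action."""
    return sum(q[a] for q, a in zip(agent_q, joint_action))


def decentralized_greedy(agent_q):
    """Each agent independently picks the argmax of its own utility."""
    return tuple(max(range(len(q)), key=q.__getitem__) for q in agent_q)


def centralized_greedy(agent_q):
    """Brute-force argmax of the joint value over every joint action."""
    joint_actions = product(*(range(len(q)) for q in agent_q))
    return max(joint_actions, key=lambda ja: q_tot_additive(agent_q, ja))


greedy_local = decentralized_greedy(agent_q)
greedy_joint = centralized_greedy(agent_q)
```

Under this additive assumption the two selections coincide, which is what makes decentralized execution consistent with the centrally trained joint value. Other factorization schemes relax the sum to richer mixing functions, at the cost of the representation or optimization difficulties the abstract mentions.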
Related papers
- Dual Self-awareness Value Decomposition Framework Without Individual Global Max For Cooperative Multi-agent Reinforcement Learning (2023) 0.00
- Understanding Value Decomposition Algorithms In Deep Cooperative Multi-agent Reinforcement Learning (2022) 0.00
- VDFD: Multi-agent Value Decomposition Framework With Disentangled World Model (2023) 0.00
- SVDE: Scalable Value-decomposition Exploration For Cooperative Multi-agent Reinforcement Learning (2023) 0.00
- Contrastive Identity-aware Learning For Multi-agent Value Decomposition (2022) 9.41
- More Centralized Training, Still Decentralized Execution: Multi-agent Conditional Policy Factorization (2022) 0.00
- Adaptive Value Decomposition With Greedy Marginal Contribution Computation For Cooperative Multi-agent Reinforcement Learning (2023) 3.58
- Towards Understanding Cooperative Multi-agent Q-learning With Value Factorization (2020) 0.00