ME-IGM: Individual-global-max In Maximum Entropy Multi-agent Reinforcement Learning
2024 · Wen-Tse Chen, Yuxuan Li, Shiyu Huang, et al.
Abstract
Multi-agent credit assignment is a fundamental challenge for cooperative multi-agent reinforcement learning (MARL), where a team of agents learn from shared reward signals. The Individual-Global-Max (IGM) condition is a widely used principle for multi-agent credit assignment, requiring that the joint action determined by individual Q-functions maximizes the global Q-value. Meanwhile, the principle of maximum entropy has been leveraged to enhance exploration in MARL. However, we identify a critical limitation in existing maximum entropy MARL methods: a misalignment arises between local policies and the joint policy that maximizes the global Q-value, leading to violations of the IGM condition. To address this misalignment, we propose an order-preserving transformation. Building on it, we introduce ME-IGM, a novel maximum entropy MARL algorithm compatible with any credit assignment mechanism that satisfies the IGM condition while enjoying the benefits of maximum entropy exploration. We em
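The IGM condition described in the abstract can be illustrated on a small tabular case. The sketch below is not the paper's method: the helper `satisfies_igm`, the Q-values, and the additive (VDN-style) mixing are all assumptions chosen for illustration. It only checks whether the joint action formed from each agent's greedy local action also maximizes a given joint Q-table.

```python
import numpy as np


def satisfies_igm(local_qs, joint_q):
    """Return True if the per-agent greedy actions also maximize joint_q.

    local_qs: list of 1-D arrays, one vector of local Q-values per agent.
    joint_q:  array with one axis per agent, holding the global Q-values.
    (Hypothetical helper, written for this illustration only.)
    """
    # Each agent picks its own greedy action from its local Q-function.
    greedy = tuple(int(np.argmax(q)) for q in local_qs)
    # IGM holds here if that joint action attains the global maximum.
    return bool(np.isclose(joint_q[greedy], joint_q.max()))


# Made-up local Q-values for two agents with three actions each.
q1 = np.array([1.0, 3.0, 2.0])
q2 = np.array([0.5, 0.2, 4.0])

# Assumption: additive mixing, Q_tot(a1, a2) = Q_1(a1) + Q_2(a2).
q_additive = q1[:, None] + q2[None, :]

# A joint table whose maximum is moved away from the greedy joint action.
q_mismatch = q_additive.copy()
q_mismatch[0, 0] = 10.0

igm_additive = satisfies_igm([q1, q2], q_additive)
igm_mismatch = satisfies_igm([q1, q2], q_mismatch)
print(igm_additive, igm_mismatch)
```

With additive mixing the greedy joint action is always a maximizer of the sum, so the first check passes; the second table is constructed so that the per-agent greedy actions miss the global maximum.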
Related papers
- Rethinking Individual Global Max In Cooperative Multi-agent Reinforcement Learning (2022)
- Maximum Entropy Heterogeneous-agent Reinforcement Learning (2023)
- Cooperative Game-theoretic Credit Assignment For Multi-agent Policy Gradients Via The Core (2025)
- Learning Explicit Credit Assignment For Cooperative Multi-agent Reinforcement Learning Via Polarization Policy Gradient (2022)
- FM3Q: Factorized Multi-agent Minimax Q-learning For Two-team Zero-sum Markov Game (2024)
- Incentivize Without Bonus: Provably Efficient Model-based Online Multi-agent RL For Markov Games (2025)
- Asynchronous Credit Assignment For Multi-agent Reinforcement Learning (2024)
- GHQ: Grouped Hybrid Q Learning For Heterogeneous Cooperative Multi-agent Reinforcement Learning (2023)