FM3Q: Factorized Multi-agent Minimax Q-learning For Two-team Zero-sum Markov Game
2024 · Guangzheng Hu, Yuanheng Zhu, Haoran Li, et al.
Abstract
Many real-world applications involve agents that fall into two teams, with payoffs that are equal within a team but of opposite sign across teams. These so-called two-team zero-sum Markov games (2t0sMGs) have been tackled with reinforcement learning in recent years. However, existing methods remain inefficient because they insufficiently address intra-team credit assignment, data utilization, and computational intractability. In this paper, we propose the individual-global-minimax (IGMM) principle to ensure coherence between two-team minimax behaviors and individual greedy behaviors through Q functions in 2t0sMGs. Based on it, we present a novel multi-agent reinforcement learning framework, Factorized Multi-Agent MiniMax Q-Learning (FM3Q), which factorizes the joint minimax Q function into individual ones and iteratively solves for the IGMM-satisfying minimax Q functions for 2t0sMGs. Moreover, an online learning algorithm with neural networks is prop
Related papers
- A Generalized Minimax Q-learning Algorithm For Two-player Zero-sum Stochastic Games (2019)
- Decentralized Q-learning In Zero-sum Markov Games (2021)
- Minimax-optimal Multi-agent Robust Reinforcement Learning (2024)
- Finite-time Analysis Of Minimax Q-learning For Two-player Zero-sum Markov Games: Switching System Approach (2023)
- Factorized Q-learning For Large-scale Multi-agent Systems (2018)
- ME-IGM: Individual-global-max In Maximum Entropy Multi-agent Reinforcement Learning (2024)
- Mitigating Relative Over-generalization In Multi-agent Reinforcement Learning (2024)
- Analysis Of Multiscale Reinforcement Q-learning Algorithms For Mean Field Control Games (2024)