QTRAN++: Improved Value Transformation For Cooperative Multi-agent Reinforcement Learning
2020 · Kyunghwan Son, Sungsoo Ahn, Roben Delos Reyes, et al.
Abstract
QTRAN is a multi-agent reinforcement learning (MARL) algorithm capable of learning the largest class of joint-action value functions to date. However, despite its strong theoretical guarantee, it has shown poor empirical performance in complex environments such as the StarCraft Multi-Agent Challenge (SMAC). In this paper, we identify the performance bottleneck of QTRAN and propose a substantially improved version, coined QTRAN++. Our gains come from (i) stabilizing the training objective of QTRAN, (ii) removing the strict role separation between the action-value estimators of QTRAN, and (iii) introducing a multi-head mixing network for value transformation. Through extensive evaluation, we confirm that our diagnosis is correct and that QTRAN++ successfully bridges the gap between empirical performance and theoretical guarantee. In particular, QTRAN++ achieves new state-of-the-art performance in the SMAC environment. The code will be released.
Related papers
- Transformer-based Value Function Decomposition For Cooperative Multi-agent Reinforcement Learning In Starcraft (2022)
- Residual Q-networks For Value Function Factorizing In Multi-agent Reinforcement Learning (2022)
- NQMIX: Non-monotonic Value Function Factorization For Deep Multi-agent Reinforcement Learning (2021)
- Qatten: A General Framework For Cooperative Multiagent Reinforcement Learning (2020)
- Mixed Q-functionals: Advancing Value-based Methods In Cooperative MARL With Continuous Action Domains (2024)
- Qfree: A Universal Value Function Factorization For Multi-agent Reinforcement Learning (2023)
- Towards Multi-agent Reinforcement Learning Using Quantum Boltzmann Machines (2021)
- QPLEX: Duplex Dueling Multi-agent Q-learning (2020)