Residual Q-networks For Value Function Factorizing In Multi-agent Reinforcement Learning
2022 · Rafael Pina, Varuna de Silva, Joosep Hook, et al.
Abstract
Multi-Agent Reinforcement Learning (MARL) is useful in many problems that require the cooperation and coordination of multiple agents. Learning optimal policies with reinforcement learning in a multi-agent setting becomes very difficult as the number of agents increases. Recent solutions such as Value Decomposition Networks (VDN), QMIX, QTRAN and QPLEX adhere to the centralized training and decentralized execution scheme and factorize the joint action-value function. However, these methods still struggle as environmental complexity increases, and at times fail to converge in a stable manner. We propose the novel concept of Residual Q-Networks (RQNs) for MARL, which learn to transform the individual Q-value trajectories in a way that preserves the Individual-Global-Max (IGM) criterion while being more robust in factorizing action-value functions. The RQN acts as an auxiliary network that accelerates convergence and will become obsolete as the agents reach the training objectiv
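The IGM criterion mentioned in the abstract can be illustrated with a small sketch. It does not implement RQN. It only shows, for a VDN-style additive factorization with made-up Q-values, that each agent acting greedily on its own Q-values selects the same joint action as a centralized argmax over the summed value. The function names and the numbers are hypothetical and chosen for illustration.

```python
import itertools
import numpy as np


def vdn_joint_q(agent_qs, joint_action):
    """VDN-style factorization: Q_tot is the sum of each agent's chosen Q-value."""
    return sum(q[a] for q, a in zip(agent_qs, joint_action))


def decentralized_greedy(agent_qs):
    """Each agent independently picks the argmax of its own Q-values."""
    return tuple(int(np.argmax(q)) for q in agent_qs)


def centralized_greedy(agent_qs):
    """Brute-force argmax of Q_tot over every joint action."""
    action_ranges = [range(len(q)) for q in agent_qs]
    return max(
        itertools.product(*action_ranges),
        key=lambda joint_action: vdn_joint_q(agent_qs, joint_action),
    )


# Hypothetical per-agent Q-values for one state (three agents, differing action counts).
agent_qs = [
    np.array([0.1, 0.7, 0.2]),
    np.array([1.5, -0.3]),
    np.array([0.0, 0.4, 0.9, 0.3]),
]

# IGM holds when the decentralized greedy choice matches the centralized one.
igm_holds = decentralized_greedy(agent_qs) == centralized_greedy(agent_qs)
print(decentralized_greedy(agent_qs), centralized_greedy(agent_qs), igm_holds)
```

The brute-force search is exponential in the number of agents, which is why factorizations that satisfy IGM matter: they let execution rely on the cheap per-agent argmax alone.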
Related papers
- QFree: A Universal Value Function Factorization For Multi-agent Reinforcement Learning (2023)
- NQMIX: Non-monotonic Value Function Factorization For Deep Multi-agent Reinforcement Learning (2021)
- Q-value Path Decomposition For Deep Multiagent Reinforcement Learning (2020)
- RiskQ: Risk-sensitive Multi-agent Reinforcement Learning Value Factorization (2023)
- Qatten: A General Framework For Cooperative Multiagent Reinforcement Learning (2020)
- ConcaveQ: Non-monotonic Value Function Factorization Via Concave Representations In Deep Multi-agent Reinforcement Learning (2023)
- Monotonic Value Function Factorisation For Deep Multi-agent Reinforcement Learning (2020)
- Weighted QMIX: Expanding Monotonic Value Function Factorisation For Deep Multi-agent Reinforcement Learning (2020)