QFree: A Universal Value Function Factorization For Multi-agent Reinforcement Learning
2023 · Rizhong Wang, Huiping Li, Di Cui, et al.
Abstract
Centralized training is widely used in multi-agent reinforcement learning (MARL) to ensure the stability of the training process. Once a joint policy is obtained, it is critical to design a value function factorization method that extracts optimal decentralized policies for the agents, and this factorization must satisfy the individual-global-max (IGM) principle. Imposing additional limitations on the IGM function class can help to meet the requirement, but it comes at the cost of restricting its application to more complex multi-agent environments. In this paper, we propose QFree, a universal value function factorization method for MARL. We start by developing mathematically equivalent conditions of the IGM principle based on the advantage function, which ensure that the principle holds without any compromise, removing the conservatism of conventional methods. We then establish a more expressive mixing network architecture that can fulfill the equivalent factorization. In particular, th
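The IGM principle named in the abstract can be illustrated on a small tabular case. The sketch below is not the QFree method or its mixing network; it only checks, for a single state, whether the joint action formed from each agent's greedy individual action also maximizes the joint value. The function name `satisfies_igm` and all numeric tables are hypothetical, chosen for illustration, and ties are handled by comparing values rather than argmax sets.

```python
import numpy as np

def satisfies_igm(q_joint, q_individual):
    """Check the IGM condition for one state.

    q_joint: array of shape (|A_1|, ..., |A_n|) with joint action values.
    q_individual: list of 1-D arrays, one per agent.
    Returns True if the per-agent greedy actions, taken together,
    attain the maximum of the joint value table.
    """
    greedy = tuple(int(np.argmax(q)) for q in q_individual)
    return bool(q_joint[greedy] == q_joint.max())

# Hypothetical example 1: an additive factorization, which always satisfies IGM.
q1 = np.array([1.0, 3.0])
q2 = np.array([2.0, 0.5])
q_additive = q1[:, None] + q2[None, :]
igm_additive = satisfies_igm(q_additive, [q1, q2])

# Hypothetical example 2: a non-monotonic payoff table paired with
# individual values whose greedy actions miss the joint optimum.
q_nonmono = np.array([[8.0, -12.0],
                      [-12.0, 0.0]])
q1_bad = np.array([0.0, 1.0])
q2_bad = np.array([0.0, 1.0])
igm_nonmono = satisfies_igm(q_nonmono, [q1_bad, q2_bad])

print(igm_additive, igm_nonmono)
```

In the second example the individual greedy actions select the joint action with value 0 while the joint optimum is 8, so the check fails. This is the kind of mismatch a factorization method has to rule out.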
Related papers
- Residual Q-networks For Value Function Factorizing In Multi-agent Reinforcement Learning (2022)
- Monotonic Value Function Factorisation For Deep Multi-agent Reinforcement Learning (2020)
- ConcaveQ: Non-monotonic Value Function Factorization Via Concave Representations In Deep Multi-agent Reinforcement Learning (2023)
- More Centralized Training, Still Decentralized Execution: Multi-agent Conditional Policy Factorization (2022)
- QMIX: Monotonic Value Function Factorisation For Deep Multi-agent Reinforcement Learning (2018)
- Beyond Monotonicity: Revisiting Factorization Principles In Multi-agent Q-learning (2025)
- DFAC Framework: Factorizing The Value Function Via Quantile Mixture For Multi-agent Distributional Q-learning (2021)
- Towards Understanding Cooperative Multi-agent Q-learning With Value Factorization (2020)