Local Advantage Networks For Cooperative Multi-agent Reinforcement Learning
2021 · Raphaël Avalos, Mathieu Reymond, Ann Nowé, et al.
Abstract
Many recent successful off-policy multi-agent reinforcement learning (MARL) algorithms for cooperative partially observable environments focus on finding factorized value functions, leading to convoluted network structures. Building on the structure of independent Q-learners, our LAN algorithm takes a radically different approach, leveraging a dueling architecture to learn a decentralized best-response policy for each agent via individual advantage functions. Learning is stabilized by a centralized critic whose primary objective is to reduce the moving target problem of the individual advantages. The critic, whose network size is independent of the number of agents, is discarded after learning. Evaluation on the StarCraft II multi-agent challenge benchmark shows that LAN reaches state-of-the-art performance and is highly scalable with respect to the number of agents, opening up a promising alternative direction for MARL research.
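The dueling-style split described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the linear "networks", the sizes, the max-normalisation of the advantages, and the choice to feed the critic the global state concatenated with each agent's observation are all assumptions made for illustration. The sketch only shows the general idea that a centralized value is combined with a local advantage during training, while acting needs the local advantage alone.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for this illustration.
N_AGENTS, OBS_DIM, STATE_DIM, N_ACTIONS = 3, 4, 6, 5

# Local advantage head (a linear stand-in for a per-agent network).
W_adv = rng.normal(size=(OBS_DIM, N_ACTIONS))
# Centralized critic shared by all agents: its parameter count does not
# depend on N_AGENTS (assumed input: global state + one agent's observation).
w_val = rng.normal(size=(STATE_DIM + OBS_DIM,))


def local_advantages(obs):
    """obs: (n_agents, OBS_DIM) -> advantages (n_agents, N_ACTIONS), shifted so the best action has advantage 0."""
    adv = obs @ W_adv
    return adv - adv.max(axis=1, keepdims=True)


def centralized_values(state, obs):
    """One scalar value per agent from the shared critic (used during training only)."""
    tiled_state = np.tile(state, (obs.shape[0], 1))
    critic_in = np.concatenate([tiled_state, obs], axis=1)
    return critic_in @ w_val


def q_proxy(state, obs):
    """Dueling-style combination: Q_i(a) = V_i + A_i(a)."""
    return centralized_values(state, obs)[:, None] + local_advantages(obs)


def act(obs):
    """Decentralized execution: greedy on the local advantage, no critic needed."""
    return local_advantages(obs).argmax(axis=1)
```

Because the value term is constant across an agent's actions, the greedy action under the combined estimate matches the greedy action under the local advantage, which is why a critic of this kind can be dropped once training ends.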
Related papers
- Fully Decentralized Multi-agent Reinforcement Learning With Networked Agents (2018)
- MARL-LNS: Cooperative Multi-agent Reinforcement Learning Via Large Neighborhoods Search (2024)
- Locality Matters: A Scalable Value Decomposition Approach For Cooperative Multi-agent Reinforcement Learning (2021)
- Decentralized Multi-agent Reinforcement Learning With Networked Agents: Recent Advances (2019)
- Scalable Multi-agent Reinforcement Learning For Networked Systems With Average Reward (2020)
- Multi-agent Reinforcement Learning In Stochastic Networked Systems (2020)
- Mean-field Multi-agent Reinforcement Learning: A Decentralized Network Approach (2021)
- Learning To Share In Multi-agent Reinforcement Learning (2021)