Resolving Implicit Coordination In Multi-agent Deep Reinforcement Learning With Deep Q-networks & Game Theory
2020 · Griffin Adams, Sarguna Janani Padmanabhan, Shivang Shekhar
Abstract
We address two major challenges of implicit coordination in multi-agent deep reinforcement learning, non-stationarity and the exponential growth of the state-action space, by combining Deep Q-Networks for policy learning with Nash equilibrium for action selection. Q-values serve as proxies for payoffs in Nash settings, and mutual best responses define joint action selection. Coordination is implicit because cases with multiple or no Nash equilibria are resolved deterministically. We demonstrate that knowledge of the game type leads to an assumption of mirrored best responses and to faster convergence than Nash-Q. Specifically, the Friend-or-Foe algorithm shows signs of convergence to a Set Controller, which jointly chooses actions for two agents. This is encouraging given the highly unstable nature of decentralized coordination over joint actions. Inspired by the dueling network architecture, which decouples the Q-function into state and advantage streams, as well as residual networks, we learn both a single and joint agen
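The action-selection idea in the abstract (Q-values treated as payoffs, joint actions chosen as mutual best responses, ties resolved deterministically) can be sketched for two agents in a single state as follows. This is a minimal illustration, not the authors' implementation: the function names, the restriction to pure-strategy equilibria, and the tie-breaking and fallback rules (largest summed Q-value, then smallest action indices) are all assumptions, since the abstract does not specify them.

```python
import numpy as np


def pure_nash_equilibria(q1, q2):
    """Return all joint actions (a1, a2) that are mutual best responses.

    q1[a1, a2] and q2[a1, a2] are the Q-values of agents 1 and 2 for a
    joint action in one fixed state, used here as stage-game payoffs.
    """
    # Agent 1 picks the row: best response to each column of q1.
    best1 = q1 == q1.max(axis=0, keepdims=True)
    # Agent 2 picks the column: best response to each row of q2.
    best2 = q2 == q2.max(axis=1, keepdims=True)
    return [(int(a1), int(a2)) for a1, a2 in np.argwhere(best1 & best2)]


def select_joint_action(q1, q2):
    """Deterministically pick one joint action (hypothetical rule)."""
    equilibria = pure_nash_equilibria(q1, q2)
    if equilibria:
        # Assumed tie-break: largest summed Q-value, then the
        # lexicographically smallest joint action.
        return max(equilibria, key=lambda a: (q1[a] + q2[a], -a[0], -a[1]))
    # Assumed fallback when no pure equilibrium exists: the joint action
    # with the largest summed Q-value (first one in row-major order).
    flat = int(np.argmax(q1 + q2))
    return tuple(int(i) for i in np.unravel_index(flat, q1.shape))


# Coordination game: two pure equilibria, resolved by the tie-break rule.
q = np.array([[2.0, 0.0], [0.0, 1.0]])
print(pure_nash_equilibria(q, q))
print(select_joint_action(q, q))
```

Because both agents apply the same deterministic rule to the same Q-values, they arrive at the same joint action without exchanging messages, which is the sense in which the coordination is implicit.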
Related papers
- Learning Multiagent Coordination In The Absence Of Communication Channels (2018)
- Promoting Coordination Through Policy Regularization In Multi-agent Deep Reinforcement Learning (2019)
- Qatten: A General Framework For Cooperative Multiagent Reinforcement Learning (2020)
- Analysing Factorizations Of Action-value Networks For Cooperative Multi-agent Reinforcement Learning (2019)
- On The Stability Of Learning In Network Games With Many Players (2024)
- Multi-agent Actor-critic For Mixed Cooperative-competitive Environments (2017)
- Weighted Double Deep Multiagent Reinforcement Learning In Stochastic Cooperative Environments (2018)
- How Exploration Breaks Cooperation In Shared-policy Multi-agent Reinforcement Learning (2026)