Decentralized Q-learning In Zero-sum Markov Games
2021 · Muhammed O. Sayin, Kaiqing Zhang, David S. Leslie, et al.
Abstract
We study multi-agent reinforcement learning (MARL) in infinite-horizon discounted zero-sum Markov games. We focus on the practical but challenging setting of decentralized MARL, where agents make decisions without coordination by a centralized controller, but only based on their own payoffs and local actions executed. The agents need not observe the opponent's actions or payoffs, possibly being even oblivious to the presence of the opponent, nor be aware of the zero-sum structure of the underlying game, a setting also referred to as radically uncoupled in the literature of learning in games. In this paper, we develop a radically uncoupled Q-learning dynamics that is both rational and convergent: the learning dynamics converges to the best response to the opponent's strategy when the opponent follows an asymptotically stationary strategy; when both agents adopt the learning dynamics, they converge to the Nash equilibrium of the game. The key challenge in this decentralized setting is th
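The information structure described above (each agent sees only the state, its own action, and its own payoff) can be illustrated with a minimal sketch. This is plain independent Q-learning with softmax action selection on a small random zero-sum Markov game, not the dynamics developed in the paper, and it carries no convergence guarantee. The function name `run_independent_q`, the random game, the temperature `tau`, and the 1/visit-count step size are all assumptions made for illustration.

```python
import numpy as np

def softmax(q, tau):
    # Numerically stable softmax over one row of local Q-values.
    z = (q - q.max()) / tau
    p = np.exp(z)
    return p / p.sum()

def run_independent_q(n_states=3, n_actions=2, gamma=0.9, tau=0.5,
                      steps=5000, seed=0):
    rng = np.random.default_rng(seed)
    # Hypothetical random game: reward[s, a1, a2] is agent 1's payoff,
    # agent 2 receives the negative (zero-sum). Neither agent is told this.
    reward = rng.uniform(-1.0, 1.0, size=(n_states, n_actions, n_actions))
    # trans[s, a1, a2] is a probability vector over next states.
    trans = rng.dirichlet(np.ones(n_states),
                          size=(n_states, n_actions, n_actions))
    # Each agent keeps a local table over (state, own action) only.
    q = [np.zeros((n_states, n_actions)) for _ in range(2)]
    visits = [np.zeros((n_states, n_actions)) for _ in range(2)]
    s = 0
    for _ in range(steps):
        # Actions are drawn independently from each agent's own table.
        acts = [rng.choice(n_actions, p=softmax(q[i][s], tau))
                for i in range(2)]
        r1 = reward[s, acts[0], acts[1]]
        payoffs = [r1, -r1]
        s_next = rng.choice(n_states, p=trans[s, acts[0], acts[1]])
        for i in range(2):
            a = acts[i]
            visits[i][s, a] += 1
            alpha = 1.0 / visits[i][s, a]  # assumed step-size schedule
            # Next-state value under the agent's own smoothed policy.
            pi_next = softmax(q[i][s_next], tau)
            v_next = pi_next @ q[i][s_next]
            # The update uses only the agent's own payoff and own action.
            q[i][s, a] += alpha * (payoffs[i] + gamma * v_next - q[i][s, a])
        s = s_next
    return q
```

Calling `run_independent_q()` returns the two local Q-tables. Neither update ever reads the opponent's action or payoff, which is the sense in which the setting is uncoupled.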
Related papers
- MA2QL: A Minimalist Approach To Fully Decentralized Multi-agent Reinforcement Learning (2022) · 0.00
- Faster Last-iterate Convergence Of Policy Optimization In Zero-sum Markov Games (2022) · 0.00
- Independent And Decentralized Learning In Markov Potential Games (2022) · 0.00
- V-learning -- A Simple, Efficient, Decentralized Algorithm For Multiagent RL (2021) · 0.00
- Decentralized Model-free Reinforcement Learning In Stochastic Games With Average-reward Objective (2023) · 0.00
- Decentralized Multi-agent Reinforcement Learning For Continuous-space Stochastic Games (2023) · 5.24
- On Improving Model-free Algorithms For Decentralized Multi-agent Reinforcement Learning (2021) · 0.00
- Provably Efficient Multi-agent Reinforcement Learning With Fully Decentralized Communication (2021) · 0.00