Provably Efficient Reinforcement Learning In Decentralized General-sum Markov Games
2021 Β· Weichao Mao, Tamer BaΕar
Abstract
This paper addresses the problem of learning an equilibrium efficiently in general-sum Markov games through decentralized multi-agent reinforcement learning. Given the fundamental difficulty of calculating a Nash equilibrium (NE), we instead aim at finding a coarse correlated equilibrium (CCE), a solution concept that generalizes NE by allowing possible correlations among the agents' strategies. We propose an algorithm in which each agent independently runs optimistic V-learning (a variant of Q-learning) to efficiently explore the unknown environment, while using a stabilized online mirror descent (OMD) subroutine for policy updates. We show that the agents can find an \(\epsilon\)-approximate CCE in at most \(\widetilde\{O\}( H^6S A /\epsilon^2)\) episodes, where \(S\) is the number of states, \(A\) is the size of the largest individual action space, and \(H\) is the length of an episode. This appears to be the first sample complexity result for learning in generic general-sum Markov
Authors
(none)
Tags
Stats
Related papers
- Minimax-optimal Multi-agent RL In Markov Games With A Generative Model (2022)2.26
- Decentralized MARL For Coarse Correlated Equilibrium In Aggregative Markov Games (2026)0.00
- Near Optimal Convergence To Coarse Correlated Equilibrium In General-sum Markov Games (2025)0.00
- Strategically Robust Multi-agent Reinforcement Learning With Linear Function Approximation (2026)0.00
- V-learning -- A Simple, Efficient, Decentralized Algorithm For Multiagent RL (2021)0.00
- Hardness Of Independent Learning And Sparse Equilibrium Computation In Markov Games (2023)0.00
- On Improving Model-free Algorithms For Decentralized Multi-agent Reinforcement Learning (2021)0.00
- Decentralized Q-learning In Zero-sum Markov Games (2021)0.00