Minimax-optimal Multi-agent RL In Markov Games With A Generative Model
2022 Β· Gen Li, Yuejie Chi, Yuting Wei, et al.
Abstract
This paper studies multi-agent reinforcement learning in Markov games, with the goal of learning Nash equilibria or coarse correlated equilibria (CCE) sample-optimally. All prior results suffer from at least one of the two obstacles: the curse of multiple agents and the barrier of long horizon, regardless of the sampling protocol in use. We take a step towards settling this problem, assuming access to a flexible sampling mechanism: the generative model. Focusing on non-stationary finite-horizon Markov games, we develop a fast learning algorithm called \myalg~and an adaptive sampling scheme that leverage the optimism principle in online adversarial learning (particularly the Follow-the-Regularized-Leader (FTRL) method). Our algorithm learns an \(\epsilon\)-approximate CCE in a general-sum Markov game using $\( \widetilde\{O\}\bigg( \frac\{H^4 S \sum_\{i=1\}^m A_i\}\{\epsilon^2\} \bigg) \)\( samples, where \)m\( is the number of players, \)S\( indicates the number of states, \)H\( is the
Authors
(none)
Tags
Stats
Related papers
- Minimax-optimal Multi-agent Robust Reinforcement Learning (2024)0.00
- Incentivize Without Bonus: Provably Efficient Model-based Online Multi-agent RL For Markov Games (2025)0.00
- Breaking The Curse Of Multiagents In A Large State Space: RL In Markov Games With Independent Linear Function Approximation (2023)0.00
- Strategically Efficient Exploration In Competitive Multi-agent Reinforcement Learning (2021)0.00
- Provably Efficient Reinforcement Learning In Decentralized General-sum Markov Games (2021)0.00
- Sample-efficient Reinforcement Learning Of Partially Observable Markov Games (2022)0.00
- Breaking The Curse Of Multiagency In Robust Multi-agent Reinforcement Learning (2024)0.00
- Provably Efficient Information-directed Sampling Algorithms For Multi-agent Reinforcement Learning (2024)2.26