Sample-efficient Reinforcement Learning Of Partially Observable Markov Games
2022 · Qinghua Liu, Csaba Szepesvári, Chi Jin
Abstract
This paper considers the challenging task of Multi-Agent Reinforcement Learning (MARL) under partial observability, where each agent sees only her own observations and actions, which reveal incomplete information about the underlying state of the system. We study this task under the general model of multiplayer general-sum Partially Observable Markov Games (POMGs), which is significantly larger than the standard model of Imperfect Information Extensive-Form Games (IIEFGs). We identify a rich subclass of POMGs, weakly revealing POMGs, in which sample-efficient learning is tractable. In the self-play setting, we prove that a simple algorithm combining optimism and Maximum Likelihood Estimation (MLE) is sufficient to find approximate Nash equilibria, correlated equilibria, and coarse correlated equilibria of weakly revealing POMGs in a polynomial number of samples when the number of agents is small. In the setting of playing against adversarial opponents, we
Related papers
- Incentivize Without Bonus: Provably Efficient Model-based Online Multi-agent RL For Markov Games (2025)
- Robustness And Sample Complexity Of Model-based MARL For General-sum Markov Games (2021)
- Minimax-optimal Multi-agent RL In Markov Games With A Generative Model (2022)
- On The Complexity Of Multi-agent Decision Making: From Learning In Games To Partial Monitoring (2023)
- Remembering The Markov Property In Cooperative MARL (2025)
- Information State Embedding In Partially Observable Cooperative Multi-agent Reinforcement Learning (2020)
- Breaking The Curse Of Multiagency In Robust Multi-agent Reinforcement Learning (2024)
- Efficient Model-based Multi-agent Reinforcement Learning Via Optimistic Equilibrium Computation (2022)