A Generalized Training Approach For Multiagent Learning
2019 · Paul Muller, Shayegan Omidshafiei, Mark Rowland, et al.
Abstract
This paper investigates a population-based training regime based on game-theoretic principles called Policy-Space Response Oracles (PSRO). PSRO is general in the sense that it (1) encompasses well-known algorithms such as fictitious play and double oracle as special cases, and (2) in principle applies to general-sum, many-player games. Despite this, prior studies of PSRO have focused on two-player zero-sum games, a regime wherein Nash equilibria are tractably computable. In moving from two-player zero-sum games to more general settings, computation of Nash equilibria quickly becomes infeasible. Here, we extend the theoretical underpinnings of PSRO by considering an alternative solution concept, \(\alpha\)-Rank, which is unique (thus faces no equilibrium selection issues, unlike Nash) and applies readily to general-sum, many-player settings. We establish convergence guarantees in several game classes, and identify links between Nash equilibria and \(\alpha\)-Rank. We demonstrate
Related papers
- Learning Equilibria In Mean-field Games: Introducing Mean-field PSRO (2021)
- Multi-agent Training Beyond Zero-sum With Correlated Equilibrium Meta-solvers (2021)
- Fictitious Cross-play: Learning Global Nash Equilibrium In Mixed Cooperative-competitive Games (2023)
- Pipeline PSRO: A Scalable Approach For Finding Approximate Nash Equilibria In Large Games (2020)
- Policy Optimization For Markov Games: Unified Framework And Faster Convergence (2022)
- Minimax-optimal Multi-agent RL In Markov Games With A Generative Model (2022)
- Options As Responses: Grounding Behavioural Hierarchies In Multi-agent RL (2019)
- Generative Evolutionary Meta-solver (GEMS): Scalable Surrogate-free Multi-agent Reinforcement Learning (2025)