Simplex Neural Population Learning: Any-mixture Bayes-optimality In Symmetric Zero-sum Games
2022 · Siqi Liu, Marc Lanctot, Luke Marris, et al.
Abstract
Learning to play optimally against any mixture over a diverse set of strategies is of significant practical interest in competitive games. In this paper, we propose simplex-NeuPL, which satisfies two desiderata simultaneously: i) learning a population of strategically diverse basis policies, represented by a single conditional network; ii) using the same network, learning best responses to any mixture over the simplex of basis policies. We show that the resulting conditional policies incorporate prior information about their opponents effectively, enabling near-optimal returns against arbitrary mixture policies in a game with tractable best responses. We verify that such policies behave Bayes-optimally under uncertainty and offer insights into using this flexibility at test time. Finally, we offer evidence that learning best responses to any mixture policy is an effective auxiliary task for strategic exploration, which, by itself, can lead to more performant populations.
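To make "best response to any mixture over the simplex of basis policies" concrete, here is a minimal sketch on rock-paper-scissors, a symmetric zero-sum game where best responses are trivially tractable. It is not the paper's method: simplex-NeuPL trains a single conditional network, whereas this toy computes best responses exactly from a payoff matrix. The basis population `BASIS` and the helper names `sample_simplex`, `mixture_policy`, `best_response`, and `posterior` are assumptions made purely for illustration.

```python
import random

# Row player's payoff in rock-paper-scissors (symmetric zero-sum: A = -A^T).
# Action order: 0 = rock, 1 = paper, 2 = scissors.
PAYOFF = [
    [0, -1, 1],   # rock     vs rock, paper, scissors
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]

# Hypothetical basis population; each row is a distribution over actions.
BASIS = [
    [1.0, 0.0, 0.0],   # always rock
    [0.0, 1.0, 0.0],   # always paper
    [0.2, 0.2, 0.6],   # scissors-biased
]

def sample_simplex(n, rng):
    """Draw mixture weights uniformly from the simplex (normalized exponentials)."""
    draws = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]

def mixture_policy(sigma, basis):
    """Marginal action distribution of an opponent drawn from `basis` with weights `sigma`."""
    n_actions = len(basis[0])
    return [sum(s * pi[a] for s, pi in zip(sigma, basis)) for a in range(n_actions)]

def best_response(sigma, basis, payoff):
    """Return (action, expected payoff) maximizing return against the mixture."""
    opp = mixture_policy(sigma, basis)
    values = [sum(row[b] * opp[b] for b in range(len(opp))) for row in payoff]
    best = max(range(len(values)), key=values.__getitem__)
    return best, values[best]

def posterior(sigma, basis, observed_action):
    """Bayes update of the mixture weights after observing one opponent action.

    Assumes the observed action has nonzero probability under the prior.
    """
    unnorm = [s * pi[observed_action] for s, pi in zip(sigma, basis)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

if __name__ == "__main__":
    rng = random.Random(0)
    sigma = sample_simplex(len(BASIS), rng)
    action, value = best_response(sigma, BASIS, PAYOFF)
    print("mixture:", sigma, "best response:", action, "value:", value)
```

In a one-shot matrix game the best response depends only on the marginal opponent policy. In a sequential game, the prior `sigma` would be refined as opponent actions are observed, which is the role `posterior` sketches here.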
Related papers
- NeuPL: Neural Population Learning (2022)
- Neural Population Learning Beyond Symmetric Zero-sum Games (2024)
- Learning In Zero-sum Markov Games: Relaxing Strong Reachability And Mixing Time Assumptions (2023)
- Learning To Play Against Any Mixture Of Opponents (2020)
- Convergence Of Heterogeneous Learning Dynamics In Zero-sum Stochastic Games (2023)
- Learning Two-player Mixture Markov Games: Kernel Function Approximation And Correlated Equilibrium (2022)
- For Learning In Symmetric Teams, Local Optima Are Global Nash Equilibria (2022)
- Policy Optimization For Markov Games: Unified Framework And Faster Convergence (2022)