Policy Mirror Ascent For Efficient And Independent Learning In Mean Field Games
2022 · Batuhan Yardim, Semih Cayci, Matthieu Geist, et al.
Abstract
Mean-field games have been used as a theoretical tool to obtain an approximate Nash equilibrium for symmetric and anonymous \(N\)-player games. However, limiting applicability, existing theoretical results assume variations of a "population generative model", which allows arbitrary modifications of the population distribution by the learning algorithm. Moreover, learning algorithms typically work on abstract simulators with population instead of the \(N\)-player game. Instead, we show that \(N\) agents running policy mirror ascent converge to the Nash equilibrium of the regularized game within \(\widetilde{\mathcal{O}}(\epsilon^{-2})\) samples from a single sample trajectory without a population generative model, up to a standard \(\mathcal{O}(\frac{1}{\sqrt{N}})\) error due to the mean field. Taking a divergent approach from the literature, instead of working with the best-response map we first show that a policy mirror ascent map can be used to construct a contractive o
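To make the "policy mirror ascent" step in the abstract concrete, here is a minimal single-state sketch. It rests on assumptions the abstract does not state: the mirror map is taken to be the KL divergence, the regularizer is taken to be entropy, and the action values are held fixed rather than depending on the population. All function and parameter names are hypothetical, and this is not presented as the authors' algorithm.

```python
import math


def mirror_ascent_step(policy, q_values, step_size, entropy_weight):
    """One KL-proximal (mirror ascent) update of one state's action distribution.

    Closed-form maximizer over distributions p of
        step_size * (<q, p> + entropy_weight * H(p)) - KL(p || policy).
    The KL mirror map and entropy regularizer are assumptions for illustration.
    """
    scale = 1.0 + step_size * entropy_weight
    logits = [(math.log(p) + step_size * q) / scale
              for p, q in zip(policy, q_values)]
    top = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(value - top) for value in logits]
    total = sum(weights)
    return [w / total for w in weights]


def softmax(values, temperature):
    """Softmax of values / temperature, used here only as a reference point."""
    top = max(values)
    weights = [math.exp((v - top) / temperature) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical fixed action values; in a mean-field game they would
# change with the population distribution.
q = [1.0, 0.5, -0.2]
policy = [1.0 / 3.0] * 3
for _ in range(50):
    policy = mirror_ascent_step(policy, q, step_size=0.5, entropy_weight=1.0)
```

Under these assumptions each step shrinks the distance to the regularized optimum in log-probability space by a factor of 1 / (1 + step_size * entropy_weight), so the iterates approach `softmax(q, entropy_weight)`. This is only a toy illustration of how regularization can make a mirror ascent update contractive.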
Related papers
- Population-aware Online Mirror Descent For Mean-field Games By Deep Reinforcement Learning (2024)
- Local And Adaptive Mirror Descents In Extensive-form Games (2023)
- Networked Communication For Mean-field Games With Function Approximation And Empirical Mean-field Estimation (2024)
- Independent Policy Mirror Descent For Markov Potential Games: Scaling To Large Number Of Players (2024)
- Scalable Offline Reinforcement Learning For Mean Field Games (2024)
- Learning Equilibria In Mean-field Games: Introducing Mean-field PSRO (2021)
- Multi-agent Online Learning In Time-varying Games (2018)
- Independent Policy Gradient For Large-scale Markov Potential Games: Sharper Rates, Function Approximation, And Game-agnostic Convergence (2022)