Learning Mean-field Games Through Mean-field Actor-critic Flow
2025 · Mo Zhou, Haosheng Zhou, Ruimeng Hu
Abstract
We propose the Mean-Field Actor-Critic (MFAC) flow, a continuous-time learning dynamics for solving mean-field games (MFGs), combining techniques from reinforcement learning and optimal transport. The MFAC framework jointly evolves the control (actor), value function (critic), and distribution components through coupled gradient-based updates governed by partial differential equations (PDEs). A central innovation is the Optimal Transport Geodesic Picard (OTGP) flow, which drives the distribution toward equilibrium along Wasserstein-2 geodesics. We conduct a rigorous convergence analysis using Lyapunov functionals and establish global exponential convergence of the MFAC flow under a suitable timescale. Our results highlight the algorithmic interplay among actor, critic, and distribution components. Numerical experiments illustrate the theoretical findings and demonstrate the effectiveness of the MFAC framework in computing MFG equilibria.
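As a rough illustration of what it means to move a distribution along a Wasserstein-2 geodesic, the sketch below uses the one-dimensional special case, where the optimal transport map for quadratic cost is the monotone (sorted) matching of samples. This is not the OTGP flow from the paper, which is a PDE-based dynamics coupled with the actor and critic updates. The function names, the step size `alpha`, and the fixed Gaussian target are all assumptions made for the example.

```python
import random


def w2_geodesic_step(source, target, alpha):
    """Move 1D samples a fraction alpha along the W2 geodesic toward target.

    Assumes equal-size empirical distributions. In 1D the optimal map for
    quadratic cost pairs the k-th smallest source sample with the k-th
    smallest target sample.
    """
    if len(source) != len(target):
        raise ValueError("source and target must have the same size")
    order = sorted(range(len(source)), key=lambda i: source[i])
    sorted_target = sorted(target)
    moved = list(source)
    for rank, i in enumerate(order):
        # Displacement interpolation between a sample and its matched image.
        moved[i] = (1.0 - alpha) * source[i] + alpha * sorted_target[rank]
    return moved


def w2_distance_1d(a, b):
    """Wasserstein-2 distance between equal-size 1D empirical distributions."""
    sa, sb = sorted(a), sorted(b)
    return (sum((x - y) ** 2 for x, y in zip(sa, sb)) / len(sa)) ** 0.5


rng = random.Random(0)
mu = [rng.gauss(0.0, 1.0) for _ in range(500)]  # hypothetical current distribution
nu = [rng.gauss(3.0, 0.5) for _ in range(500)]  # hypothetical target distribution

d0 = w2_distance_1d(mu, nu)
for _ in range(20):
    mu = w2_geodesic_step(mu, nu, alpha=0.2)
d1 = w2_distance_1d(mu, nu)
```

In this toy setting each step shrinks the Wasserstein-2 distance to the fixed target by the factor `1 - alpha`, so the distance decays geometrically. In an MFG the target itself depends on the current control and distribution, which is why the paper analyzes the coupled flow rather than a fixed-target iteration like this one.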
Related papers
- Convergence Of Actor-critic Learning For Mean Field Games And Mean Field Control In Continuous Spaces (2025)
- Deep Reinforcement Learning For Infinite Horizon Mean Field Problems In Continuous Spaces (2023)
- Efficient And Scalable Deep Reinforcement Learning For Mean Field Control Games (2024)
- A Single Online Agent Can Efficiently Learn Mean Field Games (2024)
- Actor Critic Learning Algorithms For Mean-field Control With Moment Neural Networks (2023)
- Unified Reinforcement Q-learning For Mean Field Game And Control Problems (2020)
- Actor-critic Learning For Mean-field Control In Continuous Time (2023)
- Global Convergence Of Policy Gradient For Linear-quadratic Mean-field Control/game In Continuous Time (2020)