Mean-field Reinforcement Learning Without Synchrony
2026 · Shan Yang
Abstract
Mean-field reinforcement learning (MF-RL) scales multi-agent RL to large populations by reducing each agent's dependence on others to a single summary statistic -- the mean action. However, this reduction requires every agent to act at every time step; when some agents are idle, the mean action is simply undefined. Addressing asynchrony therefore requires a different summary statistic -- one that remains defined regardless of which agents act. The population distribution \(\mu \in \Delta(\mathcal{O})\) -- the fraction of agents at each observation -- satisfies this requirement: its dimension is independent of \(N\), and under exchangeability it fully determines each agent's reward and transition. Existing MF-RL theory, however, is built on the mean action and does not extend to \(\mu\). We therefore construct the Temporal Mean Field (TMF) framework around the population distribution \(\mu\) from scratch, covering the full spectrum from fully synchronous to purely sequential decision-
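The contrast between the two summary statistics can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the integer encoding of observations and actions, and the use of `None` to mark an idle agent are all assumptions made for illustration. The sketch computes \(\mu\) as the fraction of agents at each observation, and treats the mean action as undefined as soon as any agent is idle.

```python
from collections import Counter
from typing import List, Optional, Sequence


def population_distribution(observations: Sequence[int],
                            num_observations: int) -> List[float]:
    """Empirical mu: fraction of agents at each observation.

    Assumes observations are integers in range(num_observations)
    (a hypothetical encoding, not specified in the abstract).
    """
    counts = Counter(observations)
    n = len(observations)
    return [counts.get(o, 0) / n for o in range(num_observations)]


def mean_action(actions: Sequence[Optional[int]],
                num_actions: int) -> Optional[List[float]]:
    """Mean one-hot action over all agents.

    Returns None when any agent is idle (action is None), mirroring the
    abstract's point that the statistic needs every agent to act.
    """
    if any(a is None for a in actions):
        return None
    counts = Counter(actions)
    n = len(actions)
    return [counts.get(a, 0) / n for a in range(num_actions)]


# Four agents, three possible observations, two possible actions.
observations = [0, 0, 1, 2]
actions = [1, None, 0, None]  # agents 1 and 3 are idle this step

mu = population_distribution(observations, num_observations=3)
m = mean_action(actions, num_actions=2)
```

In this sketch `mu` has length 3 however many agents are present, and it is defined even though two agents are idle, whereas `m` is `None`.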
Related papers
- Mean Field Multi-agent Reinforcement Learning (2018)
- Model-free Mean-field Reinforcement Learning: Mean-field MDP And Mean-field Q-learning (2019)
- Partially Observable Mean Field Reinforcement Learning (2020)
- Causal Mean Field Multi-agent Reinforcement Learning (2025)
- Efficient Model-based Multi-agent Mean-field Reinforcement Learning (2021)
- MF-OML: Online Mean-field Reinforcement Learning With Occupation Measures For Large Population Games (2024)
- Robust Cooperative Multi-agent Reinforcement Learning: A Mean-field Type Game Perspective (2024)
- Population-aware Online Mirror Descent For Mean-field Games By Deep Reinforcement Learning (2024)