Oracle-free Reinforcement Learning In Mean-field Games Along A Single Sample Path
2022 · Muhammad Aneeq Uz Zaman, Alec Koppel, Sujay Bhatt, et al.
Abstract
We consider online reinforcement learning in Mean-Field Games (MFGs). Unlike traditional approaches, we alleviate the need for a mean-field oracle by developing an algorithm that approximates the Mean-Field Equilibrium (MFE) using the single sample path of the generic agent. We call this {\it Sandbox Learning}, as it can be used as a warm-start for any agent learning in a multi-agent non-cooperative setting. We adopt a two time-scale approach in which an online fixed-point recursion for the mean-field operates on a slower time-scale, in tandem with a control policy update on a faster time-scale for the generic agent. Given that the underlying Markov Decision Process (MDP) of the agent is communicating, we provide finite sample convergence guarantees in terms of convergence of the mean-field and control policy to the mean-field equilibrium. The sample complexity of the Sandbox learning algorithm is \(\tilde{\mathcal{O}}(\epsilon^{-4})\) where \(\epsilon\) is the MFE approximatio
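The two time-scale idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: it assumes a hypothetical toy model (a randomly drawn tabular transition kernel, a crowd-averse reward that depends on the mean-field estimate, tabular Q-learning as the policy update, and polynomial step sizes), and the names `sandbox_sketch`, `alpha`, and `beta` are illustrative. It shows only the structure of a single sample path driving a fast control update and a slow mean-field update.

```python
import numpy as np

def sandbox_sketch(n_states=3, n_actions=2, n_steps=5000, gamma=0.9, seed=0):
    """Toy two time-scale loop along one sample path (illustrative assumptions only)."""
    rng = np.random.default_rng(seed)
    # Assumed toy kernel P[s, a] -> distribution over next states.
    # All entries are positive, so the toy MDP is communicating.
    P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
    Q = np.zeros((n_states, n_actions))        # control estimate (fast time-scale)
    mu = np.full(n_states, 1.0 / n_states)     # mean-field estimate (slow time-scale)
    s = 0
    for t in range(1, n_steps + 1):
        alpha = 1.0 / t ** 0.6                 # faster step size for the policy update
        beta = 1.0 / t ** 0.9                  # slower step size; beta / alpha -> 0
        # Epsilon-greedy action from the current Q estimate.
        if rng.random() < 0.1:
            a = int(rng.integers(n_actions))
        else:
            a = int(np.argmax(Q[s]))
        s_next = int(rng.choice(n_states, p=P[s, a]))
        # Assumed crowd-averse reward: being in a crowded state is penalized.
        r = -mu[s_next]
        # Fast time-scale: Q-learning step against the current mean-field estimate.
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        # Slow time-scale: nudge the mean-field toward the visited state.
        e = np.zeros(n_states)
        e[s_next] = 1.0
        mu += beta * (e - mu)                  # convex combination, stays on the simplex
        s = s_next
    return Q, mu
```

Calling `Q, mu = sandbox_sketch()` returns a Q-table and a mean-field estimate. Because `beta` decays faster than `alpha`, the mean-field looks nearly frozen from the point of view of the Q update, which is the separation of time-scales the abstract describes.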
Related papers
- A Single Online Agent Can Efficiently Learn Mean Field Games (2024)
- Reinforcement Learning For Mean Field Games With Strategic Complementarities (2020)
- On The Convergence Of Model Free Learning In Mean Field Games (2019)
- Population-aware Online Mirror Descent For Mean-field Games By Deep Reinforcement Learning (2024)
- A General Framework For Learning Mean-field Games (2020)
- Scalable Offline Reinforcement Learning For Mean Field Games (2024)
- Unified Reinforcement Q-learning For Mean Field Game And Control Problems (2020)
- Deep Reinforcement Learning For Infinite Horizon Mean Field Problems In Continuous Spaces (2023)