On The Approximation Of Cooperative Heterogeneous Multi-agent Reinforcement Learning (MARL) Using Mean Field Control (MFC)
2021 · Washim Uddin Mondal, Mridul Agarwal, Vaneet Aggarwal, et al.
Abstract
Mean field control (MFC) is an effective way to mitigate the curse of dimensionality of cooperative multi-agent reinforcement learning (MARL) problems. This work considers a collection of \(N_{\mathrm{pop}}\) heterogeneous agents that can be segregated into \(K\) classes such that the \(k\)-th class contains \(N_k\) homogeneous agents. We aim to prove approximation guarantees of the MARL problem for this heterogeneous system by its corresponding MFC problem. We consider three scenarios where the reward and transition dynamics of all agents are respectively taken to be functions of (1) joint state and action distributions across all classes, (2) individual distributions of each class, and (3) marginal distributions of the entire population. We show that, in these cases, the \(K\)-class MARL problem can be approximated by MFC with errors given as \(e_1=\mathcal{O}\left(\frac{\sqrt{|\mathcal{X}|}+\sqrt{|\mathcal{U}|}}{N_{\mathrm{pop}}}\sum_{k}\sqrt{N_k}\right)\)
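To get a feel for how the \(e_1\) expression in the abstract scales, the following minimal sketch evaluates the quantity inside the \(\mathcal{O}(\cdot)\), ignoring the hidden constant. The function name `e1_scale` and its parameters are hypothetical and introduced only for illustration; they do not come from the paper. Here `num_states` stands for \(|\mathcal{X}|\), `num_actions` for \(|\mathcal{U}|\), and `class_sizes` for the list \((N_1,\dots,N_K)\).

```python
import math


def e1_scale(class_sizes, num_states, num_actions):
    """Evaluate (sqrt|X| + sqrt|U|) / N_pop * sum_k sqrt(N_k).

    This is only the term inside the O(.) of e_1; the constant
    factor hidden by the O-notation is not modeled here.
    """
    if not class_sizes or any(n <= 0 for n in class_sizes):
        raise ValueError("class_sizes must be a non-empty list of positive sizes")
    n_pop = sum(class_sizes)  # N_pop is the total number of agents
    root_sum = sum(math.sqrt(n) for n in class_sizes)  # sum_k sqrt(N_k)
    return (math.sqrt(num_states) + math.sqrt(num_actions)) / n_pop * root_sum


if __name__ == "__main__":
    # Illustrative values only: two equal classes, |X| = |U| = 4.
    for size in (100, 10_000, 1_000_000):
        print(size, e1_scale([size, size], num_states=4, num_actions=4))
```

With a single class (\(K=1\)) the expression reduces to \((\sqrt{|\mathcal{X}|}+\sqrt{|\mathcal{U}|})/\sqrt{N_{\mathrm{pop}}}\), so the term shrinks as the population grows.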
Related papers
- Can Mean Field Control (MFC) Approximate Cooperative Multi Agent Reinforcement Learning (MARL) With Non-uniform Interaction? (2022)
- Mean-field Approximation Of Cooperative Constrained Multi-agent Reinforcement Learning (CMARL) (2022)
- Mean-field Control Based Approximation Of Multi-agent Reinforcement Learning In Presence Of A Non-decomposable Shared Global State (2023)
- Major-minor Mean Field Multi-agent Reinforcement Learning (2023)
- Efficient Model-based Multi-agent Mean-field Reinforcement Learning (2021)
- Efficient And Scalable Deep Reinforcement Learning For Mean Field Control Games (2024)
- Maximum Entropy Heterogeneous-agent Reinforcement Learning (2023)
- MCMARL: Parameterizing Value Function Via Mixture Of Categorical Distributions For Multi-agent Reinforcement Learning (2022)