Scalable Planning In Multi-agent MDPs
2021 · Dinuka Sahabandu, Luyao Niu, Andrew Clark, et al.
Abstract
Multi-agent Markov Decision Processes (MMDPs) arise in a variety of applications, including target tracking, control of multi-robot swarms, and multiplayer games. A key challenge in MMDPs is that the state and action spaces grow exponentially in the number of agents, making computation of an optimal policy intractable for medium- to large-scale problems. One property that has been exploited to mitigate this complexity is transition independence, in which each agent's transition probabilities are independent of the states and actions of other agents. Transition independence enables factorization of the MMDP and computation of local agent policies, but it does not hold for arbitrary MMDPs. In this paper, we propose an approximate transition dependence property, called \(\delta\)-transition dependence, and develop a metric for quantifying how far an MMDP deviates from transition independence. Our definition of \(\delta\)-transition dependence recovers transition independence
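As a rough illustration of why exact transition independence helps, the sketch below builds a joint transition probability as a product of per-agent kernels and compares table sizes. It is only a toy example under stated assumptions: the kernel values, the helper name `joint_transition_prob`, and the choice of three identical agents are all hypothetical, and it does not implement the \(\delta\)-transition dependence metric, which this excerpt does not define.

```python
from itertools import product


def joint_transition_prob(local_kernels, joint_state, joint_action, joint_next_state):
    """Joint probability P(s' | s, a) as a product of local terms.

    Assumes exact transition independence: agent i's next state depends
    only on its own state s_i and action a_i. Each local kernel is indexed
    as P_i[s_i][a_i][s_next_i].
    """
    prob = 1.0
    for P_i, s_i, a_i, s_next_i in zip(
        local_kernels, joint_state, joint_action, joint_next_state
    ):
        prob *= P_i[s_i][a_i][s_next_i]
    return prob


# Hypothetical local kernel with 2 states and 2 actions (values are made up).
local_kernel = [
    [[0.9, 0.1], [0.2, 0.8]],  # from local state 0, under actions 0 and 1
    [[0.5, 0.5], [0.0, 1.0]],  # from local state 1, under actions 0 and 1
]
n_agents = 3
n_local_states, n_local_actions = 2, 2
local_kernels = [local_kernel] * n_agents

# The joint state space is the Cartesian product of the local state spaces.
joint_states = list(product(range(n_local_states), repeat=n_agents))

# Entries needed to store the transition model explicitly.
joint_table_entries = (
    (n_local_states ** n_agents) ** 2 * n_local_actions ** n_agents
)
factored_entries = n_agents * n_local_states * n_local_actions * n_local_states

s, a = (0, 1, 0), (1, 0, 1)
p_example = joint_transition_prob(local_kernels, s, a, (1, 1, 0))
row_sum = sum(
    joint_transition_prob(local_kernels, s, a, s_next) for s_next in joint_states
)

print(len(joint_states), joint_table_entries, factored_entries)
print(p_example, row_sum)
```

With three agents, the explicit joint table already needs far more entries than the factored representation, and the gap widens exponentially as agents are added. The product form remains a valid distribution because each local kernel row sums to one.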
Related papers
- Model-free Learning And Optimal Policy Design In Multi-agent MDPs Under Probabilistic Agent Dropout (2023)
- Scalable Spectral Representations For Multi-agent Reinforcement Learning In Network MDPs (2024)
- Multi-agent Reach-avoid MDP Via Potential Games And Low-rank Policy Structure (2024)
- Policy Dispersion In Non-Markovian Environment (2023)
- Continuous-time Distributed Dynamic Programming For Networked Multi-agent Markov Decision Processes (2023)
- Planning And Learning In Average Risk-aware MDPs (2025)
- Provable Cooperative Multi-agent Exploration For Reward-free MDPs (2026)
- Decentralised Q-learning For Multi-agent Markov Decision Processes With A Satisfiability Criterion (2023)