Scalable And Sample Efficient Distributed Policy Gradient Algorithms In Multi-agent Networked Systems
2022 · Xin Liu, Honghao Wei, Lei Ying
Abstract
This paper studies a class of multi-agent reinforcement learning (MARL) problems in which the reward an agent receives depends on the states of other agents, but its next state depends only on its own current state and action. We call this setting REC-MARL, short for REward-Coupled Multi-Agent Reinforcement Learning. REC-MARL has a range of important applications, such as real-time access control and distributed power control in wireless networks. This paper presents a distributed policy gradient algorithm for REC-MARL. The proposed algorithm is distributed in two respects: (i) the learned policy is a distributed policy that maps an agent's local state to its local action, and (ii) the learning/training is distributed, with each agent updating its policy based on its own and its neighbors' information. The proposed algorithm achieves a stationary policy, and its iteration complexity bounds depend on the dimension of local states and actions. The experimental results of our algorithm
Related papers
- Distributed Policy Gradient With Variance Reduction In Multi-agent Reinforcement Learning (2021)
- Descent-guided Policy Gradient For Scalable Cooperative Multi-agent Learning (2026)
- Multi-agent Reinforcement Learning In Stochastic Networked Systems (2020)
- Communication-efficient Policy Gradient Methods For Distributed Reinforcement Learning (2018)
- Scalable Multi-agent Reinforcement Learning For Networked Systems With Average Reward (2020)
- Scalable Centralized Deep Multi-agent Reinforcement Learning Via Policy Gradients (2018)
- A Policy Gradient Algorithm For Learning To Learn In Multiagent Reinforcement Learning (2020)
- Fully Decentralized Multi-agent Reinforcement Learning With Networked Agents (2018)