The Gradient Convergence Bound Of Federated Multi-agent Reinforcement Learning With Efficient Communication
2021 · Xing Xu, Rongpeng Li, Zhifeng Zhao, et al.
Abstract
The paper considers independent reinforcement learning (IRL) for multi-agent collaborative decision-making in the paradigm of federated learning (FL). However, FL generates excessive communication overhead between agents and a remote central server, especially when it involves a large number of agents or iterations. Moreover, because independent learning environments are heterogeneous, multiple agents may undergo asynchronous Markov decision processes (MDPs), which affects the training samples and the model's convergence performance. Building on the variation-aware periodic averaging (VPA) method and a policy-based deep reinforcement learning (DRL) algorithm, namely proximal policy optimization (PPO), this paper proposes two advanced optimization schemes oriented toward stochastic gradient descent (SGD): 1) a decay-based scheme that gradually decays the weights of a model's local gradients as successive local updates progress, and 2) by representing the agents as a graph, a cons
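The decay-based idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the geometric schedule `decay ** k`, the toy quadratic objectives, and all function names (`local_updates`, `federated_round`, `make_quadratic_grad`) are assumptions chosen for illustration, and plain periodic averaging stands in for VPA and PPO.

```python
from typing import Callable, List

Vector = List[float]
GradFn = Callable[[Vector], Vector]


def local_updates(theta: Vector, grad_fn: GradFn, lr: float,
                  tau: int, decay: float) -> Vector:
    """Run tau local SGD steps, scaling the k-th gradient by decay**k.

    The geometric schedule is an assumed stand-in for the paper's decay rule.
    """
    theta = list(theta)
    for k in range(tau):
        weight = decay ** k  # later local steps contribute less
        grad = grad_fn(theta)
        theta = [t - lr * weight * g for t, g in zip(theta, grad)]
    return theta


def federated_round(theta_global: Vector, grad_fns: List[GradFn],
                    lr: float = 0.1, tau: int = 5,
                    decay: float = 0.9) -> Vector:
    """One communication round: local updates per agent, then server averaging."""
    local_models = [local_updates(theta_global, g, lr, tau, decay)
                    for g in grad_fns]
    n = len(local_models)
    return [sum(coords) / n for coords in zip(*local_models)]


def make_quadratic_grad(target: Vector) -> GradFn:
    """Gradient of the toy loss 0.5 * ||theta - target||^2 (hypothetical agent)."""
    return lambda theta: [t - c for t, c in zip(theta, target)]


if __name__ == "__main__":
    # Two heterogeneous agents whose local optima differ.
    agents = [make_quadratic_grad([1.0, 1.0]), make_quadratic_grad([3.0, 3.0])]
    theta = [0.0, 0.0]
    for _ in range(50):
        theta = federated_round(theta, agents)
    print(theta)
```

In this toy setting every agent shrinks its distance to its own optimum by the same factor per round, so the averaged model approaches the mean of the local optima. Setting `decay=1.0` recovers ordinary local SGD with periodic averaging, which makes the effect of the decay weight easy to compare.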
Related papers
- Communication-efficient Consensus Mechanism For Federated Reinforcement Learning (2022)
- Global Convergence Guarantees For Federated Policy Gradient Methods With Adversaries (2024)
- Communication-efficient Policy Gradient Methods For Distributed Reinforcement Learning (2018)
- Improved Communication Efficiency In Federated Natural Policy Gradient Via ADMM-based Gradient Updates (2023)
- Asynchronous Federated Reinforcement Learning With Policy Gradient Updates: Algorithm Design And Convergence Analysis (2024)
- Provably Efficient Multi-agent Reinforcement Learning With Fully Decentralized Communication (2021)
- Optimized Local Updates In Federated Learning Via Reinforcement Learning (2025)
- Federated Natural Policy Gradient And Actor Critic Methods For Multi-task Reinforcement Learning (2023)