Abstract

Federated reinforcement learning (FedRL) enables agents to collaboratively train a global policy without sharing their individual data. However, high communication overhead remains a critical bottleneck, particularly for natural policy gradient (NPG) methods, which are second-order. To address this issue, we propose the FedNPG-ADMM framework, which leverages the alternating direction method of multipliers (ADMM) to approximate global NPG directions efficiently. We theoretically demonstrate that using ADMM-based gradient updates reduces communication complexity from \(\{O\}(\{d^\{2\}\})\) to \(\{O\}(\{d\})\) at each iteration, where \(d\) is the number of model parameters. Furthermore, we show that achieving an \(\epsilon\)-error stationary convergence requires \(\{O\}(\frac\{1\}\{(1-\gamma)^\{2\}\{\epsilon\}\})\) iterations for discount factor \(\gamma\), demonstrating that FedNPG-ADMM maintains the same convergence rate as the standard FedNPG. Through evaluation of the proposed algori

Authors

(none)

Tags

  • Policy Gradient
  • Model-Based RL

Stats

  • citations0
  • S2 citationsβ€”
  • github stars0
  • HF likes0
  • heat score0.00
  • arxiv keylan2023improved

Related papers