Federated Natural Policy Gradient And Actor Critic Methods For Multi-task Reinforcement Learning
2023 · Tong Yang, Shicong Cen, Yuting Wei, et al.
Abstract
Federated reinforcement learning (RL) enables collaborative decision making of multiple distributed agents without sharing local data trajectories. In this work, we consider a multi-task setting, in which each agent has its own private reward function corresponding to different tasks, while sharing the same transition kernel of the environment. Focusing on infinite-horizon Markov decision processes, the goal is to learn a globally optimal policy that maximizes the sum of the discounted total rewards of all the agents in a decentralized manner, where each agent only communicates with its neighbors over some prescribed graph topology. We develop federated vanilla and entropy-regularized natural policy gradient (NPG) methods in the tabular setting under softmax parameterization, where gradient tracking is applied to estimate the global Q-function to mitigate the impact of imperfect information sharing. We establish non-asymptotic global convergence guarantees under exact policy evaluati
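The two ingredients named in the abstract, a softmax NPG update and gradient tracking of the global Q-function, can be illustrated with a minimal sketch. It is not the paper's algorithm. It assumes the standard closed form of tabular NPG under softmax parameterization, pi_new(a|s) proportional to pi(a|s) * exp(eta * Q(s,a) / (1 - gamma)), and a generic dynamic-average-consensus step with a doubly stochastic mixing matrix. The names `npg_softmax_step`, `tracking_step`, `mixing`, `eta`, and `gamma` are hypothetical, and each agent's Q estimate is reduced to a single scalar for brevity.

```python
import math


def npg_softmax_step(policy, q_values, eta, gamma):
    """One tabular NPG step (assumed closed form, not taken from the paper):
    pi_new(a|s) is proportional to pi(a|s) * exp(eta * Q(s,a) / (1 - gamma)).
    `policy` and `q_values` are lists of per-state lists over actions;
    every probability in `policy` must be strictly positive."""
    new_policy = []
    for pi_s, q_s in zip(policy, q_values):
        logits = [math.log(p) + eta * q / (1.0 - gamma) for p, q in zip(pi_s, q_s)]
        top = max(logits)  # subtract the max for numerical stability
        weights = [math.exp(v - top) for v in logits]
        total = sum(weights)
        new_policy.append([w / total for w in weights])
    return new_policy


def tracking_step(mixing, trackers, q_new, q_old):
    """One gradient-tracking (dynamic average consensus) step:
    T_i <- sum_j W_ij * (T_j + Q_j_new - Q_j_old).
    With a doubly stochastic W and T initialized to the local Q values,
    the average of the trackers stays equal to the average of the local Q values."""
    n = len(trackers)
    corrected = [t + qn - qo for t, qn, qo in zip(trackers, q_new, q_old)]
    return [sum(mixing[i][j] * corrected[j] for j in range(n)) for i in range(n)]


if __name__ == "__main__":
    # Hypothetical 3-agent network with a symmetric, doubly stochastic mixing matrix.
    mixing = [[0.5, 0.5, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 0.5, 0.5]]
    q_old = [1.0, 2.0, 3.0]   # each agent's local Q estimate for one (s, a) pair
    q_new = [2.0, 2.0, 5.0]   # local estimates after the next policy evaluation
    trackers = tracking_step(mixing, list(q_old), q_new, q_old)
    print(trackers, sum(trackers) / 3, sum(q_new) / 3)

    # One state, two actions, uniform initial policy.
    print(npg_softmax_step([[0.5, 0.5]], [[1.0, 0.0]], eta=0.1, gamma=0.9))
```

In this sketch each agent would feed its tracker, rather than its own local Q estimate, into the policy update. The tracker approximates the network-wide average Q-function using only neighbor-to-neighbor communication.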
Related papers
- Natural Policy Gradient And Actor Critic Methods For Constrained Multi-task Reinforcement Learning (2024)
- Improved Communication Efficiency In Federated Natural Policy Gradient Via ADMM-based Gradient Updates (2023)
- Momentum For The Win: Collaborative Federated Reinforcement Learning Across Heterogeneous Environments (2024)
- Dimension-free Rates For Natural Policy Gradient In Multi-agent Reinforcement Learning (2021)
- Global Convergence Guarantees For Federated Policy Gradient Methods With Adversaries (2024)
- Asynchronous Federated Reinforcement Learning With Policy Gradient Updates: Algorithm Design And Convergence Analysis (2024)
- Symmetric (optimistic) Natural Policy Gradient For Multi-agent Learning With Parameter Convergence (2022)
- Recurrent Natural Policy Gradient For POMDPs (2024)