Fast Multi-agent Temporal-difference Learning Via Homotopy Stochastic Primal-dual Optimization
2019 · Dongsheng Ding, Xiaohan Wei, Zhuoran Yang, et al.
Abstract
We study the policy evaluation problem in multi-agent reinforcement learning where a group of agents, with jointly observed states and private local actions and rewards, collaborate to learn the value function of a given policy via local computation and communication over a connected undirected network. This problem arises in various large-scale multi-agent systems, including power grids, intelligent transportation systems, wireless sensor networks, and multi-agent robotics. When the dimension of the state-action space is large, temporal-difference learning with linear function approximation is widely used. In this paper, we develop a new distributed temporal-difference learning algorithm and quantify its finite-time performance. Our algorithm combines a distributed stochastic primal-dual method with a homotopy-based approach to adaptively adjust the learning rate in order to minimize the mean-square projected Bellman error by taking fresh online samples from a causal on-policy traject
Related papers
- Finite-time Performance Of Distributed Temporal Difference Learning With Linear Function Approximation (2019)
- Finite-sample Analysis Of Decentralized Temporal-difference Learning With Linear Function Approximation (2019)
- Distributed Value Function Approximation For Collaborative Multi-agent Reinforcement Learning (2020)
- A Multi-agent Off-policy Actor-critic Algorithm For Distributed Reinforcement Learning (2019)
- Multi-agent Policy Optimization With Approximatively Synchronous Advantage Estimation (2020)
- Local Stochastic Approximation: A Unified View Of Federated Learning And Distributed Multi-task Reinforcement Learning Algorithms (2020)
- Multi-agent Reinforcement Learning Via Double Averaging Primal-dual Optimization (2018)
- Adaptive Temporal-difference Learning For Policy Evaluation With Per-state Uncertainty Estimates (2019)