Natural Policy Gradient And Actor Critic Methods For Constrained Multi-task Reinforcement Learning
2024 · Sihan Zeng, Thinh T. Doan, Justin Romberg
Abstract
Multi-task reinforcement learning (RL) aims to find a single policy that effectively solves multiple tasks at the same time. This paper presents a constrained formulation for multi-task RL where the goal is to maximize the average performance of the policy across tasks subject to bounds on the performance in each task. We consider solving this problem both in the centralized setting, where information for all tasks is accessible to a single server, and in the decentralized setting, where a network of agents, each given one task and observing local information, cooperate to find the solution of the globally constrained objective using local communication. We first propose a primal-dual algorithm that provably converges to the globally optimal solution of this constrained formulation under exact gradient evaluations. When the gradient is unknown, we further develop a sample-based actor-critic algorithm that finds the optimal policy using online samples of state, action, and reward. Fi
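The constrained formulation described in the abstract can be sketched in symbols. The notation below is an assumption made for illustration and is not taken from the paper: N is the number of tasks, V_i(π) is the performance (for example, the expected cumulative reward) of policy π in task i, ℓ_i is a lower bound on that performance, and λ_i ≥ 0 is a dual variable. Only lower bounds are shown; the paper's bounds may take a different form. The last line is a generic projected dual step with step size η, given as one common way a primal-dual method handles such constraints rather than as the authors' exact update.

```latex
% Constrained multi-task objective (assumed notation)
\max_{\pi} \; \frac{1}{N} \sum_{i=1}^{N} V_i(\pi)
\quad \text{subject to} \quad V_i(\pi) \ge \ell_i, \qquad i = 1, \dots, N.

% Lagrangian with dual variables \lambda_i \ge 0
L(\pi, \lambda) = \frac{1}{N} \sum_{i=1}^{N} V_i(\pi)
  + \sum_{i=1}^{N} \lambda_i \bigl( V_i(\pi) - \ell_i \bigr),
\qquad
\max_{\pi} \, \min_{\lambda \ge 0} \, L(\pi, \lambda).

% Generic projected dual descent step (illustrative)
\lambda_i \leftarrow \bigl[ \lambda_i - \eta \bigl( V_i(\pi) - \ell_i \bigr) \bigr]_{+}
```

Under this sketch, a violated constraint (V_i(π) < ℓ_i) increases λ_i, which in turn puts more weight on task i in the policy update, while a satisfied constraint lets λ_i shrink toward zero.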
Related papers
- Federated Natural Policy Gradient And Actor Critic Methods For Multi-task Reinforcement Learning (2023)
- Actor-critic Policy Optimization In Partially Observable Multiagent Environments (2018)
- A Decentralized Policy Gradient Approach To Multi-task Reinforcement Learning (2020)
- Local Advantage Actor-critic For Robust Multi-agent Deep Reinforcement Learning (2021)
- Actor-critic Algorithms For Constrained Multi-agent Reinforcement Learning (2019)
- Attention Actor-critic Algorithm For Multi-agent Constrained Co-operative Reinforcement Learning (2021)
- Multi-preference Actor Critic (2019)
- Parameter Sharing Deep Deterministic Policy Gradient For Cooperative Multi-agent Reinforcement Learning (2017)