Option-Critic in Cooperative Multi-Agent Systems
2019 · Jhelum Chakravorty, Nadeem Ward, Julien Roy, et al.
Abstract
In this paper, we investigate learning temporal abstractions in cooperative multi-agent systems using the options framework (Sutton et al., 1999). First, we address the planning problem for the decentralized POMDP represented by the multi-agent system by introducing a *common information approach*. We use the notion of *common beliefs* and broadcasting to solve an equivalent centralized POMDP problem. Then, we propose the Distributed Option Critic (DOC) algorithm, which uses centralized option evaluation and decentralized intra-option improvement. We theoretically analyze the asymptotic convergence of DOC and build a new multi-agent environment to demonstrate its validity. Our experiments empirically show that DOC performs competitively against baselines and scales with the number of agents.
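For readers unfamiliar with the options framework the abstract builds on, the following is a minimal single-agent sketch of an option as the triple (initiation set, intra-option policy, termination function) from Sutton et al. (1999), executed in call-and-return fashion. It is not the DOC algorithm: the chain environment, the `Option` class, and the names `step`, `run_option`, and `go_right` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Set
import random


@dataclass
class Option:
    """An option (I, pi, beta) in the sense of Sutton et al. (1999)."""
    initiation_set: Set[int]             # states in which the option may be started
    policy: Callable[[int], int]         # intra-option policy: state -> action
    termination: Callable[[int], float]  # beta(s): probability of stopping in state s


def step(state: int, action: int, n_states: int = 6) -> int:
    """Hypothetical 1-D chain: action is -1 or +1, clipped at both ends."""
    return max(0, min(n_states - 1, state + action))


def run_option(option: Option, state: int, rng: random.Random,
               max_steps: int = 100) -> List[int]:
    """Call-and-return execution: follow the option until it terminates."""
    if state not in option.initiation_set:
        raise ValueError("option is not available in this state")
    trajectory = [state]
    for _ in range(max_steps):
        state = step(state, option.policy(state))
        trajectory.append(state)
        # Sample termination according to beta(state).
        if rng.random() < option.termination(state):
            break
    return trajectory


# Illustrative option: walk right and terminate only at the last state.
go_right = Option(
    initiation_set={0, 1, 2, 3, 4},
    policy=lambda s: +1,
    termination=lambda s: 1.0 if s == 5 else 0.0,
)

trajectory = run_option(go_right, 0, random.Random(0))
print(trajectory)
```

In a multi-agent setting such as the one the abstract describes, each agent would hold its own set of options. How options are evaluated centrally and improved locally is specific to DOC and is not reproduced here.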
Related papers
- Attention Option-critic (2022)
- Macoptions: Multi-agent Learning With Centralized Controller And Options Framework (2023)
- MACRPO: Multi-agent Cooperative Recurrent Policy Optimization (2021)
- Asynchronous, Option-based Multi-agent Policy Gradient: A Conditional Reasoning Approach (2022)
- SOAP-RL: Sequential Option Advantage Propagation For Reinforcement Learning In POMDP Environments (2024)
- Multi-agent Deep Covering Skill Discovery (2022)
- Decision-making With Speculative Opponent Models (2022)
- Enhancing Multi-agent Coordination Through Common Operating Picture Integration (2023)