A Decentralized Communication Framework Based On Dual-level Recurrence For Multi-agent Reinforcement Learning
2022 · Jingchen Li, Haobin Shi, Kao-Shing Hwang
Abstract
We propose a model that enables multiple decentralized agents to share their perception of the environment in a fair and adaptive way. The model takes both the current message and the historical observations into account, handling them in the same recurrent model but in different forms. We present a dual-level recurrent communication framework for multi-agent systems: the first recurrence runs over the communication sequence and transmits communication data among agents, while the second recurrence runs over the time sequence and combines the historical observations of each agent. The resulting communication flow separates communication messages from memories, yet the dual-level recurrence still lets agents share their historical observations. This design lets agents adapt to changeable communication objects, while the communication results remain fair to these agents. We discuss our method in both partially observable and fully obse
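The two nested recurrences described in the abstract can be illustrated with a small sketch. This is not the authors' model: the names `cell` and `dual_level_step`, the fixed scalar weights, and the elementwise tanh update are all assumptions made for illustration, standing in for whatever learned recurrent unit the paper uses. It also makes no attempt to reproduce the fairness property the abstract claims. It only shows a message being passed along the communication sequence (first level) and each agent's memory being updated along the time sequence (second level), with messages and memories kept as separate states.

```python
import math


def cell(state, inputs, weights):
    """Elementwise tanh recurrence (hypothetical stand-in for a learned unit).

    new_state[d] = tanh(w_state * state[d] + sum_k w_k * inputs[k][d])
    """
    w_state, *w_inputs = weights
    return [
        math.tanh(w_state * s + sum(w * x[d] for w, x in zip(w_inputs, inputs)))
        for d, s in enumerate(state)
    ]


def dual_level_step(observations, memories,
                    comm_w=(0.5, 0.3, 0.2), time_w=(0.6, 0.2, 0.2)):
    """One environment time step for any number of agents.

    observations, memories: one vector (list of floats) per agent.
    comm_w, time_w: assumed fixed weights; a real model would learn them.
    """
    dim = len(memories[0])
    message = [0.0] * dim

    # Level 1: recurrence over the communication sequence.
    # The message visits each agent in turn and absorbs its observation
    # and its memory; the memory itself is not modified here.
    for obs, mem in zip(observations, memories):
        message = cell(message, [obs, mem], comm_w)

    # Level 2: recurrence over the time sequence.
    # Each agent folds its own observation and the final message
    # into its private memory, which is carried to the next time step.
    new_memories = [
        cell(mem, [obs, message], time_w)
        for obs, mem in zip(observations, memories)
    ]
    return message, new_memories


# Usage: three agents, two time steps, 2-dimensional observations.
observations_over_time = [
    [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
    [[0.0, 0.0], [1.0, 1.0], [0.2, 0.8]],
]
memories = [[0.0, 0.0] for _ in range(3)]
for observations in observations_over_time:
    message, memories = dual_level_step(observations, memories)
```

Because the message is rebuilt from scratch at every time step while the memories persist, the number of agents in the loop can change between steps without changing the shape of any state in this sketch.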
Related papers
- Improving Coordination In Small-scale Multi-agent Deep Reinforcement Learning Through Memory-driven Communication (2019) · 12.25
- Attention-based Recurrence For Multi-agent Reinforcement Learning Under Stochastic Partial Observability (2023) · 0.00
- Provably Efficient Multi-agent Reinforcement Learning With Fully Decentralized Communication (2021) · 0.00
- Mixed Cooperative-competitive Communication Using Multi-agent Reinforcement Learning (2021) · 5.84
- Learning Emergent Discrete Message Communication For Cooperative Reinforcement Learning (2021) · 5.24
- Counterfactual Multi-agent Reinforcement Learning With Graph Convolution Communication (2020) · 0.00
- Modeling Sensorimotor Coordination As Multi-agent Reinforcement Learning With Differentiable Communication (2019) · 0.00
- Deep Decentralized Multi-task Multi-agent Reinforcement Learning Under Partial Observability (2017) · 0.00