Bayesian Action Decoder For Deep Multi-agent Reinforcement Learning
2018 · Jakob N. Foerster, Francis Song, Edward Hughes, et al.
Abstract
When observing the actions of others, humans make inferences about why they acted as they did, and what this implies about the world; humans also use the fact that their actions will be interpreted in this manner, allowing them to act informatively and thereby communicate efficiently with others. Although learning algorithms have recently achieved superhuman performance in a number of two-player, zero-sum games, scalable multi-agent reinforcement learning algorithms that can discover effective strategies and conventions in complex, partially observable settings have proven elusive. We present the Bayesian action decoder (BAD), a new multi-agent learning method that uses an approximate Bayesian update to obtain a public belief that conditions on the actions taken by all agents in the environment. BAD introduces a new Markov decision process, the public belief MDP, in which the action space consists of all deterministic partial policies, and exploits the fact that an agent acting only on
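The belief update the abstract refers to can be illustrated with a toy example. The sketch below is not the authors' implementation. It assumes a small discrete set of private states and a deterministic partial policy stored as an array that maps each private state to an action. Under those assumptions, observing an action rules out every private state that would have produced a different action, and the remaining belief is renormalized. The names `update_public_belief`, `prior`, and `policy` are hypothetical, chosen only for this illustration.

```python
import numpy as np


def update_public_belief(belief, partial_policy, observed_action):
    """Exact Bayes update of a belief over private states (toy setting).

    belief:          prior probability of each private state (assumed discrete)
    partial_policy:  action chosen in each private state (deterministic)
    observed_action: the action that every agent publicly observed
    """
    belief = np.asarray(belief, dtype=float)
    partial_policy = np.asarray(partial_policy)

    # A deterministic policy gives likelihood 1 where it picks the observed
    # action and 0 everywhere else.
    likelihood = (partial_policy == observed_action).astype(float)

    unnormalized = belief * likelihood
    total = unnormalized.sum()
    if total == 0.0:
        raise ValueError("observed action is impossible under this belief and policy")
    return unnormalized / total


# Three private states; the policy plays action 0 in states 0 and 2.
prior = np.array([0.5, 0.3, 0.2])
policy = np.array([0, 1, 0])
posterior = update_public_belief(prior, policy, observed_action=0)
```

Here state 1 is eliminated because the policy would have played action 1 there, so the posterior mass is split between states 0 and 2 in proportion to their prior.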
Related papers
- Simplified Action Decoder For Deep Multi-agent Reinforcement Learning (2019) 4.03
- Solving Common-payoff Games With Approximate Policy Iteration (2021) 3.58
- Context-aware Bayesian Network Actor-critic Methods For Cooperative Multi-agent Reinforcement Learning (2023) 0.00
- Multi-agent Deep Reinforcement Learning With Extremely Noisy Observations (2018) 0.00
- Deep Interactive Bayesian Reinforcement Learning Via Meta-learning (2021) 5.24
- Macro-action-based Deep Multi-agent Reinforcement Learning (2020) 0.00
- ACE: Cooperative Multi-agent Q-learning With Bidirectional Action-dependency (2022) 0.00
- Macro-action-based Multi-agent/robot Deep Reinforcement Learning Under Partial Observability (2022) 5.84