Probabilistic Recursive Reasoning For Multi-agent Reinforcement Learning
2019 · Ying Wen, Yaodong Yang, Rui Luo, et al.
Abstract
Humans are capable of attributing latent mental contents such as beliefs or intentions to others. This social skill is critical in daily life for reasoning about the potential consequences of others' behaviors so as to plan ahead. It is known that humans use such reasoning ability recursively by considering what others believe about their own beliefs. In this paper, we start from level-\(1\) recursion and introduce a probabilistic recursive reasoning (PR2) framework for multi-agent reinforcement learning. Our hypothesis is that it is beneficial for each agent to account for how the opponents would react to its future behaviors. Under the PR2 framework, we adopt variational Bayes methods to approximate the opponents' conditional policies, to which each agent finds the best response and then improves its own policy. We develop decentralized-training-decentralized-execution algorithms, namely PR2-Q and PR2-Actor-Critic, that are proved to converge in the self-play scenarios when there e
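The level-1 idea in the abstract (an agent accounts for how the opponent would react to its own action, then best-responds) can be illustrated with a toy two-player matrix game. This is only a minimal sketch under stated assumptions, not the paper's PR2-Q or PR2-Actor-Critic algorithm: the payoff tables, the function names, and the softmax used as the opponent's conditional policy are all hypothetical stand-ins, and the variational Bayes approximation mentioned in the abstract is not implemented here.

```python
import math

def opponent_conditional_policy(opp_payoff, own_action, temperature=1.0):
    """Assumed opponent model: softmax over the opponent's payoffs,
    conditioned on our action (a stand-in for a learned conditional policy)."""
    row = opp_payoff[own_action]
    peak = max(row)  # subtract the max for numerical stability
    weights = [math.exp((v - peak) / temperature) for v in row]
    total = sum(weights)
    return [w / total for w in weights]

def level1_best_response(own_payoff, opp_payoff, temperature=1.0):
    """Score each own action by its expected payoff under the opponent's
    predicted reaction, then pick the highest-scoring action."""
    values = []
    for action in range(len(own_payoff)):
        rho = opponent_conditional_policy(opp_payoff, action, temperature)
        values.append(sum(p * q for p, q in zip(rho, own_payoff[action])))
    best = max(range(len(values)), key=values.__getitem__)
    return best, values

# Hypothetical payoffs, indexed as [own_action][opponent_action].
# Against any fixed opponent action, own action 1 looks better (5 > 3, 1 > 0).
own_payoff = [[3.0, 0.0], [5.0, 1.0]]
opp_payoff = [[3.0, 0.0], [0.0, 1.0]]

best_action, action_values = level1_best_response(
    own_payoff, opp_payoff, temperature=0.1
)
```

In this toy game, an agent that treats the opponent's action as fixed would always prefer action 1, whereas the agent that models the opponent's reaction prefers action 0, because choosing action 0 leads the modelled opponent to the outcome that pays 3 rather than 1.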
Related papers
- Modelling Bounded Rationality In Multi-agent Interactions By Generalized Recursive Reasoning (2019) 9.23
- Theory Of Mind As Intrinsic Motivation For Multi-agent Reinforcement Learning (2023) 0.00
- Competitive Multi-agent Deep Reinforcement Learning With Counterfactual Thinking (2019) 7.16
- Iterated Reasoning With Mutual Information In Cooperative And Byzantine Decentralized Teaming (2022) 0.00
- Bayesian Action Decoder For Deep Multi-agent Reinforcement Learning (2018) 0.00
- Hrm-agent: Training A Recurrent Reasoning Model In Dynamic Environments Using Reinforcement Learning (2025) 0.00
- Control As Probabilistic Inference As An Emergent Communication Mechanism In Multi-agent Reinforcement Learning (2023) 0.00
- Episodic Future Thinking Mechanism For Multi-agent Reinforcement Learning (2024) 0.00