Theory Of Mind For Deep Reinforcement Learning In Hanabi
2021 · Andrew Fuchs, Michael Walton, Theresa Chadwick, et al.
Abstract
The partially observable card game Hanabi has recently been proposed as a new AI challenge problem due to its dependence on implicit communication conventions and apparent necessity of theory of mind reasoning for efficient play. In this work, we propose a mechanism for imbuing Reinforcement Learning agents with a theory of mind to discover efficient cooperative strategies in Hanabi. The primary contributions of this work are threefold: First, a formal definition of a computationally tractable mechanism for computing hand probabilities in Hanabi. Second, an extension to conventional Deep Reinforcement Learning that introduces reasoning over finitely nested theory of mind belief hierarchies. Finally, an intrinsic reward mechanism enabled by theory of mind that incentivizes agents to share strategically relevant private knowledge with their teammates. We demonstrate the utility of our algorithm against Rainbow, a state-of-the-art Reinforcement Learning agent.
Related papers
- Simplified Action Decoder For Deep Multi-agent Reinforcement Learning (2019)
- Evaluating The Rainbow DQN Agent In Hanabi With Unseen Partners (2020)
- Reinforcement Learning On Human Decision Models For Uniquely Collaborative AI Teammates (2021)
- Evaluation Of Human-AI Teams For Learned And Rule-based Agents In Hanabi (2021)
- Theory Of Mind As Intrinsic Motivation For Multi-agent Reinforcement Learning (2023)
- Is Vanilla Policy Gradient Overlooked? Analyzing Deep Reinforcement Learning For Hanabi (2022)
- Learning Human Rewards By Inferring Their Latent Intelligence Levels In Multi-agent Games: A Theory-of-mind Approach With Application To Driving Data (2021)
- Behavioral Differences Is The Key Of Ad-hoc Team Cooperation In Multiplayer Games Hanabi (2023)