Human-AI Learning Performance In Multi-armed Bandits
2018 · Ravi Pandya, Sandy H. Huang, Dylan Hadfield-Menell, et al.
Abstract
People frequently face challenging decision-making problems in which outcomes are uncertain or unknown. Artificial intelligence (AI) algorithms exist that can outperform humans at learning such tasks. Thus, there is an opportunity for AI agents to assist people in learning these tasks more effectively. In this work, we use a multi-armed bandit as a controlled setting in which to explore this direction. We pair humans with a selection of agents and observe how well each human-agent team performs. We find that team performance can beat both human and agent performance in isolation. Interestingly, we also find that an agent's performance in isolation does not necessarily correlate with the human-agent team's performance. A drop in agent performance can lead to a disproportionately large drop in team performance, or in some settings can even improve team performance. Pairing a human with an agent that performs slightly better than them can make them perform much better, while pairing them
Related papers
- Reinforcement Learning On Human Decision Models For Uniquely Collaborative AI Teammates (2021)
- Online Learning For Cooperative Multi-player Multi-armed Bandits (2021)
- Near-optimal Collaborative Learning In Bandits (2022)
- Evaluation Of Human-AI Teams For Learned And Rule-based Agents In Hanabi (2021)
- In Pursuit Of Predictive Models Of Human Preferences Toward AI Teammates (2025)
- Learning To Coordinate Under Threshold Rewards: A Cooperative Multi-agent Bandit Framework (2025)
- Unified Models Of Human Behavioral Agents In Bandits, Contextual Bandits And RL (2020)
- Decision Market Based Learning For Multi-agent Contextual Bandit Problems (2022)