BCR-DRL: Behavior- And Context-aware Reward For Deep Reinforcement Learning In Human-AI Coordination
2024 · Xin Hao, Bahareh Nakisa, Mohmmad Naim Rastgoo, et al.
Abstract
Deep reinforcement learning (DRL) offers a powerful framework for training AI agents to coordinate with human partners. However, DRL faces two critical challenges in human-AI coordination (HAIC): sparse rewards and unpredictable human behaviors. These challenges significantly limit DRL's ability to identify effective coordination policies, because they impair its capability to optimize exploration and exploitation. To address these limitations, we propose an innovative behavior- and context-aware reward (BCR) for DRL, which optimizes exploration and exploitation by leveraging human behaviors and contextual information in HAIC. Our BCR consists of two components: (i) a novel dual intrinsic rewarding scheme to enhance exploration, which combines an AI self-motivated intrinsic reward and a human-motivated intrinsic reward, both designed to increase the capture of sparse rewards through a logarithmic-based strategy; and (ii) a new context-aware weighting mechanism for the designed rewards to imp
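
The abstract names the ingredients of BCR (two intrinsic rewards, a logarithmic-based strategy, and context-aware weights) but does not give the formulas. The Python sketch below only illustrates how such pieces could fit together. Everything in it is an assumption made for illustration, not the authors' method: the count-based bonus `log_bonus`, the `capture_rate` context signal, and the linear weighting in `shaped_reward` are all hypothetical.

```python
import math


def log_bonus(visit_count: int) -> float:
    """Hypothetical log-based novelty bonus (assumption, not from the paper).

    Returns log(1 + 1/n), which is largest for rarely seen situations
    and shrinks toward 0 as the visit count n grows.
    """
    n = max(visit_count, 1)  # guard against zero or negative counts
    return math.log(1.0 + 1.0 / n)


def shaped_reward(
    extrinsic: float,
    ai_visit_count: int,
    human_visit_count: int,
    capture_rate: float,
) -> float:
    """Combine a sparse extrinsic reward with two intrinsic bonuses.

    ai_visit_count:    how often the AI agent has seen its current state
    human_visit_count: how often the observed human behavior has been seen
    capture_rate:      assumed context signal in [0, 1]; the fraction of
                       recent steps on which a sparse reward was captured
    """
    r_ai = log_bonus(ai_visit_count)        # AI self-motivated bonus
    r_human = log_bonus(human_visit_count)  # human-motivated bonus

    # Assumed weighting rule: explore more while sparse rewards are rarely
    # captured, and lean on the extrinsic reward once they become frequent.
    rate = min(max(capture_rate, 0.0), 1.0)
    w_explore = 1.0 - rate

    return extrinsic + w_explore * (r_ai + r_human)
```

With this toy weighting, the intrinsic terms dominate early in training, when `capture_rate` is near 0, and vanish once sparse rewards are captured on every step.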
Related papers
- Control-optimized Deep Reinforcement Learning For Artificially Intelligent Autonomous Systems (2025) 0.00
- Towards Human-like RL: Taming Non-naturalistic Behavior In Deep RL Via Adaptive Behavioral Costs In 3D Games (2023) 0.00
- DCIR: Dynamic Consistency Intrinsic Reward For Multi-agent Reinforcement Learning (2023) 0.00
- A Hierarchical Approach To Population Training For Human-AI Collaboration (2023) 0.00
- Aligning Humans And Robots Via Reinforcement Learning From Implicit Human Feedback (2025) 2.26
- A Human Mixed Strategy Approach To Deep Reinforcement Learning (2018) 7.50
- Broad Critic Deep Actor Reinforcement Learning For Continuous Control (2024) 0.00
- Coordinated Exploration Via Intrinsic Rewards For Multi-agent Reinforcement Learning (2019) 0.00