Interpretable Multi-objective Reinforcement Learning Through Policy Orchestration
2018 · Ritesh Noothigattu, Djallel Bouneffouf, Nicholas Mattei, et al.
Abstract
Autonomous cyber-physical agents and systems play an increasingly large role in our lives. To ensure that agents behave in ways aligned with the values of the societies in which they operate, we must develop techniques that allow these agents to not only maximize their reward in an environment, but also to learn and follow the implicit constraints of society. These constraints and norms can come from any number of sources including regulations, business process guidelines, laws, ethical principles, social norms, and moral values. We detail a novel approach that uses inverse reinforcement learning to learn a set of unspecified constraints from demonstrations of the task, and reinforcement learning to learn to maximize the environment rewards. More precisely, we assume that an agent can observe traces of behavior of members of the society but has no access to the explicit set of constraints that give rise to the observed behavior. Inverse reinforcement learning is used to learn such cons