Navigating Trade-offs: Policy Summarization For Multi-objective Reinforcement Learning
2024 · Zuzanna Osika, Jazmin Zatarain-Salazar, Frans A. Oliehoek, et al.
Abstract
Multi-objective reinforcement learning (MORL) is used to solve problems involving multiple objectives. An MORL agent must make decisions based on the diverse signals provided by distinct reward functions. Training an MORL agent yields a set of solutions (policies), each presenting distinct trade-offs among the objectives (expected returns). MORL enhances explainability by enabling fine-grained comparisons of policies in the solution set based on their trade-offs as opposed to having a single policy. However, the solution set is typically large and multi-dimensional, where each policy (e.g., a neural network) is represented by its objective values. We propose an approach for clustering the solution set generated by MORL. By considering both policy behavior and objective values, our clustering method can reveal the relationship between policy behaviors and regions in the objective space. This approach can enable decision makers (DMs) to identify overarching trends and insights in the s
Related papers
- Interpretability By Design For Efficient Multi-objective Reinforcement Learning (2025) 0.00
- Multi-objective Reinforcement Learning Based On Decomposition: A Taxonomy And Framework (2023) 9.92
- On Generalization Across Environments In Multi-objective Reinforcement Learning (2025) 0.00
- A Generalized Algorithm For Multi-objective Reinforcement Learning And Policy Adaptation (2019) 0.00
- Addressing The Issue Of Stochastic Environments And Local Decision-making In Multi-objective Reinforcement Learning (2022) 0.00
- Using Logical Specifications Of Objectives In Multi-objective Reinforcement Learning (2019) 0.00
- Provable Multi-objective Reinforcement Learning With Generative Models (2020) 0.00
- Sample-efficient Multi-objective Learning Via Generalized Policy Improvement Prioritization (2023) 5.24