Addressing The Issue Of Stochastic Environments And Local Decision-making In Multi-objective Reinforcement Learning
2022 · Kewen Ding
Abstract
Multi-objective reinforcement learning (MORL) is a relatively new field which builds on conventional Reinforcement Learning (RL) to solve multi-objective problems. One common approach extends scalar-valued Q-learning by using vector Q-values in combination with a utility function, which captures the user's preferences, for action selection. This study follows on from prior work and focuses on the factors that influence how frequently value-based MORL Q-learning algorithms learn the optimal policy for an environment with stochastic state transitions, in scenarios where the goal is to maximise the Scalarised Expected Return (SER) - that is, to maximise the average outcome over multiple runs rather than the outcome within each individual episode. The interaction between stochastic environments and MORL Q-learning algorithms is analysed on variants of Space Traders, a simple Multi-objective Markov decision process (MOMDP). The empirical evaluations
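The approach named in the abstract (vector Q-values combined with a utility function for action selection) and the SER criterion can be illustrated with a minimal sketch. This is not the paper's implementation: the tabular layout, the function names (`greedy_action`, `q_update`, `ser`, `esr`), the default hyperparameters, and the example `min` utility are all assumptions chosen for illustration.

```python
import numpy as np

# Illustrative sketch only. Q-table layout is an assumption:
# q[state, action] is a vector with one entry per objective.

def greedy_action(q, state, utility):
    """Pick the action whose vector Q-value has the highest utility."""
    scores = [utility(q[state, a]) for a in range(q.shape[1])]
    return int(np.argmax(scores))

def q_update(q, state, action, reward, next_state, done,
             utility, alpha=0.1, gamma=1.0):
    """One tabular update applied element-wise to the vector Q-value."""
    target = np.asarray(reward, dtype=float)
    if not done:
        # Bootstrap from the next action that is greedy under the utility.
        best_next = greedy_action(q, next_state, utility)
        target = target + gamma * q[next_state, best_next]
    q[state, action] += alpha * (target - q[state, action])
    return q

def ser(returns, utility):
    """Scalarised Expected Return: utility of the mean vector return."""
    return float(utility(np.mean(returns, axis=0)))

def esr(returns, utility):
    """Expected Scalarised Return: mean of the per-episode utilities."""
    return float(np.mean([utility(r) for r in returns]))

# Hypothetical nonlinear utility, used only to show SER and ESR can differ.
worst_objective = lambda v: float(np.min(v))

episode_returns = np.array([[1.0, 0.0], [0.0, 1.0]])
print(ser(episode_returns, worst_objective))  # utility of the average outcome
print(esr(episode_returns, worst_objective))  # average of per-episode utilities
```

With a nonlinear utility the two criteria generally disagree, which is why the distinction between averaging over runs and scoring each episode matters when state transitions are stochastic.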
Related papers
- An Empirical Investigation Of Value-based Multi-objective Reinforcement Learning For Stochastic Environments (2024)
- Issues With Value-based Multi-objective Reinforcement Learning: Value Function Interference And Overestimation Sensitivity (2024)
- On Generalization Across Environments In Multi-objective Reinforcement Learning (2025)
- Provable Multi-objective Reinforcement Learning With Generative Models (2020)
- Limitations Of Scalarisation In MORL: A Comparative Study In Discrete Environments (2025)
- Interpretability By Design For Efficient Multi-objective Reinforcement Learning (2025)
- Utility-based Reinforcement Learning: Unifying Single-objective And Multi-objective Reinforcement Learning (2024)
- Navigating Trade-offs: Policy Summarization For Multi-objective Reinforcement Learning (2024)