Act As You Learn: Adaptive Decision-making In Non-stationary Markov Decision Processes
2024 · Baiting Luo, Yunuo Zhang, Abhishek Dubey, et al.
Abstract
A fundamental (and largely open) challenge in sequential decision-making is dealing with non-stationary environments, where exogenous environmental conditions change over time. Such problems are traditionally modeled as non-stationary Markov decision processes (NSMDPs). However, existing approaches for decision-making in NSMDPs have two major shortcomings: first, they assume that the updated environmental dynamics at the current time are known (although future dynamics can change); and second, planning is largely pessimistic, i.e., the agent acts "safely" to account for the non-stationary evolution of the environment. We argue that both of these assumptions are invalid in practice: updated environmental conditions are rarely known, and as the agent interacts with the environment, it can learn about the updated dynamics and avoid being pessimistic, at least in states whose dynamics it is confident about. We present a heuristic search algorithm called Adaptive Monte Carlo Tree S
Related papers
- Decision Making In Non-stationary Environments With Policy-augmented Search (2024)
- Decision Making In Non-stationary Environments With Policy-augmented Monte Carlo Tree Search (2022)
- Non-stationary Markov Decision Processes, A Worst-case Approach Using Model-based Reinforcement Learning, Extended Version (2019)
- Markov Decision Processes Under External Temporal Processes (2023)
- Reinforcement Learning In Switching Non-stationary Markov Decision Processes: Algorithms And Convergence Analysis (2025)
- Minimum-delay Adaptation In Non-stationary Reinforcement Learning Via Online High-confidence Change-point Detection (2021)
- Configurable Markov Decision Processes (2018)
- A Method For The Online Construction Of The Set Of States Of A Markov Decision Process Using Answer Set Programming (2017)