Decision Making In Non-stationary Environments With Policy-augmented Search
2024 · Ava Pettet, Yunuo Zhang, Baiting Luo, et al.
Abstract
Sequential decision-making under uncertainty is present in many important problems. Two popular approaches for tackling such problems are reinforcement learning and online search (e.g., Monte Carlo tree search). While the former learns a policy by interacting with the environment (typically done before execution), the latter uses a generative model of the environment to sample promising action trajectories at decision time. Decision-making is particularly challenging in non-stationary environments, where the environment in which an agent operates can change over time. Both approaches have shortcomings in such settings. On the one hand, policies learned before execution become stale when the environment changes, and relearning takes both time and computational effort. Online search, on the other hand, can return sub-optimal actions when there are limitations on allowed runtime. In this paper, we introduce Policy-Augmented Monte Carlo tree search (PA-MCTS), which combines act
Related papers
- Decision Making In Non-stationary Environments With Policy-augmented Monte Carlo Tree Search (2022) · 0.00
- Act As You Learn: Adaptive Decision-making In Non-stationary Markov Decision Processes (2024) · 0.00
- Maneuver Decision-making Through Proximal Policy Optimization And Monte Carlo Tree Search (2023) · 0.00
- Sequential Monte Carlo For Policy Optimization In Continuous Pomdps (2025) · 0.00
- Policy Gradient Search: Online Planning And Expert Iteration Without Search Trees (2019) · 0.00
- Policy Gradient Algorithms With Monte Carlo Tree Learning For Non-markov Decision Processes (2022) · 0.00
- Tree Search-based Policy Optimization Under Stochastic Execution Delay (2024) · 1.56
- Optimal Decision-making In Mixed-agent Partially Observable Stochastic Environments Via Reinforcement Learning (2019) · 0.00