Enhancing Hierarchical Reinforcement Learning Through Change Point Detection In Time Series
2025 · Hemanath Arumugam, Falong Fan, Bo Liu
Abstract
Hierarchical Reinforcement Learning (HRL) enhances the scalability of decision-making in long-horizon tasks by introducing temporal abstraction through options, which are policies that span multiple timesteps. Despite its theoretical appeal, the practical implementation of HRL suffers from the challenge of autonomously discovering semantically meaningful subgoals and learning optimal option termination boundaries. This paper introduces a novel architecture that integrates a self-supervised, Transformer-based Change Point Detection (CPD) module into the Option-Critic framework, enabling adaptive segmentation of state trajectories and the discovery of options. The CPD module is trained using heuristic pseudo-labels derived from intrinsic signals to infer latent shifts in environment dynamics without external supervision. These inferred change-points are leveraged in three critical ways: (i) to serve as supervisory signals for stabilizing termination function gradients, (ii) to pretrain intra-option