Reinforcement Learning With Policy Mixture Model For Temporal Point Processes Clustering
2019 · Weichang Wu, Junchi Yan, Xiaokang Yang, et al.
Abstract
The temporal point process is an expressive tool for modeling event sequences over time. In this paper, we take a reinforcement learning view whereby the observed sequences are assumed to be generated from a mixture of latent policies. The purpose is to cluster sequences with different temporal patterns into their underlying policies while learning each policy model. The flexibility of our model lies in two points: i) all the components are networks, including the policy network for modeling the intensity function of the temporal point process; ii) to handle varying-length event sequences, we resort to inverse reinforcement learning, decomposing the observed sequence into states (RNN hidden embedding of history) and actions (time interval to next event) in order to learn the reward function, thus achieving better performance or increasing efficiency compared to existing methods that use rewards over the entire sequence, such as log-likelihood or Wasserstein distance. We adopt an expectation-maximiz
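The decomposition of a sequence into states and actions described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `to_state_action_pairs` is hypothetical, and the raw event history stands in for the RNN hidden embedding, which the abstract names but does not specify.

```python
from typing import List, Tuple

# Hypothetical helper, for illustration only.
# State: the raw history of past timestamps (an assumed stand-in for the
#        RNN hidden embedding mentioned in the abstract).
# Action: the time interval from the latest event to the next one.
def to_state_action_pairs(
    timestamps: List[float],
) -> List[Tuple[Tuple[float, ...], float]]:
    pairs = []
    for i in range(1, len(timestamps)):
        history = tuple(timestamps[:i])             # events observed so far
        action = timestamps[i] - timestamps[i - 1]  # interval to next event
        pairs.append((history, action))
    return pairs


# A sequence of n events yields n - 1 pairs, whatever its length.
pairs = to_state_action_pairs([0.0, 0.5, 2.0, 2.25])
for state, action in pairs:
    print(state, action)
```

Because each pair is scored separately, sequences of different lengths simply contribute different numbers of pairs, which is how a per-step reward avoids relying on a single score for the whole sequence.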
Related papers
- Learning Temporal Point Processes Via Reinforcement Learning (2018)
- Deep Reinforcement Learning Of Marked Temporal Point Processes (2018)
- Markov Decision Processes Under External Temporal Processes (2023)
- Reinforcement Learning In Reward-mixing MDPs (2021)
- Imitation Learning Of Neural Spatio-temporal Point Processes (2019)
- Finite-time Performance Of Distributed Temporal Difference Learning With Linear Function Approximation (2019)
- An MRP Formulation For Supervised Learning: Generalized Temporal Difference Learning Models (2024)
- Doubly Inhomogeneous Reinforcement Learning (2022)