Stable Continual Reinforcement Learning Via Diffusion-based Trajectory Replay

Abstract

Given the inherent non-stationarity prevalent in real-world applications, continual Reinforcement Learning (RL) aims to equip the agent with the capability to address a series of sequentially presented decision-making tasks. Within this problem setting, a pivotal challenge is \textit{catastrophic forgetting}, wherein the agent readily loses the decision-making knowledge associated with previously encountered tasks when learning a new one. Recently, \textit{generative replay} methods have shown substantial potential by employing generative models to replay the data distribution of past tasks. Compared to storing the data from past tasks directly, this category of methods avoids the growing storage overhead and possible data privacy concerns. However, constrained by the expressive capacity of generative models, existing \textit{generative replay} methods face challenges in faithfully reconstructing the data distribution of past tasks,
