Remember And Forget For Experience Replay
2018 · Guido Novati, Petros Koumoutsakos
Abstract
Experience replay (ER) is a fundamental component of off-policy deep reinforcement learning (RL). ER recalls experiences from past iterations to compute gradient estimates for the current policy, increasing data-efficiency. However, the accuracy of such updates may deteriorate when the policy diverges from past behaviors and can undermine the performance of ER. Many algorithms mitigate this issue by tuning hyper-parameters to slow down policy changes. An alternative is to actively enforce the similarity between policy and the experiences in the replay memory. We introduce Remember and Forget Experience Replay (ReF-ER), a novel method that can enhance RL algorithms with parameterized policies. ReF-ER (1) skips gradients computed from experiences that are too unlikely with the current policy and (2) regulates policy changes within a trust region of the replayed behaviors. We couple ReF-ER with Q-learning, deterministic policy gradient and off-policy gradient methods. We find that ReF-ER
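The two rules named in the abstract can be illustrated with a minimal sketch. The abstract does not say how "too unlikely" is measured or how the trust region is enforced, so everything specific here is an assumption: the importance ratio `rho = pi / mu` as the likelihood measure, the hypothetical threshold `c_max`, and the hypothetical blending coefficient `beta` that mixes the RL gradient with a penalty gradient pulling the policy back toward the replayed behaviors.

```python
import numpy as np

def near_policy_mask(pi_prob, mu_prob, c_max=4.0):
    """Rule (1), as assumed here: keep a sample only if its importance
    ratio rho = pi(a|s) / mu(a|s) lies within [1/c_max, c_max]."""
    rho = np.asarray(pi_prob, dtype=float) / np.asarray(mu_prob, dtype=float)
    return (rho >= 1.0 / c_max) & (rho <= c_max)

def masked_mean_gradient(grads, mask):
    """Average per-sample gradients; skipped samples contribute zero."""
    grads = np.asarray(grads, dtype=float)
    return (grads * np.asarray(mask)[:, None]).mean(axis=0)

def blended_gradient(rl_grad, penalty_grad, beta):
    """Rule (2), as assumed here: mix the RL gradient with a penalty
    gradient toward the replayed behaviors, weighted by beta in [0, 1]."""
    return beta * np.asarray(rl_grad) + (1.0 - beta) * np.asarray(penalty_grad)

# Hypothetical mini-batch: action probabilities under the current policy (pi)
# and under the behavior that generated each stored experience (mu).
pi = [0.5, 0.01, 0.9]
mu = [0.4, 0.5, 0.1]
grads = [[1.0, 2.0], [10.0, 10.0], [100.0, 100.0]]

mask = near_policy_mask(pi, mu, c_max=4.0)   # ratios: 1.25, 0.02, 9.0
g = masked_mean_gradient(grads, mask)        # only the first sample survives
update = blended_gradient(g, penalty_grad=[0.0, -1.0], beta=0.5)
```

In this sketch only the first sample passes the filter, because the other two have ratios outside the assumed interval. How the threshold and the blending weight are chosen or adapted is left open here.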
Related papers
- Lucid Dreaming For Experience Replay: Refreshing Past States With The Current Policy (2020) 7.81
- Replay For Safety (2021) 0.00
- Introspective Experience Replay: Look Back When Surprised (2022) 0.00
- Replay-enhanced Continual Reinforcement Learning (2023) 0.00
- Regret Minimization Experience Replay In Off-policy Reinforcement Learning (2021) 0.00
- Stratified Experience Replay: Correcting Multiplicity Bias In Off-policy Reinforcement Learning (2021) 0.00
- CUER: Corrected Uniform Experience Replay For Off-policy Continuous Deep Reinforcement Learning Algorithms (2024) 0.00
- Map-based Experience Replay: A Memory-efficient Solution To Catastrophic Forgetting In Reinforcement Learning (2023) 9.23