Using Contrastive Samples For Identifying And Leveraging Possible Causal Relationships In Reinforcement Learning
2022 · Harshad Khadilkar, Hardik Meisheri
Abstract
A significant challenge in reinforcement learning is quantifying the complex relationship between actions and long-term rewards. The effects of an action may manifest over a long sequence of state-action pairs, making them hard to pinpoint. In this paper, we propose a method that links transitions showing significant deviations in state with unusually large variations in subsequent rewards. Such transitions are marked as possible causal effects, and the corresponding state-action pairs are added to a separate replay buffer. In addition, we include contrastive samples: transitions that start from a similar state but take a different action. Including this Contrastive Experience Replay (CER) during training is shown to outperform standard value-based methods on 2D navigation tasks. We believe that CER can be useful for a broad class of learning tasks, including any off-policy reinforcement learning algorithm.
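The buffer logic described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no thresholds, distance metric, or sampling ratio, so the z-score cutoffs (`state_z`, `reward_z`), the Euclidean similarity radius (`sim_radius`), the one-step definition of "subsequent reward variation", the `causal_fraction` mixing ratio, and the class name `ContrastiveReplaySketch` are all assumptions made for illustration. Actions are assumed to be discrete integers.

```python
import random
from collections import deque

import numpy as np


def zscore(x, eps=1e-8):
    """Standardise a 1-D array; eps avoids division by zero."""
    return (x - x.mean()) / (x.std() + eps)


class ContrastiveReplaySketch:
    """Hypothetical two-buffer replay; all thresholds are illustrative assumptions."""

    def __init__(self, capacity=10_000, state_z=2.0, reward_z=2.0, sim_radius=0.5):
        self.main = deque(maxlen=capacity)    # ordinary replay buffer
        self.causal = deque(maxlen=capacity)  # flagged and contrastive transitions
        self.state_z, self.reward_z, self.sim_radius = state_z, reward_z, sim_radius

    def add_episode(self, episode):
        """episode: list of (state, action, reward, next_state). Returns flagged indices."""
        self.main.extend(episode)
        if len(episode) < 3:
            return []
        # Magnitude of each state change within the episode.
        state_change = np.array(
            [np.linalg.norm(np.subtract(s2, s)) for s, _, _, s2 in episode]
        )
        # Assumed proxy for "variation in subsequent rewards": |r[t+1] - r[t]|.
        rewards = np.array([r for _, _, r, _ in episode], dtype=float)
        reward_change = np.abs(np.diff(rewards, append=rewards[-1]))
        # Flag transitions where both quantities are unusually large.
        flagged = np.flatnonzero(
            (zscore(state_change) > self.state_z)
            & (zscore(reward_change) > self.reward_z)
        )
        for i in flagged:
            s, a, _, _ = episode[i]
            self.causal.append(episode[i])
            # Contrastive samples: similar start state, different action.
            for t in self.main:
                if t[1] != a and np.linalg.norm(np.subtract(t[0], s)) <= self.sim_radius:
                    self.causal.append(t)
        return flagged.tolist()

    def sample(self, batch_size, causal_fraction=0.25):
        """Mix draws from the causal buffer with draws from the main buffer."""
        n_causal = min(int(batch_size * causal_fraction), len(self.causal))
        n_main = min(batch_size - n_causal, len(self.main))
        return random.sample(list(self.causal), n_causal) + random.sample(
            list(self.main), n_main
        )
```

Because the mixed batch consists of ordinary transition tuples, a sketch like this could be dropped into the sampling step of an off-policy learner such as DQN without changing the update rule, which is consistent with the abstract's claim about off-policy algorithms.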
Related papers
- CCLF: A Contrastive-curiosity-driven Learning Framework For Sample-efficient Reinforcement Learning (2022)
- Experience Replay Using Transition Sequences (2017)
- Introspective Experience Replay: Look Back When Surprised (2022)
- Counterfactual Experience Augmented Off-policy Reinforcement Learning (2025)
- CIER: A Novel Experience Replay Approach With Causal Inference In Deep Reinforcement Learning (2024)
- Replay For Safety (2021)
- Return-based Contrastive Representation Learning For Reinforcement Learning (2021)
- Reducing Action Space For Deep Reinforcement Learning Via Causal Effect Estimation (2025)