There Is No Turning Back: A Self-supervised Approach For Reversibility-aware Reinforcement Learning
2021 · Nathan Grinsztajn, Johan Ferret, Olivier Pietquin, et al.
Abstract
We propose to learn to distinguish reversible from irreversible actions for better informed decision-making in Reinforcement Learning (RL). From theoretical considerations, we show that approximate reversibility can be learned through a simple surrogate task: ranking randomly sampled trajectory events in chronological order. Intuitively, pairs of events that are always observed in the same order are likely to be separated by an irreversible sequence of actions. Conveniently, learning the temporal order of events can be done in a fully self-supervised way, which we use to estimate the reversibility of actions from experience, without any priors. We propose two different strategies that incorporate reversibility in RL agents, one strategy for exploration (RAE) and one strategy for control (RAC). We demonstrate the potential of reversibility-aware agents in several environments, including the challenging Sokoban game. In synthetic tasks, we show that we can learn control policies that nev
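The surrogate task described in the abstract, ordering randomly sampled trajectory events, can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes discrete, hashable observations and replaces the learned classifier with a simple counting estimator. The names `sample_ordered_pairs` and `estimate_precedence` are hypothetical. A precedence estimate near 1 for a pair (x, y) means x was always seen before y in the data, which, following the intuition above, suggests the transition between them may be hard to undo.

```python
import random
from collections import defaultdict


def sample_ordered_pairs(trajectory, num_pairs, rng):
    """Sample self-supervised temporal-order examples from one trajectory.

    Returns tuples (x, y, label), where label is 1 if x was observed
    before y in the trajectory and 0 otherwise. The presentation order is
    shuffled so a classifier trained on these examples sees both labels.
    """
    examples = []
    for _ in range(num_pairs):
        # Two distinct time steps, sorted so that i < j.
        i, j = sorted(rng.sample(range(len(trajectory)), 2))
        earlier, later = trajectory[i], trajectory[j]
        if rng.random() < 0.5:
            examples.append((earlier, later, 1))
        else:
            examples.append((later, earlier, 0))
    return examples


def estimate_precedence(examples):
    """Tabular stand-in for a learned precedence classifier (an assumption).

    Maps each pair (x, y) to the fraction of sampled examples in which x
    was observed before y.
    """
    counts = defaultdict(lambda: [0, 0])  # (x, y) -> [x-first count, total]
    for x, y, label in examples:
        counts[(x, y)][0] += label
        counts[(x, y)][1] += 1
        # The same observation also informs the mirrored pair.
        counts[(y, x)][0] += 1 - label
        counts[(y, x)][1] += 1
    return {pair: first / total for pair, (first, total) in counts.items()}


# Toy data: "A" -> "B" is one-way, while "C" and "D" alternate freely.
rng = random.Random(0)
examples = sample_ordered_pairs(["A", "B"], 50, rng)
examples += sample_ordered_pairs(["C", "D", "C", "D"], 200, rng)
precedence = estimate_precedence(examples)
```

In this toy data, "A" is always observed before "B", so the estimate for that pair saturates, whereas "C" and "D" appear in both orders and receive an intermediate value. With high-dimensional observations, the counting table would have to be replaced by a trained model; that part is outside the scope of this sketch.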
Related papers
- Forward-backward Reinforcement Learning (2018) 0.00
- Leave No Trace: Learning To Reset For Safe And Autonomous Reinforcement Learning (2017) 0.00
- Backplay: "Man muss immer umkehren" (2018) 0.00
- Rewriting History With Inverse RL: Hindsight Inference For Policy Improvement (2020) 0.00
- All You Need Is Supervised Learning: From Imitation Learning To Meta-RL With Upside Down RL (2022) 0.00
- REACT: Revealing Evolutionary Action Consequence Trajectories For Interpretable Reinforcement Learning (2024) 2.26
- Backward Curriculum Reinforcement Learning (2022) 0.00
- Reverse Forward Curriculum Learning For Extreme Sample And Demonstration Efficiency In Reinforcement Learning (2024) 0.00