Sample Efficiency In Sparse Reinforcement Learning: Or Your Money Back
2020 · Trevor A. McInroe
Abstract
Sparse rewards present a difficult problem in reinforcement learning and may be inevitable in certain domains with complex dynamics such as real-world robotics. Hindsight Experience Replay (HER) is a recent replay memory development that allows agents to learn in sparse settings by altering memories to show them as successful even though they may not be. While, empirically, HER has shown some success, it does not provide guarantees around the makeup of samples drawn from an agent's replay memory. This may result in minibatches that contain only memories with zero-valued rewards or agents learning an undesirable policy that completes HER-adjusted goals instead of the actual goal. In this paper, we introduce Or Your Money Back (OYMB), a replay memory sampler designed to work with HER. OYMB improves training efficiency in sparse settings by providing a direct interface to the agent's replay memory that allows for control over minibatch makeup, as well as a preferential lookup scheme tha
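The concern the abstract raises, minibatches made up only of zero-reward memories, can be illustrated with a small sketch. This is not the OYMB algorithm itself, whose details are not given here; it is a minimal illustration of controlling minibatch makeup, and the names `Transition`, `sample_with_reward_quota`, and `min_rewarded` are hypothetical.

```python
import random
from typing import List, NamedTuple, Sequence


class Transition(NamedTuple):
    """Hypothetical replay-memory record (not taken from the paper)."""
    state: int
    action: int
    reward: float
    next_state: int


def sample_with_reward_quota(
    memory: Sequence[Transition],
    batch_size: int,
    min_rewarded: int,
    rng: random.Random,
) -> List[Transition]:
    """Sample without replacement, reserving slots for nonzero-reward memories.

    The batch contains at least `min_rewarded` nonzero-reward transitions
    whenever that many exist in `memory`; the rest is drawn uniformly.
    """
    if batch_size > len(memory):
        raise ValueError("batch_size exceeds the number of stored memories")

    # Indices of memories whose reward is not zero.
    rewarded = [i for i, t in enumerate(memory) if t.reward != 0.0]

    # Fill the reserved slots first (assumed quota rule, for illustration).
    k = min(min_rewarded, len(rewarded), batch_size)
    chosen = rng.sample(rewarded, k)

    # Fill the remaining slots uniformly from everything not yet chosen.
    taken = set(chosen)
    rest = [i for i in range(len(memory)) if i not in taken]
    chosen += rng.sample(rest, batch_size - k)

    rng.shuffle(chosen)  # avoid a fixed ordering inside the batch
    return [memory[i] for i in chosen]


# Toy memory: 95 zero-reward transitions and 5 rewarded ones.
rng = random.Random(0)
memory = [Transition(i, 0, 0.0, i + 1) for i in range(95)]
memory += [Transition(i, 0, 1.0, i + 1) for i in range(95, 100)]

batch = sample_with_reward_quota(memory, batch_size=16, min_rewarded=4, rng=rng)
rewarded_in_batch = sum(t.reward != 0.0 for t in batch)
```

With uniform sampling, a batch of 16 drawn from this toy memory would often contain no rewarded transition at all; reserving slots removes that possibility at the cost of biasing the batch toward rewarded memories.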
Related papers
- Bias-reduced Hindsight Experience Replay With Virtual Goal Prioritization (2019)
- Large Batch Experience Replay (2021)
- Regret Minimization Experience Replay In Off-policy Reinforcement Learning (2021)
- Replay For Safety (2021)
- Introspective Experience Replay: Look Back When Surprised (2022)
- Backplay: "Man Muss Immer Umkehren" (2018)
- Frugal Actor-critic: Sample Efficient Off-policy Deep Reinforcement Learning Using Unique Experiences (2024)
- Memory Based Trajectory-conditioned Policies For Learning From Sparse Rewards (2019)