SMORe: Score Models for Offline Goal-Conditioned Reinforcement Learning
2023 · Harshit Sikchi, Rohan Chitnis, Ahmed Touati, et al.
Abstract
Offline Goal-Conditioned Reinforcement Learning (GCRL) is tasked with learning to achieve multiple goals in an environment purely from offline datasets using sparse reward functions. Offline GCRL is pivotal for developing generalist agents capable of leveraging pre-existing datasets to learn diverse and reusable skills without hand-engineering reward functions. However, contemporary approaches to GCRL based on supervised learning and contrastive learning are often suboptimal in the offline setting. An alternative perspective on GCRL optimizes for occupancy matching, but necessitates learning a discriminator, which subsequently serves as a pseudo-reward for downstream RL. Inaccuracies in the learned discriminator can cascade, negatively influencing the resulting policy. We present a novel approach to GCRL under a new lens of mixture-distribution matching, leading to our discriminator-free method: SMORe. The key insight is combining the occupancy matching perspective of GCRL with a conve
Related papers
- Provably Efficient Offline Goal-conditioned Reinforcement Learning With General Function Approximation And Single-policy Concentrability (2023)
- MOReL: Model-based Offline Reinforcement Learning (2020)
- OGBench: Benchmarking Offline Goal-conditioned RL (2024)
- How Far I'll Go: Offline Goal-conditioned Reinforcement Learning Via \(f\)-advantage Regression (2022)
- Goal-conditioned Data Augmentation For Offline Reinforcement Learning (2024)
- Boosting Offline Reinforcement Learning With Residual Generative Modeling (2021)
- SAMG: Offline-to-online Reinforcement Learning Via State-action-conditional Offline Model Guidance (2024)
- Improving Zero-shot Generalization In Offline Reinforcement Learning Using Generalized Similarity Functions (2021)