How Far I'll Go: Offline Goal-conditioned Reinforcement Learning Via \(f\)-advantage Regression
2022 · Yecheng Jason Ma, Jason Yan, Dinesh Jayaraman, et al.
Abstract
Offline goal-conditioned reinforcement learning (GCRL) promises general-purpose skill learning in the form of reaching diverse goals from purely offline datasets. We propose \(\textbf{Go}\)al-conditioned \(f\)-\(\textbf{A}\)dvantage \(\textbf{R}\)egression (GoFAR), a novel regression-based offline GCRL algorithm derived from a state-occupancy matching perspective; the key intuition is that the goal-reaching task can be formulated as a state-occupancy matching problem between a dynamics-abiding imitator agent and an expert agent that directly teleports to the goal. In contrast to prior approaches, GoFAR does not require any hindsight relabeling and enjoys uninterleaved optimization for its value and policy networks. These distinct features confer on GoFAR much better offline performance and stability, as well as a statistical performance guarantee that is unattainable for prior methods. Furthermore, we demonstrate that GoFAR's training objectives can be re-purposed to learn an agen
Related papers
- Provably Efficient Offline Goal-conditioned Reinforcement Learning With General Function Approximation And Single-policy Concentrability (2023)
- SMORE: Score Models For Offline Goal-conditioned Reinforcement Learning (2023)
- OGBench: Benchmarking Offline Goal-conditioned RL (2024)
- Goal-conditioned Data Augmentation For Offline Reinforcement Learning (2024)
- Boosting Offline Reinforcement Learning With Residual Generative Modeling (2021)
- Offline Fictitious Self-play For Competitive Games (2024)
- Hundreds Guide Millions: Adaptive Offline Reinforcement Learning With Expert Guidance (2023)
- FAWAC: Feasibility Informed Advantage Weighted Regression For Persistent Safety In Offline Reinforcement Learning (2024)