Look Beneath The Surface: Exploiting Fundamental Symmetry For Sample-efficient Offline RL
2023 · Peng Cheng, Xianyuan Zhan, Zhihao Wu, et al.
Abstract
Offline reinforcement learning (RL) offers an appealing approach to real-world tasks by learning policies from pre-collected datasets without interacting with the environment. However, the performance of existing offline RL algorithms depends heavily on the scale and state-action space coverage of the datasets. Real-world data collection is often expensive and uncontrollable, leading to small datasets with narrow coverage and posing significant challenges for practical deployments of offline RL. In this paper, we provide a new insight: leveraging the fundamental symmetry of system dynamics can substantially enhance offline RL performance on small datasets. Specifically, we propose a Time-reversal symmetry (T-symmetry) enforced Dynamics Model (TDM), which establishes consistency between a pair of forward and reverse latent dynamics. TDM provides both well-behaved representations for small datasets and a new reliability measure for OOD samples based on compliance with the T-symmetry.
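To make the idea of "compliance with the T-symmetry" concrete, here is a minimal sketch of one way such a consistency score could be computed. It is an illustration under assumptions, not the paper's implementation: the abstract does not give the loss, so the residual form (forward latent change plus reverse latent change should cancel), the function names `t_symmetry_residual`, `forward_fn`, `reverse_fn`, and the toy dynamics are all hypothetical.

```python
# Hypothetical sketch: the residual form and all names are assumptions,
# not taken from the paper.

def t_symmetry_residual(forward_fn, reverse_fn, z, a, z_next):
    """Squared norm of forward_fn(z, a) + reverse_fn(z_next, a).

    forward_fn predicts the latent change when stepping forward from z;
    reverse_fn predicts the latent change when stepping backward from z_next.
    If the pair is time-reversal consistent, the two changes cancel.
    """
    fwd = forward_fn(z, a)
    rev = reverse_fn(z_next, a)
    return sum((f + r) ** 2 for f, r in zip(fwd, rev))


# Toy 2-D latent dynamics (assumed): the change depends only on the action.
def forward_fn(z, a):
    return [1.0 * a[0], 2.0 * a[1]]


def reverse_fn(z_next, a):
    # Exact time reversal of forward_fn.
    return [-1.0 * a[0], -2.0 * a[1]]


def bad_reverse_fn(z_next, a):
    # A reverse model that ignores the dynamics entirely.
    return [0.0, 0.0]


z = [0.0, 0.0]
a = [1.0, 1.0]
z_next = [zi + di for zi, di in zip(z, forward_fn(z, a))]

consistent = t_symmetry_residual(forward_fn, reverse_fn, z, a, z_next)
inconsistent = t_symmetry_residual(forward_fn, bad_reverse_fn, z, a, z_next)
```

In this toy setup the consistent pair gives a residual of zero and the mismatched pair gives a positive residual. A score of this kind could be used both as a training penalty and as a reliability signal, with larger values flagging samples that the learned dynamics cannot explain in a time-reversal-consistent way.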
Related papers
- Koopman Q-learning: Offline Reinforcement Learning Via Symmetries Of Dynamics (2021)
- Beyond Uniform Sampling: Offline Reinforcement Learning With Imbalanced Datasets (2023)
- Towards Data-driven Offline Simulations For Online Reinforcement Learning (2022)
- Exploiting Symmetry In Dynamics For Model-based Reinforcement Learning With Asymmetric Rewards (2024)
- Offline Policy Evaluation For Reinforcement Learning With Adaptively Collected Data (2023)
- When To Trust Your Simulator: Dynamics-aware Hybrid Offline-and-online Reinforcement Learning (2022)
- Leveraging Offline Data In Online Reinforcement Learning (2022)
- Symmetric Replay Training: Enhancing Sample Efficiency In Deep Reinforcement Learning For Combinatorial Optimization (2023)