BiTrajDiff: Bidirectional Trajectory Generation With Diffusion Models For Offline Reinforcement Learning
2025 · Yunpeng Qing, Yixiao Chi, Shuo Chen, et al.
Abstract
Recent advances in offline Reinforcement Learning (RL) have proven that effective policy learning can benefit from imposing conservative constraints on pre-collected datasets. However, such static datasets often exhibit distribution bias, resulting in limited generalizability. To address this limitation, a straightforward solution is data augmentation (DA), which leverages generative models to enrich data distribution. Despite the promising results, current DA techniques focus solely on reconstructing future trajectories from given states, while ignoring the exploration of history transitions that reach them. This single-direction paradigm inevitably hinders the discovery of diverse behavior patterns, especially those leading to critical states that may have yielded high-reward outcomes. In this work, we introduce Bidirectional Trajectory Diffusion (BiTrajDiff), a novel DA framework for offline RL that models both future and history trajectories from any intermediate states. Specifical
Related papers
- ATraDiff: Accelerating Online Reinforcement Learning With Imaginary Trajectories (2024) 0.00
- DiffStitch: Boosting Offline Reinforcement Learning With Diffusion-based Trajectory Stitching (2024) 0.00
- Enhancing Decision Transformer With Diffusion-based Trajectory Branch Generation (2024) 0.00
- Long-horizon Rollout Via Dynamics Diffusion For Offline Reinforcement Learning (2024) 1.81
- Stable Continual Reinforcement Learning Via Diffusion-based Trajectory Replay (2024) 0.00
- Enhancing Offline Reinforcement Learning With Curriculum Learning-based Trajectory Valuation (2025) 0.00
- DiffPoGAN: Diffusion Policies With Generative Adversarial Networks For Offline Reinforcement Learning (2024) 0.00
- Model-based Trajectory Stitching For Improved Offline Reinforcement Learning (2022) 0.00