PlayVirtual: Augmenting Cycle-Consistent Virtual Trajectories for Reinforcement Learning
2021 · Tao Yu, Cuiling Lan, Wenjun Zeng, et al.
Abstract
Learning good feature representations is important for deep reinforcement learning (RL). However, with limited experience, RL often suffers from data inefficiency in training. For unexperienced or less-experienced trajectories (i.e., state-action sequences), the lack of data limits their use for better feature learning. In this work, we propose a novel method, dubbed PlayVirtual, which augments cycle-consistent virtual trajectories to enhance the data efficiency of RL feature representation learning. Specifically, PlayVirtual predicts future states in the latent space from the current state and action using a dynamics model, and then predicts the previous states using a backward dynamics model, which forms a trajectory cycle. Based on this, we augment the actions to generate a large number of virtual state-action trajectories. Being free of ground-truth state supervision, we enforce a trajectory to meet the cycle consistency constraint, which can significantly enhance the data effi
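The trajectory cycle described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the linear residual models `W_f` and `W_b`, the dimensions, the one-hot action encoding, and the squared-error distance are all assumptions chosen for brevity, and every name below is hypothetical. The sketch rolls a latent state forward through K sampled (virtual) actions, rolls it back through the same actions in reverse order, and measures how far the result lands from the starting latent. No ground-truth future states are needed for this loss.

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM, ACTION_DIM, K = 8, 3, 5  # hypothetical sizes

# Hypothetical linear stand-ins for the learned forward and backward
# dynamics models (a real agent would use neural networks here).
W_f = rng.normal(scale=0.1, size=(LATENT_DIM + ACTION_DIM, LATENT_DIM))
W_b = rng.normal(scale=0.1, size=(LATENT_DIM + ACTION_DIM, LATENT_DIM))


def one_hot(action, n=ACTION_DIM):
    """Encode a discrete action index as a one-hot vector."""
    v = np.zeros(n)
    v[action] = 1.0
    return v


def forward_step(z, action):
    """Predict the next latent state from (z_t, a_t)."""
    return z + np.concatenate([z, one_hot(action)]) @ W_f


def backward_step(z_next, action):
    """Predict the previous latent state from (z_{t+1}, a_t)."""
    return z_next + np.concatenate([z_next, one_hot(action)]) @ W_b


def cycle_consistency_loss(z0, actions):
    """Roll forward through `actions`, then backward in reverse order,
    and return the mean squared distance to the starting latent.
    The squared-error distance is an assumption of this sketch."""
    z = z0
    for a in actions:
        z = forward_step(z, a)
    for a in reversed(actions):
        z = backward_step(z, a)
    return float(np.mean((z - z0) ** 2))


# Augment actions: sample several virtual action sequences from one start state.
z0 = rng.normal(size=LATENT_DIM)
virtual_actions = [rng.integers(ACTION_DIM, size=K).tolist() for _ in range(4)]
losses = [cycle_consistency_loss(z0, acts) for acts in virtual_actions]
```

In training, a loss of this kind would be minimized with respect to the encoder and both dynamics models; here the weights are random, so the values only show that the loss is computable for arbitrary sampled action sequences.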
Related papers
- Value-consistent Representation Learning For Data-efficient Reinforcement Learning (2022)
- Accelerating Representation Learning With View-consistent Dynamics In Data-efficient Reinforcement Learning (2022)
- Prioritized Trajectory Replay: A Replay Memory For Data-driven Reinforcement Learning (2023)
- Model-based Trajectory Stitching For Improved Offline Reinforcement Learning (2022)
- Enhancing Offline Reinforcement Learning With Curriculum Learning-based Trajectory Valuation (2025)
- Stable Continual Reinforcement Learning Via Diffusion-based Trajectory Replay (2024)
- Data-efficient Reinforcement Learning With Self-predictive Representations (2020)
- Atradiff: Accelerating Online Reinforcement Learning With Imaginary Trajectories (2024)