CUDC: A Curiosity-driven Unsupervised Data Collection Method With Adaptive Temporal Distances For Offline Reinforcement Learning
2023 · Chenyu Sun, Hangwei Qian, Chunyan Miao
Abstract
Offline reinforcement learning (RL) aims to learn an effective policy from a pre-collected dataset. Most existing work focuses on developing sophisticated learning algorithms, with less emphasis on improving the data collection process. Moreover, it is even more challenging to go beyond the single-task setting and collect a task-agnostic dataset that allows an agent to perform multiple downstream tasks. In this paper, we propose a Curiosity-driven Unsupervised Data Collection (CUDC) method that expands the feature space using adaptive temporal distances for task-agnostic data collection, ultimately improving learning efficiency and capabilities for multi-task offline RL. To achieve this, CUDC estimates the probability that the k-step future states are reachable from the current states, and adapts how many steps into the future the dynamics model should predict. With this adaptive reachability mechanism in place, the feature representation can be diversified, and the agent can navigate itself to colle
Related papers
- Learning From Sparse Offline Datasets Via Conservative Density Estimation (2024)
- AWAC: Accelerating Online Reinforcement Learning With Offline Datasets (2020)
- Don't Change The Algorithm, Change The Data: Exploratory Data For Offline Reinforcement Learning (2022)
- Data Valuation For Offline Reinforcement Learning (2022)
- D4RL: Datasets For Deep Data-driven Reinforcement Learning (2020)
- Enhancing Offline Reinforcement Learning With Curriculum Learning-based Trajectory Valuation (2025)
- Data-efficient Pipeline For Offline Reinforcement Learning With Limited Data (2022)
- A Dataset Perspective On Offline Reinforcement Learning (2021)