Abstract

arXiv:2312.12339v3 Announce Type: replace Abstract: Understanding visual inputs for a given task amidst varied changes is a key challenge posed by visual reinforcement learning agents. We propose \textit\{Value Explicit Pretraining\} (VEP), a method that learns generalizable representations for transfer reinforcement learning. VEP enables efficient learning of new tasks that share similar objectives as previously learned tasks, by learning an encoder that trains representations to be invariant to changes in environment dynamics and appearance. To pretrain the encoder with \textit\{suboptimal unlabeled demonstration data\} (sequence of observations and sparse reward signals), we use a self-supervised contrastive loss that enables the model to relate states across different tasks based on the Monte Carlo value estimate that is reflective of task progress, resulting in temporally smooth representations that capture the objective of the task. A major difference between our method and the

Authors

(none)

Tags

  • Value-Based

Stats

  • citations0
  • S2 citationsβ€”
  • github stars0
  • HF likes0
  • heat score0.00
  • arxiv keylekkala2026value

Related papers