Contrastive UCB: Provably Efficient Contrastive Self-supervised Learning In Online Reinforcement Learning
2022 · Shuang Qiu, Lingxiao Wang, Chenjia Bai, et al.
Abstract
In view of its power in extracting feature representation, contrastive self-supervised learning has been successfully integrated into the practice of (deep) reinforcement learning (RL), leading to efficient policy learning in various applications. Despite its tremendous empirical successes, the understanding of contrastive learning for RL remains elusive. To narrow such a gap, we study how RL can be empowered by contrastive learning in a class of Markov decision processes (MDPs) and Markov games (MGs) with low-rank transitions. For both models, we propose to extract the correct feature representations of the low-rank model by minimizing a contrastive loss. Moreover, under the online setting, we propose novel upper confidence bound (UCB)-type algorithms that incorporate such a contrastive loss with online RL algorithms for MDPs or MGs. We further theoretically prove that our algorithm recovers the true representations and simultaneously achieves sample efficiency in learning the optimal
Related papers
- Robust Task Representations For Offline Meta-reinforcement Learning Via Contrastive Learning (2022)
- Contrastive Learning As Goal-conditioned Reinforcement Learning (2022)
- Model-enhanced Contrastive Reinforcement Learning For Sequential Recommendation (2023)
- Reinforcement Learning: A Comparison Of UCB Versus Alternative Adaptive Policies (2019)
- Return-based Contrastive Representation Learning For Reinforcement Learning (2021)
- CCLF: A Contrastive-curiosity-driven Learning Framework For Sample-efficient Reinforcement Learning (2022)
- Contrastive Diffuser: Planning Towards High Return States Via Contrastive Learning (2024)
- Provably Improved Context-based Offline Meta-RL With Attention And Contrastive Learning (2021)