Count-based Exploration With The Successor Representation
2018 · Marlos C. Machado, Marc G. Bellemare, Michael Bowling
Abstract
In this paper we introduce a simple approach for exploration in reinforcement learning (RL) that allows us to develop theoretically justified algorithms in the tabular case but that is also extendable to settings where function approximation is required. Our approach is based on the successor representation (SR), which was originally introduced as a representation defining state generalization by the similarity of successor states. Here we show that the norm of the SR, while it is being learned, can be used as a reward bonus to incentivize exploration. In order to better understand this transient behavior of the norm of the SR we introduce the substochastic successor representation (SSR) and we show that it implicitly counts the number of times each state (or feature) has been observed. We use this result to introduce an algorithm that performs as well as some theoretically sample-efficient approaches. Finally, we extend these ideas to a deep RL algorithm and show that it achieves stat
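The idea of using the norm of a partially learned SR as an exploration bonus can be illustrated with a small tabular sketch. This is not the authors' implementation: the function names, the zero initialization, the step size, the discount, and the specific choice of an inverse L1-norm bonus are all assumptions made for illustration. It shows only that a state whose SR row has received few updates has a small norm and therefore a larger bonus.

```python
import numpy as np


def sr_td_update(psi, s, s_next, alpha=0.1, gamma=0.95):
    """Apply one TD(0) update to row s of a tabular SR matrix, in place.

    psi[s, j] estimates the discounted expected number of visits to j
    when starting from s. All hyperparameters are illustrative.
    """
    n_states = psi.shape[0]
    one_hot = np.zeros(n_states)
    one_hot[s] = 1.0
    target = one_hot + gamma * psi[s_next]
    psi[s] += alpha * (target - psi[s])
    return psi


def exploration_bonus(psi, s, beta=1.0, eps=1e-8):
    """Hypothetical bonus: inversely proportional to the L1 norm of psi[s].

    eps guards against division by zero for states never updated.
    """
    return beta / (np.linalg.norm(psi[s], ord=1) + eps)


# Toy demonstration: state 0 is updated many times, state 3 only once.
n_states = 5
psi = np.zeros((n_states, n_states))  # assumed zero initialization

for _ in range(50):
    sr_td_update(psi, s=0, s_next=1)
sr_td_update(psi, s=3, s_next=4)

bonus_frequent = exploration_bonus(psi, 0)
bonus_rare = exploration_bonus(psi, 3)
print(bonus_rare > bonus_frequent)
```

Because the SR row starts at zero and grows with each update, the norm acts as a rough proxy for how often a state has been visited, which is the transient behavior the abstract refers to.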
Related papers
- Exploration In Feature Space For Reinforcement Learning (2017)
- Provably Efficient Exploration For Reinforcement Learning Using Unsupervised Learning (2020)
- Successor Uncertainties: Exploration And Uncertainty In Temporal Difference Learning (2018)
- Neighboring State-based Exploration For Reinforcement Learning (2022)
- Successor Feature Sets: Generalizing Successor Representations Across Policies (2021)
- A Neurally Plausible Model Learns Successor Representations In Partially Observable Environments (2019)
- Minimax-optimal Reward-agnostic Exploration In Reinforcement Learning (2023)
- Approximate Exploration Through State Abstraction (2018)