VASE: Variational Assorted Surprise Exploration For Reinforcement Learning
2019 · Haitao Xu, Brendan McCane, Lech Szymanski
Abstract
Exploration in environments with continuous control and sparse rewards remains a key challenge in reinforcement learning (RL). Recently, surprise has been used as an intrinsic reward that encourages systematic and efficient exploration. We introduce a new definition of surprise and its RL implementation, named Variational Assorted Surprise Exploration (VASE). VASE models the environment dynamics with a Bayesian neural network trained by variational inference, and it alternates between updating the agent's model and updating its policy. Our experiments show that in continuous-control, sparse-reward environments VASE outperforms other surprise-based exploration techniques.
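To make the idea of surprise as an intrinsic reward concrete, here is a minimal sketch. It does not use the surprise definition introduced by VASE, which the abstract does not spell out. It assumes a generic stand-in instead: the negative log-likelihood of the observed next state under a diagonal Gaussian prediction. The function names and the weight `eta` are hypothetical and chosen only for illustration.

```python
import math


def gaussian_surprise(next_state, pred_mean, pred_std):
    """Negative log-likelihood of the observed next state under a diagonal
    Gaussian prediction. A generic surprise measure, NOT the VASE definition."""
    nll = 0.0
    for x, mu, sigma in zip(next_state, pred_mean, pred_std):
        nll += 0.5 * math.log(2.0 * math.pi * sigma ** 2)
        nll += (x - mu) ** 2 / (2.0 * sigma ** 2)
    return nll


def shaped_reward(extrinsic, surprise, eta=0.1):
    """Extrinsic reward plus a scaled surprise bonus.
    eta is a hypothetical trade-off weight, not a value from the paper."""
    return extrinsic + eta * surprise


# A transition the model predicted well versus one it predicted badly.
well_predicted = gaussian_surprise([0.0, 1.0], [0.0, 1.0], [1.0, 1.0])
badly_predicted = gaussian_surprise([3.0, -2.0], [0.0, 1.0], [1.0, 1.0])

# With a sparse (zero) extrinsic reward, the bonus alone drives exploration.
bonus_well = shaped_reward(0.0, well_predicted)
bonus_bad = shaped_reward(0.0, badly_predicted)
```

Under this assumed measure, the poorly predicted transition earns the larger bonus, so an agent receiving zero extrinsic reward is still pushed toward states its dynamics model cannot yet explain.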
Related papers
- VIME: Variational Information Maximizing Exploration (2016)
- Beyond Surprise: Improving Exploration Through Surprise Novelty (2023)
- Curiosity-driven Exploration Via Latent Bayesian Surprise (2021)
- Generative Adversarial Exploration For Reinforcement Learning (2022)
- REMAX: Relational Representation For Multi-agent Exploration (2020)
- Learning In Volatile Environments With The Bayes Factor Surprise (2019)
- Never Give Up: Learning Directed Exploration Strategies (2020)
- Efficient And Robust Reinforcement Learning With Uncertainty-based Value Expansion (2019)