DQN With Model-based Exploration: Efficient Learning On Environments With Sparse Rewards
2019 · Stephen Zhen Gou, Yuyang Liu
Abstract
We propose Deep Q-Networks (DQN) with model-based exploration, an algorithm that combines model-free and model-based approaches to explore better and to learn more efficiently in environments with sparse rewards. DQN is a general-purpose, model-free algorithm that has been shown to perform well on a variety of tasks, including Atari 2600 games, since it was first proposed by Mnih et al. However, like many other reinforcement learning (RL) algorithms, DQN suffers from poor sample efficiency when rewards are sparse. As a result, most of the transitions stored in the replay memory carry no informative reward signal and contribute little to the training and convergence of the Q-network. One insight is that these transitions can still be used to learn the dynamics of the environment as a supervised learning problem. They also provide information about the distribution of visited states. Our algorithm uses these two observations to perform a one-step planning
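The two observations in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the abstract is cut off before the planning step is described, so the rule used here (prefer the action whose predicted next state is least likely under a Gaussian fit to visited states) is an assumption, as are the 1-D toy environment and every function name below.

```python
import math
import random

# Hypothetical 1-D toy setup (an assumption, not from the paper): the state
# is a float, actions are -1 or +1, and the true dynamics are s' = s + 0.1 * a.
ACTIONS = (-1, 1)


def collect_transitions(n, seed=0):
    """Roll out a random policy; return (s, a, s_next) tuples. Rewards are omitted."""
    rng = random.Random(seed)
    s, data = 0.0, []
    for _ in range(n):
        a = rng.choice(ACTIONS)
        s_next = s + 0.1 * a
        data.append((s, a, s_next))
        s = s_next
    return data


def fit_dynamics(transitions):
    """Least-squares fit of s' - s = w * a, i.e. supervised learning on replay data."""
    num = sum(a * (s_next - s) for s, a, s_next in transitions)
    den = sum(a * a for _, a, _ in transitions)
    return num / den


def fit_state_density(transitions):
    """Fit a 1-D Gaussian to the visited states; return (mean, variance)."""
    states = [s for s, _, _ in transitions]
    mean = sum(states) / len(states)
    var = sum((x - mean) ** 2 for x in states) / len(states)
    return mean, max(var, 1e-8)  # floor the variance to avoid division by zero


def log_density(x, mean, var):
    """Log of the Gaussian density N(mean, var) evaluated at x."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)


def explore_action(s, w, mean, var):
    """One-step lookahead (assumed rule): pick the action whose predicted
    next state has the lowest density under the visited-state model."""
    return min(ACTIONS, key=lambda a: log_density(s + w * a, mean, var))


replay = collect_transitions(500)
w = fit_dynamics(replay)
mean, var = fit_state_density(replay)
```

In this sketch, `explore_action(mean + 1.0, w, mean, var)` selects `+1`, since stepping further from the mean of the visited states leads to a less-visited region. A full agent would presumably mix such exploratory choices with greedy actions from the Q-network, but how the two are combined is not stated in the text available here.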
Related papers
- Sampling Efficient Deep Reinforcement Learning Through Preference-guided Stochastic Exploration (2022)
- Neighboring State-based Exploration For Reinforcement Learning (2022)
- Multi-objective Model-based Policy Search For Data-efficient Learning With Sparse Rewards (2018)
- On The Convergence And Sample Complexity Analysis Of Deep Q-networks With \(\epsilon\)-greedy Exploration (2023)
- Dynamic Subgoal-based Exploration Via Bayesian Optimization (2019)
- Long-term Visitation Value For Deep Exploration In Sparse Reward Reinforcement Learning (2020)
- Langevin DQN (2020)
- \(\beta\)-DQN: Improving Deep Q-learning By Evolving The Behavior (2025)