On The Convergence And Sample Complexity Analysis Of Deep Q-networks With \(\epsilon\)-greedy Exploration
2023 · Shuai Zhang, Hongkang Li, Meng Wang, et al.
Abstract
This paper provides a theoretical understanding of Deep Q-Network (DQN) with the \(\epsilon\)-greedy exploration in deep reinforcement learning. Despite the tremendous empirical achievement of the DQN, its theoretical characterization remains underexplored. First, the exploration strategy is either impractical or ignored in the existing analysis. Second, in contrast to conventional Q-learning algorithms, the DQN employs the target network and experience replay to acquire an unbiased estimation of the mean-square Bellman error (MSBE) utilized in training the Q-network. However, the existing theoretical analysis of DQNs lacks convergence analysis or bypasses the technical challenges by deploying a significantly overparameterized neural network, which is not computationally efficient. This paper provides the first theoretical convergence and sample complexity analysis of the practical setting of DQNs with \(\epsilon\)-greedy policy. We prove an iterative procedure with decaying \(\epsilon
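The ingredients named in the abstract (\(\epsilon\)-greedy action selection with a decaying \(\epsilon\), a target network, and an MSBE computed over a replay batch) can be illustrated with a minimal sketch. This is not the paper's algorithm: the linear decay schedule, the function names, the default parameters, and the representation of networks as plain callables returning a list of Q-values are all assumptions made for illustration, and terminal-state handling is omitted.

```python
import random


def epsilon_at(step, eps_start=1.0, eps_end=0.05, decay_steps=1000):
    """Linearly decay epsilon from eps_start to eps_end.

    The linear schedule and its defaults are assumptions for illustration.
    """
    frac = min(step / decay_steps, 1.0)
    return eps_start + frac * (eps_end - eps_start)


def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon pick a uniform random action, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def msbe(batch, q_net, target_net, gamma=0.99):
    """Mean-square Bellman error over a batch of (s, a, r, s') transitions.

    Bootstrapped targets come from target_net, which is held fixed while
    q_net is trained. Both are hypothetical callables mapping a state to a
    list of Q-values, one per action.
    """
    total = 0.0
    for state, action, reward, next_state in batch:
        target = reward + gamma * max(target_net(next_state))
        total += (q_net(state)[action] - target) ** 2
    return total / len(batch)
```

In a training loop, one would typically draw `batch` uniformly from a replay buffer, minimize `msbe` with respect to the parameters of `q_net`, and periodically copy those parameters into `target_net`.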
Related papers
- A Theoretical Analysis Of Deep Q-learning (2019) · 0.00
- Sampling Efficient Deep Reinforcement Learning Through Preference-guided Stochastic Exploration (2022) · 8.09
- Convergent And Efficient Deep Q Network Algorithm (2021) · 0.00
- Deep Q-learning: Theoretical Insights From An Asymptotic Analysis (2020) · 10.35
- DQN With Model-based Exploration: Efficient Learning On Environments With Sparse Rewards (2019) · 0.00
- \(\beta\)-DQN: Improving Deep Q-learning By Evolving The Behavior (2025) · 0.00
- Langevin DQN (2020) · 0.00
- Convergence Guarantees For Deep Epsilon Greedy Policy Learning (2021) · 0.00