Concentration of Contractive Stochastic Approximation and Reinforcement Learning

Abstract

Using a martingale concentration inequality, concentration bounds "from time \(n_0\) on" are derived for stochastic approximation algorithms with contractive maps and with both martingale difference and Markov noise. These bounds are applied to reinforcement learning algorithms, in particular to asynchronous Q-learning and TD(0).
