Abstract

This paper establishes the first almost sure convergence rate and the first maximal concentration bound with exponential tails for general contractive stochastic approximation algorithms with Markovian noise. As a corollary, we also obtain convergence rates in \(L^p\). Key to our successes is a novel discretization of the mean ODE of stochastic approximation algorithms using intervals with diminishing (instead of constant) length. As applications, we provide the first almost sure convergence rate for \(Q\)-learning with Markovian samples without count-based learning rates. We also provide the first concentration bound for off-policy temporal difference learning with Markovian samples.

Authors

(none)

Tags

  • Uncategorized

Stats

  • citations0
  • S2 citationsβ€”
  • github stars0
  • HF likes0
  • heat score0.00
  • arxiv keyqian2024almost

Related papers