Asymptotic Bias Of Stochastic Gradient Search
2017 · Vladislav B. Tadic, Arnaud Doucet
Abstract
The asymptotic behavior of the stochastic gradient algorithm with a biased gradient estimator is analyzed. Using arguments from dynamical systems theory (chain recurrence) and differential geometry (the Yomdin theorem and the Lojasiewicz inequality), tight bounds on the asymptotic bias of the iterates generated by such an algorithm are derived. The results hold under mild conditions and cover a broad class of high-dimensional nonlinear algorithms. Using these results, the asymptotic properties of policy-gradient (reinforcement) learning and adaptive population Monte Carlo sampling are studied. The same results are also used to analyze the asymptotic behavior of recursive maximum split-likelihood estimation in hidden Markov models.
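To make the notion of asymptotic bias concrete, the following is a minimal toy sketch, not taken from the paper. It assumes a one-dimensional objective f(x) = x^2 / 2 (minimizer at 0), a gradient estimator with a constant additive bias plus Gaussian noise, and step sizes 1/n; the function name biased_sgd and all parameter values are hypothetical choices for illustration. In this toy setting the iterates settle near -bias rather than at the true minimizer, so the distance to the minimizer shrinks as the gradient bias shrinks.

```python
import random

def biased_sgd(bias, steps=20000, seed=0):
    """Iterate x_{n+1} = x_n - a_n * (f'(x_n) + bias + noise) for f(x) = x^2 / 2."""
    rng = random.Random(seed)
    x = 5.0  # arbitrary starting point (assumption)
    for n in range(1, steps + 1):
        step = 1.0 / n                    # diminishing step size a_n = 1/n
        noise = rng.gauss(0.0, 1.0)       # zero-mean estimation noise
        grad_estimate = x + bias + noise  # true gradient is x; estimator is shifted by `bias`
        x -= step * grad_estimate
    return x

if __name__ == "__main__":
    for bias in (0.0, 0.1, 0.5):
        x_final = biased_sgd(bias)
        # Distance to the true minimizer (0) should be roughly |bias|.
        print(f"bias={bias:.1f}  final iterate={x_final:+.3f}  "
              f"distance to minimizer={abs(x_final):.3f}")
```

The quadratic objective and the constant bias are chosen only because the limit point can be worked out by hand; the paper's setting (state-dependent bias, high-dimensional nonlinear objectives) is far more general than this sketch.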
Related papers
- Non-asymptotic Analysis Of Biased Stochastic Approximation Scheme (2019)
- On The Second-order Convergence Of Biased Policy Gradient Algorithms (2023)
- Asynchronous Stochastic Approximations With Asymptotically Biased Errors And Deep Multi-agent Learning (2018)
- On The Convergence Of Discounted Policy Gradient Methods (2022)
- A Temporal-difference Approach To Policy Gradient Estimation (2022)
- A Hybrid Stochastic Policy Gradient Algorithm For Reinforcement Learning (2020)
- On The Convergence Of Consensus Algorithms With Markovian Noise And Gradient Bias (2020)
- Constant Stepsize Q-learning: Distributional Convergence, Bias And Extrapolation (2024)