On The Sample Complexity And Metastability Of Heavy-tailed Policy Search In Continuous Control
2021 · Amrit Singh Bedi, Anjaly Parayil, Junyu Zhang, et al.
Abstract
Reinforcement learning is a framework for interactive decision-making in which incentives are revealed sequentially over time and no model of the system dynamics is available. Because it scales to continuous spaces, we focus on policy search, where one iteratively improves a parameterized policy with stochastic policy gradient (PG) updates. In tabular Markov Decision Problems (MDPs), under persistent exploration and suitable parameterization, global optimality may be obtained. By contrast, in continuous space, the non-convexity poses a pathological challenge, as evidenced by existing convergence results being mostly limited to stationarity or arbitrary local extrema. To close this gap, we step towards persistent exploration in continuous space through policy parameterizations defined by heavier-tailed distributions governed by a tail-index parameter alpha, which increases the likelihood of jumping in state space. Doing so invalidates smoothness conditions of the score function common to PG. Thus, we establish
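The abstract's claim that heavier tails make large jumps more likely can be illustrated with a small sketch. It compares a Gaussian policy with a Cauchy policy, which is the alpha = 1 member of the alpha-stable family. The paper's actual parameterization is not given in this excerpt, so the function names, the threshold of 4 scale units, and the sample count below are all assumptions chosen for illustration only.

```python
import math
import random


def sample_gaussian_action(mean, scale, rng):
    """Draw an action from a light-tailed Gaussian policy (hypothetical helper)."""
    return rng.gauss(mean, scale)


def sample_cauchy_action(mean, scale, rng):
    """Draw an action from a heavy-tailed Cauchy policy via inverse-CDF sampling."""
    u = rng.random()  # uniform on [0, 1)
    return mean + scale * math.tan(math.pi * (u - 0.5))


def jump_frequency(sampler, mean, scale, threshold, n_samples, seed=0):
    """Fraction of sampled actions farther than threshold * scale from the mean."""
    rng = random.Random(seed)
    jumps = 0
    for _ in range(n_samples):
        action = sampler(mean, scale, rng)
        if abs(action - mean) > threshold * scale:
            jumps += 1
    return jumps / n_samples


# Assumed settings: zero mean, unit scale, "jump" = more than 4 scale units away.
gaussian_rate = jump_frequency(sample_gaussian_action, 0.0, 1.0, 4.0, 100_000)
cauchy_rate = jump_frequency(sample_cauchy_action, 0.0, 1.0, 4.0, 100_000)
print(f"Gaussian: {gaussian_rate:.4f}  Cauchy: {cauchy_rate:.4f}")
```

Under these assumed settings the Cauchy policy should produce far-from-mean actions much more often than the Gaussian one, which is the exploration effect the abstract attributes to a smaller tail index.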
Related papers
- Policy Optimization In A Noisy Neighborhood: On Return Landscapes In Continuous Control (2023) 0.00
- Learning Optimal Deterministic Policies With Stochastic Policy Gradients (2024) 0.00
- Categorical Policies: Multimodal Policy Learning And Exploration In Continuous Control (2025) 0.00
- Accuracy Of Discretely Sampled Stochastic Policies In Continuous-time Reinforcement Learning (2025) 0.00
- Sample Efficient Reinforcement Learning In Continuous State Spaces: A Perspective Beyond Linearity (2021) 0.00
- Policy Search By Target Distribution Learning For Continuous Control (2019) 3.58
- Extremum-seeking Action Selection For Accelerating Policy Optimization (2024) 0.00
- Full Error Analysis Of Policy Gradient Learning Algorithms For Exploratory Linear Quadratic Mean-field Control Problem In Continuous Time With Common Noise (2024) 0.00