Autoregressive Policies For Continuous Control Deep Reinforcement Learning
2019 · Dmytro Korenkevych, A. Rupam Mahmood, Gautham Vasan, et al.
Abstract
Reinforcement learning algorithms rely on exploration to discover new behaviors, which is typically achieved by following a stochastic policy. In continuous control tasks, policies with a Gaussian distribution have been widely adopted. Gaussian exploration, however, does not produce the smooth trajectories that generally correspond to safe and rewarding behaviors in practical tasks. In addition, Gaussian policies do not explore an environment effectively and become increasingly inefficient as the action rate increases. This contributes to the low sample efficiency often observed in learning continuous control tasks. We introduce a family of stationary autoregressive (AR) stochastic processes to facilitate exploration in continuous control domains. We show that the proposed processes possess two desirable features: successive process observations are temporally coherent, with a continuously adjustable degree of coherence, and the stationary distribution of the process is standard normal.
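The two properties named in the abstract can be illustrated with the simplest member of such a family, a first-order process x_t = alpha * x_{t-1} + sqrt(1 - alpha^2) * eps_t with eps_t drawn from N(0, 1). This is only a minimal sketch, not the authors' construction: the function name `ar1_noise`, the helper `lag1_corr`, and the parameter name `alpha` are assumptions made for illustration. If x_{t-1} has unit variance, then Var(x_t) = alpha^2 + (1 - alpha^2) = 1, so the stationary distribution stays standard normal while `alpha` sets the correlation between successive samples.

```python
import numpy as np


def ar1_noise(n_steps, alpha, rng):
    """Sample a stationary AR(1) sequence whose marginals are N(0, 1).

    alpha in [0, 1) is a hypothetical coherence parameter:
    alpha = 0 gives white Gaussian noise, alpha near 1 gives smooth noise.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    x = np.empty(n_steps)
    # Draw the first sample from the stationary distribution itself.
    x[0] = rng.standard_normal()
    # Scaling the innovation by sqrt(1 - alpha^2) keeps the variance at 1.
    scale = np.sqrt(1.0 - alpha ** 2)
    for t in range(1, n_steps):
        x[t] = alpha * x[t - 1] + scale * rng.standard_normal()
    return x


def lag1_corr(x):
    """Empirical correlation between consecutive samples."""
    return np.corrcoef(x[:-1], x[1:])[0, 1]


rng = np.random.default_rng(0)
white = ar1_noise(200_000, alpha=0.0, rng=rng)
smooth = ar1_noise(200_000, alpha=0.9, rng=rng)

print("variance:", white.var(), smooth.var())
print("lag-1 correlation:", lag1_corr(white), lag1_corr(smooth))
```

Both sequences should show a sample variance close to 1, while the lag-1 correlation should be close to the chosen `alpha`. That is the sense in which coherence is adjustable without changing the marginal distribution of the noise.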
Related papers
- Categorical Policies: Multimodal Policy Learning And Exploration In Continuous Control (2025)
- Beyond Distributions: Geometric Action Control For Continuous Reinforcement Learning (2025)
- Proximal Policy Optimization With Continuous Bounded Action Space Via The Beta Distribution (2021)
- Asynchronous Episodic Deep Deterministic Policy Gradient: Towards Continuous Control In Computationally Complex Environments (2019)
- Accuracy Of Discretely Sampled Stochastic Policies In Continuous-time Reinforcement Learning (2025)
- Autoregressive Dynamics Models For Offline Policy Evaluation And Optimization (2021)
- Attraction-repulsion Actor-critic For Continuous Control Reinforcement Learning (2019)
- Model-based Reinforcement Learning For Control Under Time-varying Dynamics (2026)