Asynchronous Episodic Deep Deterministic Policy Gradient: Towards Continuous Control In Computationally Complex Environments
2019 · Zhizheng Zhang, Jiale Chen, Zhibo Chen, et al.
Abstract
Deep Deterministic Policy Gradient (DDPG) has proven to be a successful reinforcement learning (RL) algorithm for continuous control tasks. However, DDPG still suffers from data insufficiency and training inefficiency, especially in computationally complex environments. In this paper, we propose Asynchronous Episodic DDPG (AE-DDPG), an extension of DDPG that achieves more effective learning with less training time. First, we design a modified scheme for asynchronous data collection. Generally, for asynchronous RL algorithms, sample efficiency and/or training stability diminish as the degree of parallelism increases. We consider this problem from the perspectives of both data generation and data utilization. In detail, we redesign experience replay by introducing the idea of episodic control so that the agent can latch onto good trajectories rapidly. In addition, we inject a new type of noise into the action space to enrich exploration behaviors. Ex
Related papers
- ETGL-DDPG: A Deep Deterministic Policy Gradient Algorithm For Sparse Reward Continuous Control (2024)
- Deterministic Policy Gradient For Reinforcement Learning With Continuous Time And State (2025)
- DDPG++: Striving For Simplicity In Continuous-control Off-policy Reinforcement Learning (2020)
- Improved Exploration Through Latent Trajectory Optimization In Deep Deterministic Policy Gradient (2019)
- Deterministic Value-policy Gradients (2019)
- 3DPG: Distributed Deep Deterministic Policy Gradient Algorithms For Networked Multi-agent Systems (2022)
- Deep Reinforcement Learning With Feedback-based Exploration (2019)
- Evolution-guided Policy Gradient In Reinforcement Learning (2018)