Exploiting Estimation Bias In Clipped Double Q-learning For Continuous Control Reinforcement Learning Tasks
2024 · Niccolò Turcato, Alberto Sinigaglia, Alberto Dalla Libera, et al.
Abstract
Continuous control Deep Reinforcement Learning (RL) approaches are known to suffer from estimation biases, leading to suboptimal policies. This paper introduces methods for addressing and exploiting estimation biases in Actor-Critic methods for continuous control tasks, using Deep Double Q-Learning. We design a Bias Exploiting (BE) mechanism that dynamically selects the most advantageous estimation bias during training of the RL agent. Most state-of-the-art Deep RL algorithms can be equipped with the BE mechanism without hindering performance or increasing computational complexity. Our extensive experiments across various continuous control tasks demonstrate the effectiveness of our approaches. We show that RL algorithms equipped with this method can match or surpass their counterparts, particularly in environments where estimation biases significantly impact learning. The results underline the importance of bias exploitation in improving policy learning in RL.
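As a rough illustration of the estimation biases the abstract refers to, the sketch below contrasts two standard bootstrapped targets: the clipped double-Q target, which takes the minimum of two critic estimates and tends toward underestimation, and a single-critic target, which tends toward overestimation. It is not the authors' BE mechanism, whose selection rule is not described in this abstract. The function name `td_targets` and all argument names are hypothetical, chosen only for this example.

```python
def td_targets(rewards, dones, q1_next, q2_next, gamma=0.99):
    """Return (clipped, single) TD targets for a batch of transitions.

    Hypothetical helper for illustration only.
    q1_next and q2_next are the two critics' value estimates for the
    next state-action pair; dones flags terminal transitions.
    """
    clipped, single = [], []
    for r, d, q1, q2 in zip(rewards, dones, q1_next, q2_next):
        # No bootstrapping from terminal transitions.
        discount = 0.0 if d else gamma
        # Clipped double-Q target: pessimistic, uses the smaller estimate.
        clipped.append(r + discount * min(q1, q2))
        # Single-critic target: uses one estimate as-is.
        single.append(r + discount * q1)
    return clipped, single


clipped, single = td_targets(
    rewards=[1.0, 0.0],
    dones=[False, True],
    q1_next=[10.0, 5.0],
    q2_next=[8.0, 7.0],
    gamma=0.5,
)
```

By construction the clipped target never exceeds the single-critic target, so the two targets bracket different bias regimes. A bias-selecting scheme would need some criterion for choosing between such targets during training; that criterion is left out here because the abstract does not specify it.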
Related papers
- Simultaneous Double Q-learning With Conservative Advantage Learning For Actor-critic Methods (2022)
- Automating Control Of Overestimation Bias For Reinforcement Learning (2021)
- Action Candidate Based Clipped Double Q-learning For Discrete And Continuous Action Tasks (2021)
- Mitigating Estimation Bias With Representation Learning In TD Error-driven Regularization (2025)
- Estimation Error Correction In Deep Reinforcement Learning For Deterministic Actor-critic Methods (2021)
- Action Candidate Driven Clipped Double Q-learning For Discrete And Continuous Action Tasks (2022)
- On The Estimation Bias In Double Q-learning (2021)
- Parameter-free Reduction Of The Estimation Bias In Deep Reinforcement Learning For Deterministic Policy Gradients (2021)