Long N-step Surrogate Stage Reward To Reduce Variances Of Deep Reinforcement Learning In Complex Problems
2022 · Junmin Zhong, Ruofan Wu, Jennie Si
Abstract
High variance in reinforcement learning has been shown to impede successful convergence and hurt task performance. Because the reward signal plays an important role in learning behavior, multi-step methods have been considered as a way to mitigate the problem and are believed to be more effective than single-step methods. However, there is a lack of comprehensive and systematic studies demonstrating the effectiveness of multi-step methods in solving highly complex continuous control problems. In this study, we introduce a new long \(N\)-step surrogate stage (LNSS) reward approach that effectively accounts for complex environment dynamics, whereas previous methods are usually feasible only for a limited number of steps. The LNSS method is simple, has low computational cost, and is applicable to value-based or policy gradient reinforcement learning. We systematically evaluate LNSS in OpenAI Gym and DeepMind Control Suite to address some complex benchmark environments that have been challenging to obta
Related papers
- Understanding Multi-step Deep Reinforcement Learning: A Systematic Study Of The DQN Target (2019)
- One Step At A Time: Pros And Cons Of Multi-step Meta-gradient Reinforcement Learning (2021)
- Adaptive Symmetric Reward Noising For Reinforcement Learning (2019)
- Elastic Step DQN: A Novel Multi-step Algorithm To Alleviate Overestimation In Deep Q-Networks (2022)
- A Survey On Enhancing Reinforcement Learning In Complex Environments: Insights From Human And LLM Feedback (2024)
- Multi-step Greedy Reinforcement Learning Algorithms (2019)
- Natural Policy Gradient For Average Reward Non-stationary RL (2025)
- Generalizing Across Multi-objective Reward Functions In Deep Reinforcement Learning (2018)