Effects Of Sparse Rewards Of Different Magnitudes In The Speed Of Learning Of Model-based Actor Critic Methods
2020 · Juan Vargas, Lazar Andjelic, Amir Barati Farimani
Abstract
Actor-critic methods with sparse rewards in model-based deep reinforcement learning typically require a deterministic binary reward function that reflects only two possible outcomes: whether, at each step, the goal has been achieved or not. Our hypothesis is that we can influence an agent to learn faster by applying an external environmental pressure during training, which adversely impacts its ability to get higher rewards. As such, we deviate from the classical paradigm of sparse rewards and add a uniformly sampled reward value to the baseline reward to show that (1) the sample efficiency of the training process can be correlated with the adversity experienced during training, (2) it is possible to achieve higher performance in less time and with fewer resources, (3) we can reduce the performance variability experienced seed over seed, (4) there is a maximum point after which more pressure will not generate better results, and (5) random positive incentives have an adverse effect when usin
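The reward modification described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the -1/0 baseline convention, and the choice to perturb every step are assumptions made for illustration. A negative sampling range plays the role of the "pressure"; a positive range corresponds to the "random positive incentives" the abstract mentions.

```python
import random


def sparse_reward(goal_achieved: bool) -> float:
    """Baseline binary sparse reward.

    Assumption: 0 when the goal is achieved at this step, -1 otherwise.
    The abstract only says the reward is binary; this convention is illustrative.
    """
    return 0.0 if goal_achieved else -1.0


def perturbed_reward(goal_achieved: bool, low: float, high: float,
                     rng: random.Random) -> float:
    """Baseline reward plus a value sampled uniformly from [low, high].

    Hypothetical helper: the sampling range and the decision to apply the
    perturbation at every step are assumptions, not details from the paper.
    """
    if low > high:
        raise ValueError("low must not exceed high")
    return sparse_reward(goal_achieved) + rng.uniform(low, high)


# Example: adverse pressure sampled from [-0.5, 0] on a step that missed the goal.
rng = random.Random(0)
reward = perturbed_reward(goal_achieved=False, low=-0.5, high=0.0, rng=rng)
```

Passing a seeded `random.Random` instance keeps the perturbation reproducible per seed, which matters when comparing seed-over-seed variability.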
Related papers
- Monte Carlo Augmented Actor-critic For Sparse Reward Deep Reinforcement Learning From Suboptimal Demonstrations (2022)
- Studying The Interplay Between The Actor And Critic Representations In Reinforcement Learning (2025)
- Effects Of Spectral Normalization In Multi-agent Reinforcement Learning (2022)
- Boosting Exploration In Actor-critic Algorithms By Incentivizing Plausible Novel States (2022)
- Honey, I Shrunk The Actor: A Case Study On Preserving Performance With Smaller Actors In Actor-critic RL (2021)
- Overestimation, Overfitting, And Plasticity In Actor-critic: The Bitter Lesson Of Reinforcement Learning (2024)
- Adaptive Symmetric Reward Noising For Reinforcement Learning (2019)
- ACE: Off-policy Actor-critic With Causality-aware Entropy Regularization (2024)