Reinforcement Learning With A Focus On Adjusting Policies To Reach Targets
2024 · Akane Tsuboya, Yu Kono, Tatsuji Takahashi
Abstract
A reinforcement learning agent aims to discover better actions through exploration. However, typical exploration techniques aim to maximize rewards, which often makes both exploration and learning costly. We propose a novel deep reinforcement learning method that prioritizes achieving an aspiration level over maximizing expected return. The method flexibly adjusts the degree of exploration according to the proportion of the target achieved. In experiments on a motion control task and a navigation task, it achieved returns equal to or greater than those of other standard methods. Our analysis showed two things: the method flexibly adjusts the scope of exploration, and it has the potential to let the agent adapt to non-stationary environments. These findings suggest that the method may improve exploration efficiency in practical applications of reinforcement learning.
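The abstract does not give the update rule, so the following is only a minimal sketch of the general idea of tying exploration to target achievement, not the authors' method. It assumes an epsilon-style exploration rate, a positive aspiration level, and a linear mapping; the function name `exploration_rate` and the parameters `eps_max` and `eps_min` are hypothetical.

```python
def exploration_rate(recent_return: float, aspiration: float,
                     eps_max: float = 1.0, eps_min: float = 0.01) -> float:
    """Map target achievement to an exploration rate (hypothetical rule).

    Assumption: the closer the recent return is to the aspiration level,
    the less the agent needs to explore.
    """
    if aspiration <= 0:
        raise ValueError("this sketch assumes a positive aspiration level")
    # Proportion of the target achieved, clipped to [0, 1].
    achievement = min(max(recent_return / aspiration, 0.0), 1.0)
    # Linear interpolation: eps_max at 0% achievement, eps_min at 100%.
    return eps_min + (eps_max - eps_min) * (1.0 - achievement)


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    for ret in (0.0, 50.0, 100.0, 150.0):
        print(ret, exploration_rate(ret, aspiration=100.0))
```

Under these assumptions the rate falls as the agent nears its target and stays at `eps_min` once the target is met or exceeded. If the returns later drop, for example because the environment changes, the rate rises again.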
Related papers
- Learning When To Switch: Adaptive Policy Selection Via Reinforcement Learning (2025)
- An Agent Design With Goal Reaching Guarantees For Enhancement Of Learning (2024)
- Computationally Efficient Reinforcement Learning: Targeted Exploration Leveraging Simple Rules (2022)
- Learning Adaptive Exploration Strategies In Dynamic Environments Through Informed Policy Regularization (2020)
- Adapting Behaviour For Learning Progress (2019)
- Reward-conditioned Policies (2019)
- Improving Policy Gradient By Exploring Under-appreciated Rewards (2016)
- Dynamic Subgoal-based Exploration Via Bayesian Optimization (2019)