Subgoal-based Reward Shaping To Improve Efficiency In Reinforcement Learning
2021 · Takato Okudo, Seiji Yamada
Abstract
Reinforcement learning, which acquires a policy that maximizes long-term rewards, has been actively studied. Unfortunately, it is often too slow and difficult to use in practical situations because the state-action space becomes huge in real environments. Many studies have incorporated human knowledge into reinforcement learning. Human knowledge of trajectories is often used, but providing it can require a human to control an AI agent, which can be difficult. Knowledge of subgoals may lessen this requirement because humans need only consider a few representative states on an optimal trajectory in their minds. Rewards are the essential factor for learning efficiency, and potential-based reward shaping is a basic method for enriching them. However, it is often difficult to incorporate subgoals into potential-based reward shaping to accelerate learning, because appropriate potentials are not intuitive for humans. We extend potential-based reward shaping and propose a subgoa
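To make the potential-based reward shaping mentioned in the abstract concrete, here is a minimal sketch of the standard form, in which a shaping term F(s, s') = γΦ(s') − Φ(s) is added to the environment reward. It is not the authors' subgoal-based method. The five-state chain, the subgoal at state 2, and the names `phi`, `shaping_reward`, and `discounted_shaping_sum` are hypothetical, chosen only for illustration.

```python
GAMMA = 0.9      # discount factor (assumed value)
SUBGOAL = 2      # hypothetical subgoal state on a chain 0..4


def phi(state: int) -> float:
    """Hypothetical potential: 1.0 once the subgoal has been reached or passed."""
    return 1.0 if state >= SUBGOAL else 0.0


def shaping_reward(s: int, s_next: int, gamma: float = GAMMA) -> float:
    """Standard potential-based shaping term F(s, s') = gamma * phi(s') - phi(s)."""
    return gamma * phi(s_next) - phi(s)


def discounted_shaping_sum(trajectory: list[int], gamma: float = GAMMA) -> float:
    """Discounted sum of shaping terms along a trajectory of states."""
    return sum(
        (gamma ** t) * shaping_reward(s, s_next, gamma)
        for t, (s, s_next) in enumerate(zip(trajectory, trajectory[1:]))
    )


if __name__ == "__main__":
    # Stepping onto the subgoal earns a positive bonus on top of the env reward.
    print(shaping_reward(1, 2))
    # The discounted shaping terms telescope to gamma^T * phi(s_T) - phi(s_0).
    print(discounted_shaping_sum([0, 1, 2, 3, 4]))
```

Because the shaping terms telescope along any trajectory, their discounted sum depends only on the potentials at the start and end states. The difficulty the abstract points to is choosing the values of `phi` by hand, which this sketch simply assumes.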
Related papers
- Reward Shaping With Dynamic Trajectory Aggregation (2021)
- A New Potential-based Reward Shaping For Reinforcement Learning Agent (2019)
- Shaping Advice In Deep Reinforcement Learning (2022)
- Highly Efficient Self-adaptive Reward Shaping For Reinforcement Learning (2024)
- Environment Shaping In Reinforcement Learning Using State Abstraction (2020)
- Reward Shaping For Human Learning Via Inverse Reinforcement Learning (2020)
- Learning To Shape Rewards Using A Game Of Two Partners (2021)
- BAMDP Shaping: A Unified Framework For Intrinsic Motivation And Reward Shaping (2024)