Shaping Advice In Deep Reinforcement Learning
2022 · Baicen Xiao, Bhaskar Ramasubramanian, Radha Poovendran
Abstract
Reinforcement learning involves agents interacting with an environment to complete tasks. When rewards provided by the environment are sparse, agents may not receive immediate feedback on the quality of the actions they take, which hampers policy learning. In this paper, we propose methods to augment the reward signal from the environment with an additional reward, termed shaping advice, in both single-agent and multi-agent reinforcement learning. The shaping advice is specified as a difference of potential functions at consecutive time-steps. Each potential function is a function of the observations and actions of the agents. The use of potential functions is underpinned by the insight that the total potential when starting from any state and returning to the same state is always equal to zero. We show through theoretical analyses and experimental validation that the shaping advice does not distract agents from completing tasks specified by the environment reward. Theoretically, we pr
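The abstract's description of shaping advice as a difference of potentials at consecutive time-steps can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy potential, and the discount factor `gamma` (set to 1 here so that the closed-loop property is exact) are all assumptions made for illustration.

```python
from typing import Callable, Hashable, Sequence, Tuple

# One time-step of experience: (observation, action). Types are illustrative.
Step = Tuple[Hashable, Hashable]
Potential = Callable[[Hashable, Hashable], float]


def shaping_advice(phi: Potential, prev: Step, curr: Step, gamma: float = 1.0) -> float:
    """Advice for one transition: gamma * phi(curr) - phi(prev).

    The discount factor `gamma` is an assumption; the abstract only says
    "difference of potential functions at consecutive time-steps".
    """
    return gamma * phi(*curr) - phi(*prev)


def total_advice(phi: Potential, trajectory: Sequence[Step], gamma: float = 1.0) -> float:
    """Sum the per-transition advice along a trajectory."""
    return sum(
        shaping_advice(phi, trajectory[t], trajectory[t + 1], gamma)
        for t in range(len(trajectory) - 1)
    )


def toy_potential(obs: int, action: str) -> float:
    """Hypothetical potential on a 1-D line: closer to position 3 is better,
    with a small bonus for choosing to move right."""
    return -abs(3 - obs) + (0.5 if action == "right" else 0.0)


# A trajectory that starts and ends at the same (observation, action) pair.
loop = [(0, "right"), (1, "right"), (2, "left"), (1, "left"), (0, "right")]

# With gamma = 1 the sum telescopes to phi(last) - phi(first), which is zero
# for a closed loop, so the advice cannot be farmed by cycling.
loop_total = total_advice(toy_potential, loop)

# The shaped reward adds the advice to the (here sparse, zero) environment reward.
env_reward = 0.0
shaped_reward = env_reward + shaping_advice(toy_potential, loop[0], loop[1])
```

In this toy setting the first transition moves the agent toward the goal and earns positive advice even though the environment reward is zero, while the advice summed around the whole loop cancels out.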
Related papers
- Environment Shaping In Reinforcement Learning Using State Abstraction (2020)
- Influencing Reinforcement Learning Through Natural Language Guidance (2021)
- Subgoal-based Reward Shaping To Improve Efficiency In Reinforcement Learning (2021)
- Highly Efficient Self-adaptive Reward Shaping For Reinforcement Learning (2024)
- Bandit-based Policy Invariant Explicit Shaping For Incorporating External Advice In Reinforcement Learning (2023)
- BAMDP Shaping: A Unified Framework For Intrinsic Motivation And Reward Shaping (2024)
- Learning Shaping Strategies In Human-in-the-loop Interactive Reinforcement Learning (2018)
- Learning To Shape Rewards Using A Game Of Two Partners (2021)