Shaping Sparse Rewards In Reinforcement Learning: A Semi-supervised Approach
2025 · Wenyun Li, Wenjie Huang, Chen Sun
Abstract
In many real-world scenarios, reward signals for agents are exceedingly sparse, making it challenging to learn an effective reward function for reward shaping. To address this issue, our approach performs reward shaping not only by using non-zero-reward transitions but also by applying *Semi-Supervised Learning* (SSL), combined with a novel data augmentation, to learn trajectory-space representations from the majority of transitions, i.e., zero-reward transitions, thereby improving the efficacy of reward shaping. Experimental results in Atari and robotic manipulation demonstrate that our method outperforms supervised approaches in reward inference, leading to higher agent scores. Notably, in sparser-reward environments, our method achieves up to twice the peak scores of supervised baselines. The proposed double entropy data augmentation enhances performance, showcasing a 15.8% increase in best score over other augmentation m
Related papers
- Highly Efficient Self-adaptive Reward Shaping For Reinforcement Learning (2024)
- Action Guidance: Getting The Best Of Sparse Rewards And Shaped Rewards For Real-time Strategy Games (2020)
- Learning To Shape Rewards Using A Game Of Two Partners (2021)
- Reward Shaping For Happier Autonomous Cyber Security Agents (2023)
- FRESH: Interactive Reward Shaping In High-dimensional State Spaces Using Human Feedback (2020)
- Shaping Advice In Deep Reinforcement Learning (2022)
- Unpacking Reward Shaping: Understanding The Benefits Of Reward Engineering On Sample Complexity (2022)
- Zero Shot Coordination For Sparse Reward Tasks With Diverse Reward Shapings (2026)