Reward Shaping For Human Learning Via Inverse Reinforcement Learning
2020 · Mark A. Rucker, Layne T. Watson, Matthew S. Gerber, et al.
Abstract
Humans are spectacular reinforcement learners, constantly learning from and adjusting to experience and feedback. Unfortunately, this doesn't necessarily mean humans are fast learners. When tasks are challenging, learning can become unacceptably slow. Fortunately, humans do not have to learn tabula rasa, and learning speed can be greatly increased with learning aids. In this work we validate a new type of learning aid -- reward shaping for humans via inverse reinforcement learning (IRL). The goal of this aid is to increase the speed with which humans can learn good policies for specific tasks. Furthermore, this approach complements alternative machine learning techniques such as safety features that try to prevent individuals from making poor decisions. To achieve our results, we first extend a well-known IRL algorithm via kernel methods. Afterwards, we conduct two human subjects experiments using an online game where players have limited time to learn a good policy. We show with statisti
Related papers
- Accounting For Human Learning When Inferring Human Preferences (2020)
- Learning Shaping Strategies In Human-in-the-loop Interactive Reinforcement Learning (2018)
- Modeling And Interpreting Real-world Human Risk Decision Making With Inverse Reinforcement Learning (2019)
- Inverse Reinforcement Learning Without Reinforcement Learning (2023)
- Subgoal-based Reward Shaping To Improve Efficiency In Reinforcement Learning (2021)
- Highly Efficient Self-adaptive Reward Shaping For Reinforcement Learning (2024)
- Iterative Reward Shaping Using Human Feedback For Correcting Reward Misspecification (2023)
- ORSO: Accelerating Reward Design Via Online Reward Selection And Policy Optimization (2024)