Rating-based Reinforcement Learning
2023 · Devin White, Mingkang Wu, Ellen Novoseller, et al.
Abstract
This paper develops a novel rating-based reinforcement learning approach that uses human ratings as the source of human guidance. Unlike existing preference-based and ranking-based reinforcement learning paradigms, which rely on relative human preferences over sample pairs, the proposed approach relies on human evaluations of individual trajectories, with no relative comparisons between sample pairs. It builds on a new prediction model for human ratings and a novel multi-class loss function. We conduct several experimental studies with both synthetic and real human ratings to evaluate the effectiveness and benefits of the approach.
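The abstract does not spell out the prediction model or the loss, so the following is only a minimal sketch of what a multi-class loss over individually rated trajectories could look like. It assumes a plain softmax cross-entropy, and the mapping from a predicted return to class scores (`return_to_logits`, with its `centers` and `scale` parameters) is a hypothetical stand-in, not the paper's model.

```python
import numpy as np

def return_to_logits(returns, centers, scale=10.0):
    """Hypothetical mapping from predicted returns to rating-class scores.

    Assumption (not from the paper): each rating class has a center in
    normalized-return space, and a class scores higher the closer the
    trajectory's predicted return is to that center.
    """
    returns = np.asarray(returns, dtype=float)[:, None]   # shape (N, 1)
    centers = np.asarray(centers, dtype=float)[None, :]   # shape (1, K)
    return -scale * (returns - centers) ** 2               # shape (N, K)

def rating_cross_entropy(logits, ratings):
    """Mean softmax cross-entropy between class scores and human ratings.

    logits:  (N, K) scores, one row per trajectory.
    ratings: (N,) integer rating classes in [0, K).
    """
    logits = np.asarray(logits, dtype=float)
    ratings = np.asarray(ratings, dtype=int)
    # Subtract the row-wise max before exponentiating for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick out the log-probability of the class each human rater chose.
    return float(-log_probs[np.arange(len(ratings)), ratings].mean())

# Each trajectory is rated on its own; no pairwise comparison is involved.
predicted_returns = [0.1, 0.9]   # normalized returns from a reward model
class_centers = [0.0, 0.5, 1.0]  # three rating classes: bad, neutral, good
logits = return_to_logits(predicted_returns, class_centers)
loss = rating_cross_entropy(logits, ratings=[0, 2])
```

The point of the sketch is the data shape: every training example is a single trajectory with one class label, whereas a preference-based loss would consume pairs of trajectories with a label for which one is preferred.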
Related papers
- Reinforcement Learning From Diverse Human Preferences (2023)
- Reward Learning From Human Preferences And Demonstrations In Atari (2018)
- Reinforcement Learning With Human Advice: A Survey (2020)
- A Survey Of Reinforcement Learning From Human Feedback (2023)
- Ra-pbrl: Provably Efficient Risk-aware Preference-based Reinforcement Learning (2024)
- Improving Multimodal Interactive Agents With Reinforcement Learning From Human Feedback (2022)
- Principled Reinforcement Learning With Human Feedback From Pairwise Or \(k\)-wise Comparisons (2023)
- Elo-rated Sequence Rewards: Advancing Reinforcement Learning Models (2024)