Mapping Out The Space Of Human Feedback For Reinforcement Learning: A Conceptual Framework
2024 · Yannick Metz, David Lindner, Raphaël Baur, et al.
Abstract
Reinforcement Learning from Human Feedback (RLHF) has become a powerful tool to fine-tune or train agentic machine learning models. Similar to how humans interact in social contexts, we can use many types of feedback to communicate our preferences, intentions, and knowledge to an RL agent. However, applications of human feedback in RL are often limited in scope and disregard human factors. In this work, we bridge the gap between machine learning and human-computer interaction efforts by developing a shared understanding of human feedback in interactive learning scenarios. We first introduce a taxonomy of feedback types for reward-based learning from human feedback based on nine key dimensions. Our taxonomy allows for unifying human-centered, interface-centered, and model-centered aspects. In addition, we identify seven quality metrics of human feedback influencing both the human ability to express feedback and the agent's ability to learn from the feedback. Based on the feedback taxono
Related papers
- A Survey Of Reinforcement Learning From Human Feedback (2023)
- A Survey On Enhancing Reinforcement Learning In Complex Environments: Insights From Human And LLM Feedback (2024)
- Humans Are Not Boltzmann Distributions: Challenges And Opportunities For Modelling Human Feedback And Interaction In Reinforcement Learning (2022)
- Perspectives On The Social Impacts Of Reinforcement Learning With Human Feedback (2023)
- Towards Reinforcement Learning From Neural Feedback: Mapping fNIRS Signals To Agent Performance (2025)
- Improving Multimodal Interactive Agents With Reinforcement Learning From Human Feedback (2022)
- Aligning Humans And Robots Via Reinforcement Learning From Implicit Human Feedback (2025)
- The Alignment Ceiling: Objective Mismatch In Reinforcement Learning From Human Feedback (2023)