Perspectives On The Social Impacts Of Reinforcement Learning With Human Feedback
2023 · Gabrielle Kaili-May Liu
Abstract
Is it possible for machines to think like humans? And if it is, how should we go about teaching them to do so? As early as 1950, Alan Turing stated that we ought to teach machines in the way of teaching a child. Reinforcement learning with human feedback (RLHF) has emerged as a strong candidate toward allowing agents to learn from human feedback in a naturalistic manner. RLHF is distinct from traditional reinforcement learning as it provides feedback from a human teacher in addition to a reward signal. It has been catapulted into public view by multiple high-profile AI applications, including OpenAI's ChatGPT, DeepMind's Sparrow, and Anthropic's Claude. These highly capable chatbots are already overturning our understanding of how AI interacts with humanity. The wide applicability and burgeoning success of RLHF strongly motivate the need to evaluate its social impacts. In light of recent developments, this paper considers an important question: can RLHF be developed and used without ne
Related papers
- Mapping Out The Space Of Human Feedback For Reinforcement Learning: A Conceptual Framework (2024)
- A Survey Of Reinforcement Learning From Human Feedback (2023)
- Humans Are Not Boltzmann Distributions: Challenges And Opportunities For Modelling Human Feedback And Interaction In Reinforcement Learning (2022)
- Improving Multimodal Interactive Agents With Reinforcement Learning From Human Feedback (2022)
- Human AI Interaction Loop Training: New Approach For Interactive Reinforcement Learning (2020)
- Reinforcement Learning In The Era Of LLMs: What Is Essential? What Is Needed? An RL Perspective On RLHF, Prompting, And Beyond (2023)
- Aligning Humans And Robots Via Reinforcement Learning From Implicit Human Feedback (2025)
- Can You See How I Learn? Human Observers' Inferences About Reinforcement Learning Agents' Learning Processes (2025)