Improving Multimodal Interactive Agents With Reinforcement Learning From Human Feedback
2022 · Josh Abramson, Arun Ahuja, Federico Carnevale, et al.
Abstract
An important goal in artificial intelligence is to create agents that can both interact naturally with humans and learn from their feedback. Here we demonstrate how to use reinforcement learning from human feedback (RLHF) to improve upon simulated, embodied agents trained to a base level of competency with imitation learning. First, we collected data of humans interacting with agents in a simulated 3D world. We then asked annotators to record moments where they believed that agents either progressed toward or regressed from their human-instructed goal. Using this annotation data we leveraged a novel method - which we call "Inter-temporal Bradley-Terry" (IBT) modelling - to build a reward model that captures human judgments. Agents trained to optimise rewards delivered from IBT reward models improved with respect to all of our metrics, including subsequent human judgment during live interactions with agents. Altogether our results demonstrate how one can successfully leverage human judg
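The abstract does not spell out how IBT modelling works, so the following is only a minimal sketch of the general idea it names: a standard Bradley-Terry preference model applied to two moments of the same episode. It assumes a reward model that assigns a scalar score to each timestep, and annotations that mark a later moment as progress (label 1) or regress (label 0) relative to an earlier one. The names `bt_preference_prob` and `inter_temporal_bt_loss`, and the annotation format, are hypothetical and are not taken from the paper.

```python
import math


def _sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def bt_preference_prob(score_later: float, score_earlier: float) -> float:
    """Bradley-Terry probability that the later moment is judged better."""
    return _sigmoid(score_later - score_earlier)


def inter_temporal_bt_loss(scores, annotations) -> float:
    """Mean negative log-likelihood of the annotations under the model.

    scores: per-timestep scalar outputs of a (hypothetical) reward model.
    annotations: list of (t_earlier, t_later, label) tuples, where label is
        1 for annotated progress and 0 for annotated regress (assumed format).
    """
    eps = 1e-12  # keeps log() away from zero
    total = 0.0
    for t_earlier, t_later, label in annotations:
        p = bt_preference_prob(scores[t_later], scores[t_earlier])
        p = min(max(p, eps), 1.0 - eps)
        total -= label * math.log(p) + (1 - label) * math.log(1.0 - p)
    return total / len(annotations)


# Toy episode: progress from step 0 to 1, then regress from step 1 to 2.
annotations = [(0, 1, 1), (1, 2, 0)]
consistent_scores = [0.0, 1.0, 0.5]    # rises, then falls
inconsistent_scores = [1.0, 0.0, 0.5]  # falls, then rises
loss_consistent = inter_temporal_bt_loss(consistent_scores, annotations)
loss_inconsistent = inter_temporal_bt_loss(inconsistent_scores, annotations)
```

In this toy setup, scores that rise on annotated progress and fall on annotated regress give a lower loss than scores that move the opposite way, which is the property a reward model trained this way would be pushed toward.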
Related papers
- Mapping Out The Space Of Human Feedback For Reinforcement Learning: A Conceptual Framework (2024)
- Human AI Interaction Loop Training: New Approach For Interactive Reinforcement Learning (2020)
- Assessing Human Interaction In Virtual Reality With Continually Learning Prediction Agents Based On Reinforcement Learning Algorithms: A Pilot Study (2021)
- A Survey Of Reinforcement Learning From Human Feedback (2023)
- Towards Reinforcement Learning From Neural Feedback: Mapping fNIRS Signals To Agent Performance (2025)
- A Survey On Enhancing Reinforcement Learning In Complex Environments: Insights From Human And LLM Feedback (2024)
- Perspectives On The Social Impacts Of Reinforcement Learning With Human Feedback (2023)
- Humans Are Not Boltzmann Distributions: Challenges And Opportunities For Modelling Human Feedback And Interaction In Reinforcement Learning (2022)