Boosting Robustness In Preference-based Reinforcement Learning With Dynamic Sparsity
2024 · Calarina Muslimani, Bram Grooten, Deepak Ranganatha Sastry Mamillapalli, et al.
Abstract
To integrate into human-centered environments, autonomous agents must learn from and adapt to humans in their native settings. Preference-based reinforcement learning (PbRL) can enable this by learning reward functions from human preferences. However, humans live in a world full of diverse information, most of which is irrelevant to completing any particular task. It then becomes essential that agents learn to focus on the subset of task-relevant state features. To that end, this work proposes R2N (Robust-to-Noise), the first PbRL algorithm that leverages principles of dynamic sparse training to learn robust reward models that can focus on task-relevant features. In experiments with a simulated teacher, we demonstrate that R2N can adapt the sparse connectivity of its neural networks to focus on task-relevant features, enabling R2N to significantly outperform several sparse training and PbRL algorithms across simulated robotic environments.
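The abstract does not spell out how R2N adapts its sparse connectivity. As a rough illustration of the general idea behind dynamic sparse training, the sketch below shows one generic prune-and-regrow step: the weakest active connections are dropped and the same number of inactive ones are activated at random. The function name `prune_and_regrow`, the flat-list layout, the magnitude-based pruning rule, and the random regrowth rule are all assumptions made for this example, not details taken from the paper.

```python
import random


def prune_and_regrow(weights, mask, drop_fraction, rng):
    """One hypothetical topology update on a flattened layer.

    weights: list of floats, one per possible connection.
    mask:    list of bools, True where the connection is active.
    Returns new (weights, mask) lists; the inputs are left unchanged and
    the number of active connections stays the same.
    """
    active = [i for i, on in enumerate(mask) if on]
    inactive = [i for i, on in enumerate(mask) if not on]

    # Never drop more connections than can be regrown elsewhere.
    n_drop = min(int(drop_fraction * len(active)), len(inactive))

    new_weights, new_mask = list(weights), list(mask)

    # Prune: deactivate the active connections with the smallest magnitude.
    weakest = sorted(active, key=lambda i: abs(weights[i]))[:n_drop]
    for i in weakest:
        new_mask[i] = False
        new_weights[i] = 0.0

    # Regrow: activate the same number of previously inactive connections,
    # chosen uniformly at random and initialised to zero.
    for i in rng.sample(inactive, n_drop):
        new_mask[i] = True
        new_weights[i] = 0.0

    return new_weights, new_mask


# Usage: three active connections out of six; one is swapped per update.
rng = random.Random(0)
weights = [0.9, -0.05, 0.4, 0.0, 0.0, 0.0]
mask = [True, True, True, False, False, False]
new_weights, new_mask = prune_and_regrow(weights, mask, drop_fraction=0.34, rng=rng)
```

Repeating such a step during training lets connections attached to uninformative inputs be pruned over time while the overall sparsity level stays fixed, which is the intuition behind focusing a reward model on task-relevant features.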
Related papers
- Evaluating Feature Dependent Noise In Preference-based Reinforcement Learning (2026)
- RA-PbRL: Provably Efficient Risk-aware Preference-based Reinforcement Learning (2024)
- Robust Deep Reinforcement Learning With Adaptive Adversarial Perturbations In Action Space (2024)
- Data Driven Reward Initialization For Preference Based Reinforcement Learning (2023)
- Distributionally Robust Self Paced Curriculum Reinforcement Learning (2025)
- Reinforcement Learning From Diverse Human Preferences (2023)
- Preference-based Multi-agent Reinforcement Learning: Data Coverage And Algorithmic Techniques (2024)
- Hindsight Priors For Reward Learning From Human Preferences (2024)