Leveraging Human Feedback For Semantically-relevant Skill Discovery
2026 · Maxence Hussonnois, Thommen George Karimpanal, Santu Rana
Abstract
arXiv:2604.24127v1
Unsupervised skill discovery in reinforcement learning aims to intrinsically motivate agents to discover diverse and useful behaviours. However, unconstrained approaches can produce unsafe, unethical, or misaligned behaviours. To mitigate these risks and improve the practical desirability of discovered skills, recent work grounds the discovery process in human preference feedback. Yet preference-based approaches are feedback-inefficient and inherently ill-equipped to deal with skill spaces that contain many distinct skills, such as running, jumping, and walking. To overcome this limitation, we introduce semantic labelling, a novel and feedback-efficient approach that leverages human cognitive strengths to identify and label semantically meaningful behaviours. Based on semantic labelling, we propose Semantically Relevant Skill Discovery (SRSD), a novel human-in-the-loop approach that collects semantic labels fro
Related papers
- Diversity Is All You Need: Learning Skills Without A Reward Function (2018)
- Hierarchical Cooperative Multi-agent Reinforcement Learning With Skill Discovery (2019)
- Learning More Skills Through Optimistic Exploration (2021)
- Data-dependent Exploration For Online Reinforcement Learning From Human Feedback (2026)
- Disentangled Unsupervised Skill Discovery For Efficient Hierarchical Reinforcement Learning (2024)
- SkiLD: Unsupervised Skill Discovery Guided By Factor Interactions (2024)
- Mapping Out The Space Of Human Feedback For Reinforcement Learning: A Conceptual Framework (2024)
- SkillS: Adaptive Skill Sequencing For Efficient Temporally-extended Exploration (2022)