Skill-aware Mutual Information Optimisation For Generalisation In Reinforcement Learning
2024 · Xuehui Yu, Mhairi Dunion, Xin Li, et al.
Abstract
Meta-Reinforcement Learning (Meta-RL) agents can struggle to operate across tasks with varying environmental features that require different optimal skills (i.e., different modes of behaviour). Using context encoders based on contrastive learning to enhance the generalisability of Meta-RL agents is now widely studied but faces challenges such as the requirement for a large sample size, also referred to as the \(\log\)-\(K\) curse. To improve RL generalisation to different tasks, we first introduce Skill-aware Mutual Information (SaMI), an optimisation objective that aids in distinguishing context embeddings according to skills, thereby equipping RL agents with the ability to identify and execute different skills across tasks. We then propose Skill-aware Noise Contrastive Estimation (SaNCE), a \(K\)-sample estimator used to optimise the SaMI objective. We provide a framework for equipping an RL agent with SaNCE in practice and conduct experimental validation on modified MuJoCo and Panda-
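The \(\log\)-\(K\) curse mentioned in the abstract can be illustrated with the standard InfoNCE estimator, on which contrastive context encoders are commonly built. The sketch below is not SaNCE and is not taken from the paper; the function names `info_nce` and `sharp_critic` and the critic scores are hypothetical, chosen only to show that a \(K\)-sample InfoNCE estimate cannot exceed \(\log K\), however well the critic separates positives from negatives.

```python
import math


def info_nce(scores):
    """Standard InfoNCE lower bound on mutual information, in nats.

    scores[i][j] is a critic value for pairing sample i with candidate j.
    The diagonal entries are the K positive pairs; every other entry in a
    row is treated as a negative. (Illustrative helper, not the paper's SaNCE.)
    """
    k = len(scores)
    total = 0.0
    for i, row in enumerate(scores):
        # Stable log-sum-exp over the K candidates in this row.
        m = max(row)
        log_norm = m + math.log(sum(math.exp(s - m) for s in row))
        # Log-probability assigned to the positive candidate (always <= 0).
        total += row[i] - log_norm
    # Mean positive log-probability plus log K, so the result is <= log K.
    return total / k + math.log(k)


def sharp_critic(k, gap=50.0):
    """Hypothetical near-perfect critic: positives score `gap`, negatives 0."""
    return [[gap if i == j else 0.0 for j in range(k)] for i in range(k)]


# Even with a near-perfect critic, the estimate saturates at log K.
for k in (2, 8, 64):
    print(k, round(info_nce(sharp_critic(k)), 4), round(math.log(k), 4))
```

Because each positive log-probability is at most zero, the estimate is capped at \(\log K\), so measuring a large mutual information with this estimator requires a correspondingly large \(K\). This is the sample-size requirement the abstract refers to.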
Related papers
- Self-improving Skill Learning For Robust Skill-based Meta-reinforcement Learning (2025) 0.00
- Can A MISL Fly? Analysis And Ingredients For Mutual Information Skill Learning (2024) 0.00
- Improving Generalization In Meta Reinforcement Learning Using Learned Objectives (2019) 0.00
- Decoupling Meta-reinforcement Learning With Gaussian Task Contexts And Skills (2023) 2.26
- Context Meta-reinforcement Learning Via Neuromodulation (2021) 6.34
- Leveraging Approximate Symbolic Models For Reinforcement Learning Via Skill Diversity (2022) 0.00
- Skills: Adaptive Skill Sequencing For Efficient Temporally-extended Exploration (2022) 0.00
- Decoupling Exploration And Exploitation For Meta-reinforcement Learning Without Sacrifices (2020) 0.00