Learn Dynamic-aware State Embedding For Transfer Learning
2021 · Kaige Yang
Abstract
Transfer reinforcement learning aims to improve the sample efficiency of solving unseen tasks by leveraging experience obtained from previous tasks. We consider the setting where all tasks (MDPs) share the same environment dynamics and differ only in the reward function. In this setting, the MDP dynamics are useful knowledge to transfer, and they can be inferred by running a uniformly random policy. However, trajectories generated by a uniformly random policy are not useful for policy improvement, which severely impairs sample efficiency. Instead, we observe that the binary MDP dynamics can be inferred from the trajectories of any policy, which avoids the need for a uniformly random policy. Because the binary MDP dynamics capture the state structure shared across all tasks, we believe they are suitable to transfer. Building on this observation, we introduce a method that infers the binary MDP dynamics online and simultaneously uses them to guide state embedding learning; the learned embedding is then transferred to new tasks. We keep state embedding le
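The abstract does not define the binary MDP dynamics precisely. The sketch below assumes they mean a 0/1 indicator of whether a transition from state s to state s' has been observed, over a small tabular state space. Under that assumption it shows how such an indicator could be accumulated online from the trajectories of any policy. The function name, the integer state encoding, and the toy trajectories are hypothetical and are not taken from the paper.

```python
import numpy as np


def update_binary_dynamics(adjacency, trajectory):
    """Mark every observed transition (s, s_next) as possible.

    Assumption: states are integer indices and `trajectory` is the
    sequence of states visited in one episode, under any policy.
    """
    for s, s_next in zip(trajectory[:-1], trajectory[1:]):
        adjacency[s, s_next] = 1
    return adjacency


# Toy example with 4 states (hypothetical data).
n_states = 4
adjacency = np.zeros((n_states, n_states), dtype=np.int8)

# Trajectories may come from any behaviour policy, not only a uniform one.
trajectories = [[0, 1, 2], [0, 1, 3], [2, 3, 3]]
for traj in trajectories:
    update_binary_dynamics(adjacency, traj)

print(adjacency)
```

The matrix only records which transitions have been seen, not how often, so it does not depend on the action probabilities of the policy that generated the data. Coverage still depends on which states that policy actually visits.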