An Advantage Based Policy Transfer Algorithm For Reinforcement Learning With Measures Of Transferability
2023 · Md Ferdous Alam, Parinaz Naghizadeh, David Hoelzle
Abstract
Reinforcement learning (RL) enables sequential decision-making in complex, high-dimensional environments through interaction with the environment. In most real-world applications, however, a large number of interactions is infeasible. In these settings, transfer RL algorithms, which transfer knowledge from one or more source environments to a target environment, have been shown to increase learning speed and improve initial and asymptotic performance. However, most existing transfer RL algorithms are on-policy and sample-inefficient, fail in adversarial target tasks, and often require heuristic choices in algorithm design. This paper proposes an off-policy Advantage-based Policy Transfer algorithm, APT-RL, for fixed domain environments. Its novelty is in using the popular notion of "advantage" as a regularizer to weigh the knowledge that should be transferred from the source against new knowledge learned in the target, removing the need for heu
Related papers
- Single Episode Policy Transfer In Reinforcement Learning (2019)
- Post-convergence Sim-to-real Policy Transfer: A Principled Alternative To Cherry-picking (2025)
- IOB: Integrating Optimization Transfer And Behavior Transfer For Multi-policy Reuse (2023)
- MULTIPOLAR: Multi-source Policy Aggregation For Transfer Reinforcement Learning Between Diverse Environmental Dynamics (2019)
- AdaRL: What, Where, And How To Adapt In Transfer Reinforcement Learning (2021)
- Can RLHF Be More Efficient With Imperfect Reward Models? A Policy Coverage Perspective (2025)
- Diversity For Contingency: Learning Diverse Behaviors For Efficient Adaptation And Transfer (2023)
- Toward Robust Long Range Policy Transfer (2021)