Composite Flow Matching For Reinforcement Learning With Shifted-dynamics Data
2025 · Lingkai Kong, Haichuan Wang, Tonghan Wang, et al.
Abstract
Incorporating pre-collected offline data can substantially improve the sample efficiency of reinforcement learning (RL), but its benefits can break down when the transition dynamics in the offline dataset differ from those encountered online. Existing approaches typically mitigate this issue by penalizing or filtering offline transitions in regions with a large dynamics gap. However, their dynamics-gap estimators often rely on KL divergence or mutual information, which can be ill-defined when the offline and online dynamics have mismatched support. To address this challenge, we propose CompFlow, a principled framework built on the theoretical connection between flow matching and optimal transport. Specifically, we model the online dynamics as a conditional flow built upon the output distribution of a pretrained offline flow, rather than learning it directly from a Gaussian prior. This composite structure provides two advantages: (1) improved generalization when learning online dynamics under
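The abstract's point about mismatched support can be illustrated with a small sketch. It is not the paper's estimator: the two next-state distributions, the grid, and the helper names `kl_divergence` and `wasserstein_1d` are hypothetical, and Wasserstein-1 is used here only as a familiar example of an optimal-transport distance. When the online distribution places mass where the offline one has none, KL divergence is infinite, while the transport distance stays finite and reflects how far the mass moved.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions on a shared grid.

    Returns infinity when p has mass on a cell where q has none.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # cells with no p-mass contribute nothing
        if qi == 0.0:
            return math.inf  # support mismatch: KL is unbounded
        total += pi * math.log(pi / qi)
    return total

def wasserstein_1d(p, q, spacing=1.0):
    """Wasserstein-1 distance on an evenly spaced 1-D grid.

    Computed as the sum of absolute differences between the two CDFs,
    scaled by the grid spacing.
    """
    cdf_gap = 0.0
    total = 0.0
    for pi, qi in zip(p, q):
        cdf_gap += pi - qi
        total += abs(cdf_gap) * spacing
    return total

# Hypothetical next-state distributions over five grid cells (illustrative only).
offline = [0.5, 0.5, 0.0, 0.0, 0.0]
online = [0.0, 0.0, 0.0, 0.5, 0.5]

kl = kl_divergence(online, offline)   # infinite: the supports are disjoint
w1 = wasserstein_1d(online, offline)  # finite: mass moved three cells
print(kl, w1)
```

In this toy case the KL-based gap gives no usable signal for penalizing or filtering transitions, whereas the transport-based gap still grades how different the two dynamics are.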
Related papers
- Controllable Flow Matching For Online Reinforcement Learning (2025)
- Reverse Flow Matching: A Unified Framework For Online Reinforcement Learning With Diffusion And Flow Policies (2026)
- Evolving Diffusion And Flow Matching Policies For Online Reinforcement Learning (2025)
- FM-IRL: Flow-matching For Reward Modeling And Policy Regularization In Reinforcement Learning (2025)
- Hybrid Transfer Reinforcement Learning: Provable Sample Efficiency From Shifted-dynamics Data (2024)
- Flow To Control: Offline Reinforcement Learning With Lossless Primitive Discovery (2022)
- Guided Flow Policy: Learning From High-value Actions In Offline Reinforcement Learning (2025)
- Bridging Distributionally Robust Learning And Offline RL: An Approach To Mitigate Distribution Shift And Partial Data Coverage (2023)