Goal-driven Reward By Video Diffusion Models For Reinforcement Learning
2025 · Qi Wang, Mian Wu, Yuyang Zhang, et al.
Abstract
Reinforcement Learning (RL) has achieved remarkable success in various domains, yet it often relies on carefully designed programmatic reward functions to guide agent behavior. Designing such reward functions can be challenging, and the resulting rewards may not generalize well across different tasks. To address this limitation, we leverage the rich world knowledge contained in pretrained video diffusion models to provide goal-driven reward signals for RL agents without ad-hoc reward design. Our key idea is to exploit off-the-shelf video diffusion models pretrained on large-scale video datasets as informative reward functions in terms of video-level and frame-level goals. For video-level rewards, we first finetune a pretrained video diffusion model on domain-specific datasets and then employ its video encoder to evaluate the alignment between the latent representations of the agent's trajectories and the generated goal videos. To enable more fine-grained goal-achievement, we derive a frame-level goal by ident
Related papers
- Viva: Video-trained Value Functions For Guiding Online RL From Diverse Data (2025)
- Reward Design For Reinforcement Learning Agents (2025)
- Learning To Reach Goals Via Diffusion (2023)
- Reward-directed Score-based Diffusion Models Via Q-learning (2024)
- Scalable Agent Alignment Via Reward Modeling: A Research Direction (2018)
- Diffusion Models For Reinforcement Learning: A Survey (2023)
- Reward Models In Deep Reinforcement Learning: A Survey (2025)
- Dense And Diverse Goal Coverage In Multi Goal Reinforcement Learning (2025)