A Snapshot Of Influence: A Local Data Attribution Framework For Online Reinforcement Learning
2025 · Yuzheng Hu, Fan Wu, Haotian Ye, et al.
Abstract
Online reinforcement learning (RL) excels in complex, safety-critical domains but suffers from sample inefficiency, training instability, and limited interpretability. Data attribution provides a principled way to trace model behavior back to training samples, yet existing methods assume fixed datasets, which is violated in online RL where each experience both updates the policy and shapes future data collection. In this paper, we initiate the study of data attribution for online RL, focusing on the widely used Proximal Policy Optimization (PPO) algorithm. We start by establishing a *local* attribution framework, interpreting model checkpoints with respect to the records in the recent training buffer. We design two target functions, capturing agent action and cumulative return respectively, and measure each record's contribution through gradient similarity between its training loss and these targets. We demonstrate the power of this framework through three concrete applications: diagno
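The gradient-similarity idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes a toy linear softmax policy, a plain policy-gradient surrogate loss (`-advantage * log pi(a|s)`) standing in for PPO's clipped objective, and an action target defined as the log-probability of a query action. The sign convention (a positive score means a gradient step on the record would raise the target) and all function names are likewise assumptions made for illustration.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def grad_log_prob(theta, state, action):
    """Gradient of log pi(action | state) for a linear softmax policy.

    theta holds one weight vector per action; logits[a] = theta[a] . state.
    """
    logits = [sum(w * x for w, x in zip(row, state)) for row in theta]
    probs = softmax(logits)
    # d log pi(a|s) / d theta[b] = (1[a == b] - pi(b|s)) * state
    return [[((1.0 if b == action else 0.0) - probs[b]) * x for x in state]
            for b in range(len(theta))]

def dot(g1, g2):
    """Inner product of two gradients stored as nested lists."""
    return sum(u * v for r1, r2 in zip(g1, g2) for u, v in zip(r1, r2))

def record_loss_grad(theta, record):
    """Gradient of the assumed surrogate loss -advantage * log pi(a|s)."""
    state, action, advantage = record
    g = grad_log_prob(theta, state, action)
    return [[-advantage * v for v in row] for row in g]

def influence_scores(theta, buffer, query_state, query_action):
    """Score each buffer record by -<grad target, grad loss>.

    A positive score means a gradient-descent step on that record's loss
    would, to first order, increase the target (assumed convention).
    """
    target_grad = grad_log_prob(theta, query_state, query_action)
    return [-dot(target_grad, record_loss_grad(theta, rec)) for rec in buffer]

# Hypothetical buffer of (state, action, advantage) records.
theta = [[0.0, 0.0], [0.0, 0.0]]
buffer = [
    ([1.0, 0.0], 0, 1.0),   # matches the query, positive advantage
    ([1.0, 0.0], 0, -1.0),  # matches the query, negative advantage
    ([0.0, 1.0], 1, 1.0),   # state orthogonal to the query state
]
scores = influence_scores(theta, buffer, [1.0, 0.0], 0)
```

In this toy setup the first record receives a positive score, the second a negative one, and the third a score of zero because its state shares no features with the query state.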
Related papers
- Active Advantage-aligned Online Reinforcement Learning With Offline Data (2025)
- AWAC: Accelerating Online Reinforcement Learning With Offline Datasets (2020)
- Provable Domain Adaptation For Offline Reinforcement Learning With Limited Samples (2024)
- Pessimism In The Face Of Confounders: Provably Efficient Offline Reinforcement Learning In Partially Observable Markov Decision Processes (2022)
- Interpretable Performance Analysis Towards Offline Reinforcement Learning: A Dataset Perspective (2021)
- Online Reinforcement Learning In Non-stationary Context-driven Environments (2023)
- Don't Change The Algorithm, Change The Data: Exploratory Data For Offline Reinforcement Learning (2022)
- Prioritized Trajectory Replay: A Replay Memory For Data-driven Reinforcement Learning (2023)