Replicability In Reinforcement Learning
2023 · Amin Karbasi, Grigoris Velegkas, Lin F. Yang, et al.
Abstract
We initiate the mathematical study of replicability as an algorithmic property in the context of reinforcement learning (RL). We focus on the fundamental setting of discounted tabular MDPs with access to a generative model. Inspired by Impagliazzo et al. [2022], we say that an RL algorithm is replicable if, with high probability, it outputs the exact same policy after two executions on i.i.d. samples drawn from the generator when its internal randomness is the same. We first provide an efficient \(\rho\)-replicable algorithm for \((\epsilon, \delta)\)-optimal policy estimation with sample and time complexity \(\widetilde O\left(\frac{N^3\cdot\log(1/\delta)}{(1-\gamma)^5\cdot\epsilon^2\cdot\rho^2}\right)\), where \(N\) is the number of state-action pairs. Next, for the subclass of deterministic algorithms, we provide a lower bound of order \(\Omega\left(\frac{N^3}{(1-\gamma)^3\cdot\epsilon^2\cdot\rho^2}\right)\). Then, we study a relaxed version of replicability proposed by Kalavasis
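The replicability definition in the abstract (same internal randomness, fresh i.i.d. samples, identical output with high probability) can be illustrated on a much simpler task than policy estimation. The sketch below is not the paper's algorithm: it estimates a scalar mean and snaps it to a randomly shifted grid, in the spirit of the randomized-rounding idea from the replicability literature. All names (`draw_samples`, `replicable_mean`, `agreement_rate`, `shared_seed`, `grid_width`) and the Gaussian data source are assumptions made for illustration.

```python
import math
import random


def draw_samples(n, data_seed, true_mean=0.5):
    """Draw n i.i.d. Gaussian samples; a stand-in for queries to a generator."""
    rng = random.Random(data_seed)
    return [rng.gauss(true_mean, 1.0) for _ in range(n)]


def replicable_mean(samples, shared_seed, grid_width):
    """Round the empirical mean to the midpoint of a randomly shifted grid cell.

    The grid offset depends only on shared_seed (the internal randomness),
    so two runs with the same seed use the same grid even though their
    samples differ.
    """
    rng = random.Random(shared_seed)
    offset = rng.uniform(0.0, grid_width)
    mean = sum(samples) / len(samples)
    cell = math.floor((mean - offset) / grid_width)
    return offset + (cell + 0.5) * grid_width


def agreement_rate(trials=100, n=2000, grid_width=0.5):
    """Fraction of trials in which two runs on independent samples agree exactly."""
    agree = 0
    for t in range(trials):
        # Same shared_seed for both runs, different data seeds for the samples.
        a = replicable_mean(draw_samples(n, 1000 + 2 * t), t, grid_width)
        b = replicable_mean(draw_samples(n, 1001 + 2 * t), t, grid_width)
        agree += (a == b)
    return agree / trials


if __name__ == "__main__":
    print("exact agreement rate:", agreement_rate())
```

In this toy setting, the two runs disagree only when a grid boundary falls between their empirical means. Shrinking `grid_width` improves accuracy but makes such disagreements more likely unless `n` grows, which gives a rough intuition for why replicability can cost additional samples.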
Related papers
- From Generative To Episodic: Sample-efficient Replicable Reinforcement Learning (2025)
- List Replicable Reinforcement Learning (2025)
- Replicable Reinforcement Learning (2023)
- Replicable Reinforcement Learning With Linear Function Approximation (2025)
- Beyond Expected Return: Accounting For Policy Reproducibility When Evaluating Reinforcement Learning Algorithms (2023)
- Deterministic Implementations For Reproducibility In Deep Reinforcement Learning (2018)
- A Few Expert Queries Suffices For Sample-efficient RL With Resets And Linear Value Approximation (2022)
- Offline Reinforcement Learning Under Value And Density-ratio Realizability: The Power Of Gaps (2022)