A Kernel Perspective On Behavioural Metrics For Markov Decision Processes
2023 · Pablo Samuel Castro, Tyler Kastner, Prakash Panangaden, et al.
Abstract
Behavioural metrics have been shown to be an effective mechanism for constructing representations in reinforcement learning. We present a novel perspective on behavioural metrics for Markov decision processes via the use of positive definite kernels. We leverage this perspective to define a new metric that is provably equivalent to the recently introduced MICo distance (Castro et al., 2021). The kernel perspective further enables us to provide new theoretical results, which have so far eluded prior work. These include bounding value function differences by means of our metric, and demonstrating that our metric can be provably embedded into a finite-dimensional Euclidean space with low distortion error. These are two crucial properties when using behavioural metrics for reinforcement learning representations. We complement our theory with strong empirical results that demonstrate the effectiveness of these methods in practice.
Related papers
- A Kernel-based Approach To Non-stationary Reinforcement Learning In Metric Spaces (2020)
- MICo: Improved Representations Via Sampling-based State Similarity For Markov Decision Processes (2021)
- Metrics And Continuity In Reinforcement Learning (2021)
- Understanding Behavioral Metric Learning: A Large-scale Study On Distracting Reinforcement Learning Environments (2025)
- Kernel Metric Learning For In-sample Off-policy Evaluation Of Deterministic RL Policies (2024)
- Local Metric Learning For Off-policy Evaluation In Contextual Bandits With Continuous Actions (2022)
- Optimal Policy Evaluation Using Kernel-based Temporal Difference Methods (2021)
- Learning Two-player Mixture Markov Games: Kernel Function Approximation And Correlated Equilibrium (2022)