Learning The Minimum Action Distance
2025 · Lorenzo Steccanella, Joshua B. Evans, Özgür Şimşek, et al.
Abstract
This paper presents a state representation framework for Markov decision processes (MDPs) that can be learned solely from state trajectories, requiring neither reward signals nor the actions executed by the agent. We propose learning the minimum action distance (MAD), defined as the minimum number of actions required to transition between states, as a fundamental metric that captures the underlying structure of an environment. MAD naturally enables critical downstream tasks such as goal-conditioned reinforcement learning and reward shaping by providing a dense, geometrically meaningful measure of progress. Our self-supervised learning approach constructs an embedding space where the distances between embedded state pairs correspond to their MAD, accommodating both symmetric and asymmetric approximations. We evaluate the framework on a comprehensive suite of environments with known MAD values, encompassing both deterministic and stochastic dynamics, as well as discrete and continuous st
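To make the definition of MAD concrete, the following is a minimal sketch that computes exact minimum action distances by breadth-first search. It assumes a small deterministic environment whose transition graph is already known, which is not the setting the paper targets (there, MAD is learned from state trajectories alone). The function name `minimum_action_distance` and the toy graph `toy` are hypothetical and used only for illustration.

```python
from collections import deque


def minimum_action_distance(transitions, start):
    """Return {state: minimum number of actions needed to reach it from start}.

    `transitions` maps each state to the states reachable in one action.
    This is plain breadth-first search over a known graph (an assumption
    made for illustration, not the paper's learning method).
    """
    dist = {start: 0}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for nxt in transitions.get(state, ()):
            if nxt not in dist:  # first visit is the shortest path in BFS
                dist[nxt] = dist[state] + 1
                queue.append(nxt)
    return dist


# Hypothetical toy environment: a one-way ring A -> B -> C -> A.
toy = {"A": ["B"], "B": ["C"], "C": ["A"]}

mad_from_a = minimum_action_distance(toy, "A")
mad_from_b = minimum_action_distance(toy, "B")

# MAD need not be symmetric: A reaches B in 1 action, B reaches A in 2.
print(mad_from_a["B"], mad_from_b["A"])  # 1 2
```

The one-way ring shows why an embedding may need an asymmetric distance: the number of actions from A to B differs from the number from B to A.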
Related papers
- Low-dimensional State And Action Representation Learning With MDP Homomorphism Metrics (2021)
- Learning Markov State Abstractions For Deep Reinforcement Learning (2021)
- Parameterized MDPs And Reinforcement Learning Problems -- A Maximum Entropy Principle Based Framework (2020)
- Learning Good State And Action Representations Via Tensor Decomposition (2021)
- Bayesian Learning Of The Optimal Action-value Function In A Markov Decision Process (2025)
- MICo: Improved Representations Via Sampling-based State Similarity For Markov Decision Processes (2021)
- Using Forwards-backwards Models To Approximate MDP Homomorphisms (2022)
- Learning Non-Markovian Reward Models In MDPs (2020)