Learning Robust Reward Machines From Noisy Labels
2024 · Roko Parac, Lorenzo Nodari, Leo Ardon, et al.
Abstract
This paper presents PROB-IRM, an approach that learns robust reward machines (RMs) for reinforcement learning (RL) agents from noisy execution traces. The key aspect of RM-driven RL is the exploitation of a finite-state machine that decomposes the agent's task into different subtasks. PROB-IRM uses a state-of-the-art inductive logic programming framework robust to noisy examples to learn RMs from noisy traces using the Bayesian posterior degree of beliefs, thus ensuring robustness against inconsistencies. Pivotal for the results is the interleaving between RM learning and policy learning: a new RM is learned whenever the RL agent generates a trace that is believed not to be accepted by the current RM. To speed up the training of the RL agent, PROB-IRM employs a probabilistic formulation of reward shaping that uses the posterior Bayesian beliefs derived from the traces. Our experimental analysis shows that PROB-IRM can learn (potentially imperfect) RMs from noisy traces and exploit them
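The abstract's two core mechanisms, tracking a belief over reward machine states when labels are noisy and shaping rewards with that belief, can be illustrated with a minimal sketch. This is not PROB-IRM's implementation: the class `RewardMachine`, the functions `update_belief` and `shaped_reward`, the coffee/office task, the potential values, and the assumption that each step supplies a probability distribution over candidate labels are all hypothetical choices made for illustration. The shaping term is standard potential-based shaping with the potential averaged over the belief, which may differ from the paper's formulation.

```python
from dataclasses import dataclass


@dataclass
class RewardMachine:
    """A finite-state machine over high-level labels (hypothetical structure)."""
    states: tuple
    initial: str
    accepting: str
    transitions: dict  # (state, label) -> next state; missing pairs self-loop

    def step(self, state, label):
        return self.transitions.get((state, label), state)


def update_belief(rm, belief, label_probs):
    """Push a distribution over RM states through one noisy observation.

    label_probs maps each candidate label to the probability that it is the
    true label at this step (assumed to sum to 1).
    """
    new_belief = {u: 0.0 for u in rm.states}
    for u, p_u in belief.items():
        for label, p_l in label_probs.items():
            new_belief[rm.step(u, label)] += p_u * p_l
    return new_belief


def shaped_reward(reward, belief, next_belief, potential, gamma=0.99):
    """Potential-based shaping using the belief-weighted potential."""
    phi = sum(p * potential[u] for u, p in belief.items())
    phi_next = sum(p * potential[u] for u, p in next_belief.items())
    return reward + gamma * phi_next - phi


# Illustrative task: observe "coffee", then "office".
rm = RewardMachine(
    states=("u0", "u1", "u_acc"),
    initial="u0",
    accepting="u_acc",
    transitions={("u0", "coffee"): "u1", ("u1", "office"): "u_acc"},
)
potential = {"u0": 0.0, "u1": 0.5, "u_acc": 1.0}  # assumed values

b0 = {"u0": 1.0, "u1": 0.0, "u_acc": 0.0}
b1 = update_belief(rm, b0, {"coffee": 0.9, "none": 0.1})
b2 = update_belief(rm, b1, {"office": 0.8, "none": 0.2})
bonus = shaped_reward(0.0, b0, b1, potential, gamma=1.0)
```

In this sketch a sensor that is 90% sure it saw "coffee" moves 90% of the belief mass to the second RM state rather than committing to a hard transition, and the shaping bonus scales accordingly.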
Related papers
- Learning Reward Machines: A Study In Partially Observable Reinforcement Learning (2021) · 0.00
- Inferring Probabilistic Reward Machines From Non-Markovian Reward Processes For Reinforcement Learning (2021) · 0.00
- Non-Markovian Reward Modelling From Trajectory Labels Via Interpretable Multiple Instance Learning (2022) · 0.00
- Reinforcement Learning With Perturbed Rewards (2018) · 13.74
- Robust Reinforcement Learning Using Least Squares Policy Iteration With Provable Performance Guarantees (2020) · 0.00
- Sample-efficient Robust Multi-agent Reinforcement Learning In The Face Of Environmental Uncertainty (2024) · 0.00
- A Bayesian Approach To Robust Reinforcement Learning (2019) · 0.00
- Quantifying First-order Markov Violations In Noisy Reinforcement Learning: A Causal Discovery Approach (2025) · 0.00