Average Reward Reinforcement Learning For Omega-regular And Mean-payoff Objectives
2025 · Milad Kazemi, Mateo Perez, Fabio Somenzi, et al.
Abstract
Recent advances in reinforcement learning (RL) have renewed interest in reward design for shaping agent behavior, but manually crafting reward functions is tedious and error-prone. A principled alternative is to specify behavioral requirements in a formal, unambiguous language and automatically compile them into learning objectives. \(\omega\)-regular languages are a natural fit, given their role in formal verification and synthesis. However, most existing \(\omega\)-regular RL approaches operate in an episodic, discounted setting with periodic resets, which is misaligned with \(\omega\)-regular semantics over infinite traces. For continuing tasks, where the agent interacts with the environment over a single uninterrupted lifetime, the average-reward criterion is more appropriate. We focus on absolute liveness specifications, a subclass of \(\omega\)-regular languages that cannot be violated by any finite prefix and thus aligns naturally with continuing interaction. We present the fi
Related papers
- Reward Design For Reinforcement Learning Agents (2025)
- REBEL: Reward Regularization-based Approach For Robotic Reinforcement Learning From Human Feedback (2023)
- Adaptive Reward Design For Reinforcement Learning (2024)
- Examining Average And Discounted Reward Optimality Criteria In Reinforcement Learning (2021)
- Designing Rewards For Fast Learning (2022)
- Scalable Agent Alignment Via Reward Modeling: A Research Direction (2018)
- ORSO: Accelerating Reward Design Via Online Reward Selection And Policy Optimization (2024)
- Reward Models In Deep Reinforcement Learning: A Survey (2025)