Fairness In Reinforcement Learning
2016 · Shahin Jabbari, Matthew Joseph, Michael Kearns, et al.
Abstract
We initiate the study of fairness in reinforcement learning, where the actions of a learning algorithm may affect its environment and future rewards. Our fairness constraint requires that an algorithm never prefers one action over another if the long-term (discounted) reward of choosing the latter action is higher. Our first result is negative: despite the fact that fairness is consistent with the optimal policy, any learning algorithm satisfying fairness must take time exponential in the number of states to achieve a non-trivial approximation to the optimal policy. We then provide a provably fair polynomial-time algorithm under an approximate notion of fairness, thus establishing an exponential gap between exact and approximate fairness.
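The fairness constraint stated in the abstract can be illustrated with a small check on a single state. The following is a minimal sketch, not the paper's algorithm: it assumes the long-term discounted values of each action are already known, and the names `is_fair`, `q_values`, `probs`, and `alpha` are hypothetical. The `alpha` tolerance is one possible way to relax the constraint, included only for illustration; the abstract does not spell out its approximate notion.

```python
from typing import Sequence


def is_fair(q_values: Sequence[float], probs: Sequence[float], alpha: float = 0.0) -> bool:
    """Check a single-state fairness condition (illustrative sketch).

    q_values[a] is the assumed long-term discounted reward of action a.
    probs[a] is the probability with which the policy chooses action a.
    The policy is flagged as unfair if it picks some action b with higher
    probability than an action a whose value exceeds b's by more than alpha.
    With alpha = 0 this is the exact constraint described in the abstract.
    """
    n = len(q_values)
    for a in range(n):
        for b in range(n):
            a_is_better = q_values[a] > q_values[b] + alpha
            b_is_preferred = probs[a] < probs[b]
            if a_is_better and b_is_preferred:
                return False
    return True


q = [1.0, 0.5, 0.2]
print(is_fair(q, [0.5, 0.3, 0.2]))        # probabilities ordered like values: True
print(is_fair(q, [1 / 3, 1 / 3, 1 / 3]))  # uniform random choice: True
print(is_fair(q, [0.2, 0.5, 0.3]))        # prefers a worse action: False
```

Uniform random play always passes the check, which suggests why learning under exact fairness can be slow: until the learner can tell which actions are better, it has little room to favor any of them.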
Related papers
- What Hides Behind Unfairness? Exploring Dynamics Fairness In Reinforcement Learning (2024)
- Achieving Fairness In Multi-agent Markov Decision Processes Using Reinforcement Learning (2023)
- Learning Fair Policies In Multiobjective (Deep) Reinforcement Learning With Average And Discounted Rewards (2020)
- Striking A Balance In Fairness For Dynamic Systems Through Reinforcement Learning (2024)
- Socially Fair Reinforcement Learning (2022)
- Past-discounting Is Key For Learning Markovian Fairness With Long Horizons (2025)
- Counterfactually Fair Reinforcement Learning Via Sequential Data Preprocessing (2025)
- [Re] Fairdice: A Gap Between Theory And Practice (2026)