Utilizing Explainability Techniques For Reinforcement Learning Model Assurance
2023 · Alexander Tapley, Kyle Gatesman, Luis Robaina, et al.
Abstract
Explainable Reinforcement Learning (XRL) can provide transparency into the decision-making process of a Deep Reinforcement Learning (DRL) model and increase user trust and adoption in real-world use cases. By utilizing XRL techniques, researchers can identify potential vulnerabilities within a trained DRL model prior to deployment, thereby limiting the potential for mission failure or mistakes by the system. This paper introduces the ARLIN (Assured RL Model Interrogation) Toolkit, an open-source Python library that identifies potential vulnerabilities and critical points within trained DRL models through detailed, human-interpretable explainability outputs. To illustrate ARLIN's effectiveness, we provide explainability visualizations and vulnerability analysis for a publicly available DRL model. The open-source code repository is available for download at https://github.com/mitre/arlin.
Code
- mitre/arlin
Related papers
- Explainability In Deep Reinforcement Learning (2020)
- Explainability In Deep Reinforcement Learning, A Review Into Current Methods And Applications (2022)
- A Survey On Explainable Reinforcement Learning: Concepts, Algorithms, Challenges (2022)
- Explainable Artificial Intelligence (XAI) For Increasing User Trust In Deep Reinforcement Learning Driven Autonomous Systems (2021)
- Domain-level Explainability -- A Challenge For Creating Trust In Superhuman AI Strategies (2020)
- XRL-Bench: A Benchmark For Evaluating And Comparing Explainable Reinforcement Learning Techniques (2024)
- A Survey Of Explainable Reinforcement Learning: Targets, Methods And Needs (2025)
- Explainable Reinforcement Learning: A Survey (2020)