Quantifying Statistical Significance Of Deep Nearest Neighbor Anomaly Detection Via Selective Inference

Abstract

In real-world applications, anomaly detection (AD) often operates without access to anomalous data, necessitating semi-supervised methods that rely solely on normal data. Among these methods, deep k-nearest neighbor (deep kNN) AD stands out for its interpretability and flexibility, leveraging distance-based scoring in deep latent spaces. Despite its strong performance, deep kNN lacks a mechanism to quantify uncertainty, an essential feature for critical applications such as industrial inspection. To address this limitation, we propose a statistical framework that quantifies the significance of detected anomalies in the form of p-values, thereby enabling control over false positive rates at a user-specified significance level (e.g., 0.05). A central challenge lies in managing selection bias, which we tackle using Selective Inference, a principled method for conducting inference conditioned on data-driven selections. We evaluate our method on diverse datasets and demonstrate that it provides
