Membership Inference Attack Against Music Diffusion Models via Generative Manifold Perturbation

Yuxuan Liu·Peihong Zhang·Rui Sang·Zhixin Li·Yizhou Tan·Yiqiang Cai·Shengchen Li·2026

arXiv:2602.01645

Abstract

Membership inference attacks (MIAs) test whether a specific audio clip was used to train a model, making them a key tool for auditing generative music models for copyright compliance. However, loss-based signals (e.g., reconstruction error) are weakly aligned with human perception in practice, yielding poor separability at the low false-positive rates (FPRs) required for forensics. We propose the Latent Stability Adversarial Probe (LSA-Probe), a white-box method that measures a geometric property of the reverse diffusion: the minimal time-normalized perturbation budget needed to cross a fixed perceptual degradation threshold at an intermediate diffusion state. We show that training members, residing in more stable regions, exhibit a significantly higher degradation cost.
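The two quantities named in the abstract, a minimal perturbation budget and separability at low FPR, can be illustrated with a toy sketch. This is not the authors' implementation: `degradation` is a hypothetical stand-in for the perceptual degradation measured after perturbing an intermediate diffusion state, it is assumed to be non-decreasing in the budget, and the time normalization and adversarial optimization described in the abstract are not modeled. The function names `minimal_budget` and `tpr_at_fpr` are illustrative only.

```python
from typing import Callable, Sequence


def minimal_budget(degradation: Callable[[float], float],
                   threshold: float,
                   max_budget: float = 1.0,
                   iters: int = 40) -> float:
    """Smallest budget in [0, max_budget] whose degradation reaches threshold.

    Assumes `degradation` is non-decreasing in the budget (an assumption of
    this sketch). Returns max_budget if the threshold is never reached.
    """
    if degradation(max_budget) < threshold:
        return max_budget
    lo, hi = 0.0, max_budget
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if degradation(mid) >= threshold:
            hi = mid  # threshold crossed: try a smaller budget
        else:
            lo = mid  # not crossed yet: need a larger budget
    return hi


def tpr_at_fpr(member_scores: Sequence[float],
               nonmember_scores: Sequence[float],
               target_fpr: float) -> float:
    """True-positive rate at a threshold whose FPR does not exceed target_fpr.

    Higher score means "more likely a member"; here the score is the budget.
    """
    neg = sorted(nonmember_scores, reverse=True)
    k = int(target_fpr * len(neg))  # non-members allowed above the threshold
    thr = neg[k] if k < len(neg) else float("-inf")
    return sum(s > thr for s in member_scores) / len(member_scores)


# Toy clips: a "member" degrades slowly with budget, a "non-member" quickly.
member_cost = minimal_budget(lambda eps: 1.0 * eps, threshold=0.5)
nonmember_cost = minimal_budget(lambda eps: 4.0 * eps, threshold=0.5)

# Toy score lists (fabricated purely for illustration, not results).
toy_members = [0.5, 0.6, 0.4, 0.1]
toy_nonmembers = [0.10, 0.12, 0.15, 0.11, 0.13, 0.14, 0.16, 0.09, 0.08, 0.45]
toy_tpr = tpr_at_fpr(toy_members, toy_nonmembers, target_fpr=0.1)
```

In this toy setup the slowly degrading clip needs a larger budget to cross the threshold, which mirrors the abstract's claim that members exhibit a higher degradation cost. Using the budget as the membership score then lets it be evaluated by the true-positive rate at a fixed low FPR.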
