A Method To Reveal Speaker Identity In Distributed ASR Training, And How To Counter It
2021 · Trung Dang, Om Thakkar, Swaroop Ramaswamy, et al.
Abstract
End-to-end Automatic Speech Recognition (ASR) models are commonly trained over spoken utterances using optimization methods like Stochastic Gradient Descent (SGD). In distributed settings like Federated Learning, model training requires transmission of gradients over a network. In this work, we design the first method for revealing the identity of the speaker of a training utterance with access only to a gradient. We propose Hessian-Free Gradients Matching, an input reconstruction technique that operates without second derivatives of the loss function (required in prior works), which can be expensive to compute. We show the effectiveness of our method using the DeepSpeech model architecture, demonstrating that it is possible to reveal the speaker's identity with 34% top-1 accuracy (51% top-5 accuracy) on the LibriSpeech dataset. Further, we study the effect of two well-known techniques, Differentially Private SGD and Dropout, on the success of our method. We show that a dropout rate of
Related papers
- On-device Speaker Anonymization Of Acoustic Embeddings For ASR Based On Flexible Location Gradient Reversal Layer (2023)
- Distributed Training Of Deep Neural Network Acoustic Models For Automatic Speech Recognition (2020)
- Enabling Differentially Private Federated Learning For Speech Recognition: Benchmarks, Adaptive Optimizers And Gradient Clipping (2023)
- Ghostvec: A New Threat To Speaker Privacy Of End-to-end Speech Recognition System (2023)
- Privacy Attacks For Automatic Speech Recognition Acoustic Models In A Federated Learning Framework (2021)
- To Reverse The Gradient Or Not: An Empirical Comparison Of Adversarial And Multi-task Learning In Speech Recognition (2018)
- Speaker Identity Preservation In Dysarthric Speech Reconstruction By Adversarial Speaker Adaptation (2022)
- Reprogramming Self-supervised Learning-based Speech Representations For Speaker Anonymization (2023)