Advancing The Dimensionality Reduction Of Speaker Embeddings For Speaker Diarisation: Disentangling Noise And Informing Speech Activity
2021 · You Jin Kim, Hee-Soo Heo, Jee-Weon Jung, et al.
Abstract
The objective of this work is to train noise-robust speaker embeddings adapted for speaker diarisation. Speaker embeddings play a crucial role in the performance of diarisation systems, but they often capture spurious information such as noise, which adversely affects performance. Our previous work proposed an auto-encoder-based dimensionality reduction module to help remove the redundant information. However, that module does not explicitly separate such information and has also been found to be sensitive to hyper-parameter values. To overcome these issues, we propose two contributions: (i) a novel dimensionality reduction framework that can disentangle spurious information from the speaker embeddings; (ii) the use of a speech activity vector to prevent the speaker code from representing the background noise. Through a range of experiments conducted on four datasets, our approach consistently demonstrates state-of-the-art performance among models without system fusion.
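To make the idea of separating a speaker code from spurious information more concrete, here is a minimal sketch of an auto-encoder whose bottleneck is split into two parts. It is not the authors' architecture: the class name `SplitCodeAutoEncoder`, the linear layers, the dimensions (16, 4, 2) and the random weights are all assumptions chosen for illustration. It shows only the forward pass; the training objective, the disentanglement mechanism and the speech activity vector are not modelled.

```python
import random


def make_matrix(rows, cols, rng):
    """Random weight matrix; a stand-in for trained parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def linear(weights, vector):
    """Matrix-vector product: one output value per weight row."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


class SplitCodeAutoEncoder:
    """Toy linear auto-encoder whose bottleneck is split into two codes.

    Hypothetical illustration only; all sizes are assumptions.
    """

    def __init__(self, embed_dim=16, speaker_dim=4, noise_dim=2, seed=0):
        rng = random.Random(seed)
        self.speaker_dim = speaker_dim
        code_dim = speaker_dim + noise_dim
        self.encoder = make_matrix(code_dim, embed_dim, rng)
        self.decoder = make_matrix(embed_dim, code_dim, rng)

    def encode(self, embedding):
        code = linear(self.encoder, embedding)
        # First slice: speaker code. Remaining slice: noise code.
        return code[:self.speaker_dim], code[self.speaker_dim:]

    def decode(self, speaker_code, noise_code):
        # The reconstruction uses both codes, so the noise code gives
        # non-speaker information somewhere to go.
        return linear(self.decoder, speaker_code + noise_code)


def reconstruction_loss(original, reconstructed):
    """Mean squared error between an embedding and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


rng = random.Random(1)
embedding = [rng.gauss(0.0, 1.0) for _ in range(16)]

model = SplitCodeAutoEncoder()
speaker_code, noise_code = model.encode(embedding)
reconstructed = model.decode(speaker_code, noise_code)
loss = reconstruction_loss(embedding, reconstructed)
```

In a setup like this, only `speaker_code` would be handed to the downstream clustering step, while `noise_code` would be used during training for reconstruction and then discarded.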
Related papers
- Leveraging Speaker Embeddings In End-to-end Neural Diarization For Two-speaker Scenarios (2024)
- Speaker Diarization Using Deep Recurrent Convolutional Neural Networks For Speaker Embeddings (2017)
- Intra-class Variation Reduction Of Speaker Representation In Disentanglement Framework (2020)
- Combination Of Deep Speaker Embeddings For Diarisation (2020)
- Improved Large-margin Softmax Loss For Speaker Diarisation (2019)
- Multi-scale Speaker Embedding-based Graph Attention Networks For Speaker Diarisation (2021)
- Look Who's Not Talking (2020)
- Exploring Speaker-related Information In Spoken Language Understanding For Better Speaker Diarization (2023)