Cross-domain Voice Activity Detection With Self-supervised Representations
2022 · Sina Alisamir, Fabien Ringeval, Francois Portet
Abstract
Voice Activity Detection (VAD) aims at detecting speech segments in an audio signal, which is a necessary first step for many of today's speech-based applications. Current state-of-the-art methods focus on training a neural network that exploits features taken directly from the acoustics, such as Mel Filter Banks (MFBs). Such methods therefore require an extra normalisation step to adapt to a new domain where the acoustics are affected, which can be due simply to a change of speaker, microphone, or environment. In addition, this normalisation step is usually a rather rudimentary method with certain limitations, such as being highly sensitive to the amount of data available for the new domain. Here, we exploited the crowd-sourced Common Voice (CV) corpus to show that representations based on Self-Supervised Learning (SSL) can adapt well to different domains, because they are computed with contextualised representations of speech across multiple domains. SSL representations also achieve
Related papers
- Adversarial Speaker Disentanglement Using Unannotated External Data For Self-supervised Representation Based Voice Conversion (2023) 6.34
- Automatic Data Augmentation For Domain Adapted Fine-tuning Of Self-supervised Speech Representations (2023) 0.00
- Noise-robust Target-speaker Voice Activity Detection Through Self-supervised Pretraining (2025) 0.00
- End-to-end Integration Of Speech Emotion Recognition With Voice Activity Detection Using Self-supervised Learning Features (2024) 0.00
- Boosting Cross-domain Speech Recognition With Self-supervision (2022) 0.00
- Self-supervised Learning Based Domain Adaptation For Robust Speaker Verification (2021) 11.49
- Multi-domain Adaptation By Self-supervised Learning For Speaker Verification (2023) 0.00
- Self-adaptive Soft Voice Activity Detection Using Deep Neural Networks For Robust Speaker Verification (2019) 6.77