STC Speaker Recognition Systems For The Voices From A Distance Challenge
2019 · Sergey Novoselov, Aleksei Gusev, Artem Ivanov, et al.
Abstract
This paper presents the Speech Technology Center (STC) speaker recognition (SR) systems submitted to the VOiCES From a Distance challenge 2019. The challenge's SR task focuses on speaker recognition in single-channel distant/far-field audio under noisy conditions. In this work we investigate different deep neural network architectures for speaker embedding extraction to solve the task. We show that deep networks with residual frame-level connections outperform shallower architectures. We investigate both a simple energy-based speech activity detector (SAD) and an automatic speech recognition (ASR) based SAD. We also address the problem of data preparation for training robust embedding extractors. Reverberation for data augmentation was generated with an automatic room impulse response generator. In our systems we used a discriminatively trained cosine similarity metric learning model as the embedding backend. A score normalization procedure was applied for eac
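To make the "simple energy-based SAD" mentioned in the abstract concrete, the following is a minimal sketch of one common form of such a detector. It is not the authors' implementation: the function names, the 400-sample frame and 160-sample hop (25 ms and 10 ms at an assumed 16 kHz sampling rate), and the 30 dB margin are all illustrative assumptions.

```python
import math


def frame_log_energies(samples, frame_len=400, hop=160):
    """Return the log energy (in dB) of each full frame of a mono signal."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        # A small floor avoids log10(0) on digital silence.
        energies.append(10.0 * math.log10(power + 1e-12))
    return energies


def energy_sad(samples, frame_len=400, hop=160, margin_db=30.0):
    """Flag a frame as speech if its energy is within margin_db of the loudest frame.

    margin_db is a hypothetical tuning value, not one taken from the paper.
    """
    energies = frame_log_energies(samples, frame_len, hop)
    if not energies:
        return []
    threshold = max(energies) - margin_db
    return [e > threshold for e in energies]
```

Calling `energy_sad(signal)` on a list of float samples returns one boolean per frame. A detector of this kind only looks at signal level, so loud noise can pass the threshold as easily as speech, which is a plausible reason to compare it against an ASR-based SAD under far-field noisy conditions.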
Related papers
- Deep Speaker Embeddings For Far-field Speaker Recognition On Short Utterances (2020)
- The HCCL Speaker Verification System For Far-field Speaker Verification Challenge (2021)
- The Voices From A Distance Challenge 2019 Evaluation Plan (2019)
- A Network Of Deep Neural Networks For Distant Speech Recognition (2017)
- Length- And Noise-aware Training Techniques For Short-utterance Speaker Recognition (2020)
- NPU Speaker Verification System For INTERSPEECH 2020 Far-field Speaker Verification Challenge (2020)
- UTD-CRSS Systems For 2016 NIST Speaker Recognition Evaluation (2016)
- The SVASR System For Text-dependent Speaker Verification (tdsv) AAIC Challenge 2024 (2024)