Parameterized Channel Normalization For Far-field Deep Speaker Verification
2021 · Xuechen Liu, Md Sahidullah, Tomi Kinnunen
Abstract
We address far-field speaker verification with a deep neural network (DNN) based speaker embedding extractor, where mismatch between enrollment and test data often comes from convolutive effects (e.g. room reverberation) and noise. To mitigate these effects, we focus on two parametric normalization methods: per-channel energy normalization (PCEN) and parameterized cepstral mean normalization (PCMN). Both methods contain differentiable parameters and can therefore be conveniently integrated into, and jointly optimized with, the DNN using automatic differentiation. We consider both fixed and trainable (data-driven) variants of each method. We evaluate performance on Hi-MIA, a recent large-scale far-field speech corpus with varied microphone and positional settings. Our methods outperform conventional mel filterbank features, with maximum relative improvements in equal error rate of 33.5% under matched-microphone conditions and 39.5% under mismatched-microphone conditions.
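To make the PCEN idea concrete, here is a minimal NumPy sketch of the commonly published PCEN formulation: a first-order smoother M tracks each channel's energy, the input is divided by a power of M (automatic gain control), and a root compression follows. It is an illustration only, not the authors' implementation. The function name `pcen`, the default values for `s`, `alpha`, `delta`, `r` and `eps`, and the choice to initialise the smoother with the first frame are all assumptions; the abstract does not give them. In the trainable variant these parameters would be learned jointly with the DNN rather than fixed as here.

```python
import numpy as np

def pcen(E, s=0.025, alpha=0.98, delta=2.0, r=0.5, eps=1e-6):
    """Per-channel energy normalization of a (frames, channels) array.

    E holds non-negative mel filterbank energies. All default values
    are illustrative assumptions, not values taken from the paper.
    """
    E = np.asarray(E, dtype=np.float64)
    M = np.empty_like(E)
    M[0] = E[0]  # assumption: start the smoother at the first frame
    for t in range(1, E.shape[0]):
        # First-order IIR smoothing along time, independently per channel.
        M[t] = (1.0 - s) * M[t - 1] + s * E[t]
    # Gain control (division by M**alpha), then root compression.
    return (E / (eps + M) ** alpha + delta) ** r - delta ** r
```

With `alpha` set to 1 and a negligible `eps`, the output is almost unchanged when the input is multiplied by a constant gain, which is the property that makes this kind of normalization attractive when the channel differs between enrollment and test.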
Related papers
- Optimized Power Normalized Cepstral Coefficients Towards Robust Deep Speaker Verification (2021)
- How To Leverage DNN-based Speech Enhancement For Multi-channel Speaker Verification? (2022)
- Single Channel Far Field Feature Enhancement For Speaker Verification In The Wild (2020)
- The HCCL Speaker Verification System For Far-field Speaker Verification Challenge (2021)
- Investigation Of Different Calibration Methods For Deep Speaker Embedding Based Verification Systems (2022)
- Feature Enhancement With Deep Feature Losses For Speaker Verification (2019)
- NPU Speaker Verification System For INTERSPEECH 2020 Far-field Speaker Verification Challenge (2020)
- Deep Speaker Embeddings For Far-field Speaker Recognition On Short Utterances (2020)