Dynamic Kernels And Channel Attention For Low Resource Speaker Verification
2022 · Anna Ollerenshaw, Md Asif Jalal, Thomas Hain
Abstract
State-of-the-art speaker verification frameworks have typically focused on increasingly deeper (more layers) and wider (more channels) models to improve verification performance. Instead, this paper proposes increasing the model's resolution capability by using attention-based dynamic kernels in a convolutional neural network, so that the model parameters are conditioned on the input features. The attention weights on the kernels are further distilled by channel attention and multi-layer feature aggregation to learn global features from speech. Because the structure of the model parameters adapts itself to the inputs, this approach provides an efficient way to improve representation capacity with fewer data resources. The proposed dynamic convolutional model achieved 1.62% EER and 0.18 minDCF on the VoxCeleb1 test set, a 17% relative improvement over ECAPA-TDNN using the same training resources.
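The idea of attention-based dynamic kernels can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a common formulation of dynamic convolution in which K candidate kernels are mixed by softmax attention weights computed from the time-averaged input, and it omits the channel attention and multi-layer aggregation stages. All names (`kernel_attention`, `aggregate_kernels`, `dynamic_conv1d`, `w_att`) and all sizes are hypothetical.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [v / temperature for v in logits]
    peak = max(scaled)
    exps = [math.exp(v - peak) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kernel_attention(x, w_att):
    """x: [channels][time]; w_att: [K][channels]. Returns K attention weights."""
    # Global average pooling over time gives one statistic per channel.
    pooled = [sum(ch) / len(ch) for ch in x]
    # A single linear layer (assumed here) maps the statistics to K logits.
    logits = [sum(w * p for w, p in zip(row, pooled)) for row in w_att]
    return softmax(logits)


def aggregate_kernels(kernels, pi):
    """kernels: [K][out][in][width]. Returns the pi-weighted sum [out][in][width]."""
    n_k = len(kernels)
    out_c, in_c, width = len(kernels[0]), len(kernels[0][0]), len(kernels[0][0][0])
    return [[[sum(pi[k] * kernels[k][o][i][t] for k in range(n_k))
              for t in range(width)] for i in range(in_c)] for o in range(out_c)]


def conv1d(x, kernel):
    """Valid 1D cross-correlation. x: [in][time]; kernel: [out][in][width]."""
    width = len(kernel[0][0])
    steps = len(x[0]) - width + 1
    return [[sum(kernel[o][i][t] * x[i][s + t]
                 for i in range(len(x)) for t in range(width))
             for s in range(steps)] for o in range(len(kernel))]


def dynamic_conv1d(x, kernels, w_att):
    """Input-conditioned convolution: the kernel is rebuilt for every input."""
    pi = kernel_attention(x, w_att)
    return conv1d(x, aggregate_kernels(kernels, pi)), pi


# Toy example: 3 input channels, 10 frames, 2 output channels, K = 4 kernels.
rng = random.Random(0)
x = [[rng.uniform(-1, 1) for _ in range(10)] for _ in range(3)]
kernels = [[[[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
            for _ in range(2)] for _ in range(4)]
w_att = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
y, pi = dynamic_conv1d(x, kernels, w_att)
```

Because convolution is linear in the kernel, mixing the K kernels before convolving gives the same output as convolving with each kernel and mixing the results, but costs only one convolution per input. This is why this style of dynamic kernel adds representation capacity with little extra compute.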
Related papers
- Frequency And Multi-scale Selective Kernel Attention For Speaker Verification (2022) 10.07
- Temporal Dynamic Convolutional Neural Network For Text-independent Speaker Verification And Phonemic Analysis (2021) 11.19
- ECAPA-TDNN: Emphasized Channel Attention, Propagation And Aggregation In TDNN Based Speaker Verification (2020) 23.07
- Decomposed Temporal Dynamic CNN: Efficient Time-adaptive Network For Text-independent Speaker Verification Explained With Speaker Activation Map (2022) 0.00
- Convolution-based Channel-frequency Attention For Text-independent Speaker Verification (2022) 7.50
- Multi-frequency Information Enhanced Channel Attention Module For Speaker Representation Learning (2022) 0.00
- Optimization Of DNN-based Speaker Verification Model Through Efficient Quantization Technique (2024) 0.00
- Self-attentive Multi-layer Aggregation With Feature Recalibration And Normalization For End-to-end Speaker Verification System (2020) 0.00