Few-shot Speaker Identification Using Depthwise Separable Convolutional Network With Channel Attention
2022 · Yanxiong Li, Wucheng Wang, Hao Chen, et al.
Abstract
Although few-shot learning has attracted considerable attention in image and audio classification, little work has addressed few-shot speaker identification. In few-shot learning, overfitting is a persistent problem, mainly because of the mismatch between training and testing conditions. In this paper, we propose a few-shot speaker identification method that alleviates the overfitting problem. In the proposed method, a depthwise separable convolutional network with channel attention is trained with a prototypical loss function. The experimental datasets are drawn from three public speech corpora: Aishell-2, VoxCeleb1 and TORGO. Experimental results show that the proposed method outperforms state-of-the-art methods for few-shot speaker identification in terms of accuracy and F-score.
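To illustrate the prototypical loss mentioned in the abstract, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes the common formulation in which each speaker's prototype is the mean of that speaker's support embeddings and a query is scored by negative squared Euclidean distance to each prototype; the abstract does not state the distance metric. The function names and the 2-D toy embeddings are hypothetical; in the paper the embeddings would come from the depthwise separable convolutional network with channel attention.

```python
import math


def squared_euclidean(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def class_prototypes(support):
    """Average the support embeddings of each speaker.

    support: dict mapping speaker id -> list of embedding vectors.
    """
    protos = {}
    for speaker, embs in support.items():
        dim = len(embs[0])
        protos[speaker] = [sum(e[i] for e in embs) / len(embs) for i in range(dim)]
    return protos


def prototypical_loss(support, queries):
    """Return (mean negative log-likelihood, accuracy) for one episode.

    queries: list of (embedding, true speaker id) pairs.
    Assumption: logits are negative squared Euclidean distances.
    """
    protos = class_prototypes(support)
    speakers = list(protos)
    total_loss, correct = 0.0, 0
    for emb, label in queries:
        logits = [-squared_euclidean(emb, protos[s]) for s in speakers]
        # Numerically stable log-sum-exp for the softmax normaliser.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        total_loss += log_z - logits[speakers.index(label)]
        predicted = speakers[logits.index(max(logits))]
        correct += int(predicted == label)
    return total_loss / len(queries), correct / len(queries)


# Toy 2-way, 2-shot episode with hypothetical 2-D embeddings.
support = {"spk_a": [[0.0, 0.0], [0.0, 2.0]], "spk_b": [[4.0, 0.0], [4.0, 2.0]]}
queries = [([0.0, 1.0], "spk_a"), ([4.0, 1.0], "spk_b")]
loss, accuracy = prototypical_loss(support, queries)
```

In this toy episode each query coincides with its own speaker's prototype, so the loss is close to zero and both queries are classified correctly. During training, the loss would be backpropagated through the embedding network over many randomly sampled episodes.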
Related papers
- Few Shot Speaker Recognition Using Deep Neural Networks (2019) · 0.00
- Few-shot Speaker Identification Using Lightweight Prototypical Network With Feature Grouping And Interaction (2023) · 9.03
- On The Transferability Of Large-scale Self-supervision To Few-shot Audio Classification (2024) · 3.58
- Fully Few-shot Class-incremental Audio Classification Using Expandable Dual-embedding Extractor (2024) · 6.21
- Weakly Supervised Training Of Speaker Identification Models (2018) · 5.84
- Towards Speaker Identification With Minimal Dataset And Constrained Resources Using 1d-convolution Neural Network (2024) · 1.40
- Improving Robustness Of One-shot Voice Conversion With Deep Discriminative Speaker Encoder (2021) · 5.84
- Weakly Supervised Training Of Hierarchical Attention Networks For Speaker Identification (2020) · 3.58