Few-shot Speaker Identification Using Lightweight Prototypical Network With Feature Grouping And Interaction
2023 · Yanxiong Li, Hao Chen, Wenchang Cao, et al.
Abstract
Existing methods for few-shot speaker identification (FSSI) achieve high accuracy, but their computational complexity and model size need to be reduced for lightweight applications. In this work, we propose an FSSI method using a lightweight prototypical network, with the ultimate goal of implementing FSSI on resource-limited intelligent terminals such as smart watches and smart speakers. In the proposed prototypical network, an embedding module is designed to perform feature grouping, which reduces the memory requirement and computational complexity, and feature interaction, which enhances the representational ability of the learned speaker embedding. In the proposed embedding module, the audio feature of each speech sample is split into several low-dimensional feature subsets that are transformed by a recurrent convolutional block in parallel. Then, the operations of averaging, addition, concatenation, element-wise summation and statistics pooling are sequentially executed to learn a sp
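The abstract names three ideas that a small example can make concrete: splitting a feature vector into low-dimensional subsets, statistics pooling over time, and prototype-based classification. The following is a minimal sketch under stated assumptions, not the authors' implementation. All function names are hypothetical, the element-wise sum in `embed` is only a stand-in for the learned recurrent convolutional block and interaction steps (which the abstract does not specify in detail), and squared Euclidean distance to class-mean prototypes is assumed from the standard prototypical-network formulation.

```python
import math


def split_into_groups(frame, num_groups):
    """Split one feature vector into equal-sized, low-dimensional subsets."""
    if len(frame) % num_groups != 0:
        raise ValueError("feature dimension must be divisible by num_groups")
    size = len(frame) // num_groups
    return [frame[i * size:(i + 1) * size] for i in range(num_groups)]


def statistics_pooling(frames):
    """Concatenate the per-dimension mean and (population) std over time."""
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n)
        for d in range(dim)
    ]
    return means + stds


def embed(frames, num_groups=2):
    """Toy embedding: group each frame, fuse the groups, then pool over time.

    ASSUMPTION: the element-wise sum across groups is a placeholder for the
    learned transformation described in the abstract.
    """
    fused = []
    for frame in frames:
        groups = split_into_groups(frame, num_groups)
        fused.append([sum(vals) for vals in zip(*groups)])
    return statistics_pooling(fused)


def build_prototypes(support):
    """Map each speaker to the mean of that speaker's support embeddings."""
    return {
        speaker: [sum(col) / len(embs) for col in zip(*embs)]
        for speaker, embs in support.items()
    }


def classify(query, prototypes):
    """Return the speaker whose prototype is nearest to the query embedding."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(prototypes, key=lambda s: sq_dist(query, prototypes[s]))
```

In a few-shot episode, `build_prototypes` would be called on the embeddings of the few enrolled utterances per speaker, and `classify` on the embedding of each test utterance. Because each group is processed at a lower dimension than the full feature vector, a learned per-group transform needs fewer parameters than one operating on the whole vector, which is the motivation the abstract gives for feature grouping.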
Related papers
- Few-shot Speaker Identification Using Depthwise Separable Convolutional Network With Channel Attention (2022) · 5.24
- Few Shot Speaker Recognition Using Deep Neural Networks (2019) · 0.00
- Improving Speaker Identification For Shared Devices By Adapting Embeddings To Speaker Subsets (2021) · 4.52
- Small Footprint Text-independent Speaker Verification For Embedded Systems (2020) · 7.16
- Episodic Fine-tuning Prototypical Networks For Optimization-based Few-shot Learning: Application To Audio Classification (2024) · 2.26
- Speaker Fuzzy Fingerprints: Benchmarking Text-based Identification In Multiparty Dialogues (2025) · 0.00
- Towards Lightweight Speaker Verification Via Adaptive Neural Network Quantization (2024) · 5.84
- Fusion Of Embeddings Networks For Robust Combination Of Text Dependent And Independent Speaker Recognition (2021) · 4.52