Speaker Verification Using Attentive Multi-scale Convolutional Recurrent Network
2023 · Yanxiong Li, Zhongjie Jiang, Wenchang Cao, et al.
Abstract
In this paper, we propose a speaker verification method based on an Attentive Multi-scale Convolutional Recurrent Network (AMCRN). The proposed AMCRN captures both local spatial information and global sequential information from the input speech recordings. In the proposed method, the log-Mel spectrum is extracted from each speech recording and fed to the AMCRN to learn a speaker embedding. In the testing stage, the learned speaker embedding is fed to a back-end classifier (such as a cosine similarity metric) for scoring. The proposed method is compared with state-of-the-art speaker verification methods on three public datasets selected from two large-scale speech corpora (VoxCeleb1 and VoxCeleb2). Experimental results show that our method outperforms the baseline methods in terms of equal error rate and minimum detection cost function, and has advantages over most of the baseline methods in terms of computational complexity and memory req
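The scoring and evaluation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names cosine_score and equal_error_rate are hypothetical, the embeddings are assumed to be plain vectors already produced by some front-end network, and the equal error rate is only approximated by sweeping thresholds over the observed scores.

```python
import numpy as np


def cosine_score(enroll, test):
    """Cosine similarity between an enrollment and a test embedding."""
    enroll = np.asarray(enroll, dtype=float)
    test = np.asarray(test, dtype=float)
    return float(enroll @ test / (np.linalg.norm(enroll) * np.linalg.norm(test)))


def equal_error_rate(scores, labels):
    """Approximate EER from trial scores.

    labels: 1 for target trials (same speaker), 0 for non-target trials.
    Each observed score is tried as a threshold; the EER is taken at the
    threshold where false acceptance and false rejection rates are closest.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    best_gap, eer = np.inf, 1.0
    for threshold in np.unique(scores):
        accept = scores >= threshold
        far = np.mean(accept[labels == 0])   # non-targets wrongly accepted
        frr = np.mean(~accept[labels == 1])  # targets wrongly rejected
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return float(eer)
```

In this sketch a trial is accepted when its cosine score reaches the threshold, so a lower equal error rate means the target and non-target score distributions overlap less.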
Related papers
- ACA-Net: Towards Lightweight Speaker Verification Using Asymmetric Cross Attention (2023)
- Simple Attention Module Based Speaker Verification With Iterative Noisy Label Detection (2021)
- Frequency And Multi-scale Selective Kernel Attention For Speaker Verification (2022)
- Spatial-temporal Graph Based Multi-channel Speaker Verification With Ad-hoc Microphone Arrays (2023)
- Attention Back-end For Automatic Speaker Verification With Multiple Enrollment Utterances (2021)
- Multi-task Network For Noise-robust Keyword Spotting And Speaker Verification Using CTC-based Soft VAD And Global Query Attention (2020)
- Double Multi-head Attention For Speaker Verification (2020)
- SALADnet: Self-attentive Multisource Localization In The Ambisonics Domain (2021)