Spatial-temporal Graph Based Multi-channel Speaker Verification With Ad-hoc Microphone Arrays
2023 · Yijiang Chen, Chengdong Liang, Xiao-Lei Zhang
Abstract
The performance of speaker verification degrades significantly in adverse acoustic environments with strong reverberation and noise. To address this issue, this paper proposes a spatial-temporal graph convolutional network (GCN) method for multi-channel speaker verification with ad-hoc microphone arrays. It includes a feature aggregation block and a channel selection block, both of which are built on graphs. The feature aggregation block fuses speaker features across time and channels with a spatial-temporal GCN. The graph-based channel selection block discards the noisy channels that may contribute negatively to the system. The proposed method is flexible in incorporating various kinds of graphs and prior knowledge. We compared the proposed method with six representative methods in both real-world and simulated environments. Experimental results show that the proposed method achieves a relative equal error rate (EER) reduction of \(\mathbf{15.39\%}\) lower than the stron
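The two blocks named in the abstract can be illustrated with a generic sketch. The code below is not the authors' implementation: it assumes a standard symmetrically normalized graph convolution, a hypothetical graph in which each (channel, frame) node links to the same frame in every other channel and to the neighbouring frames of its own channel, and a placeholder top-k rule for channel selection. All function names, shapes, and the scoring rule are assumptions made for illustration.

```python
import numpy as np


def spatial_temporal_adjacency(num_channels, num_frames):
    """Build a hypothetical spatial-temporal graph (an assumption, not from the paper).

    Node (c, t) has index c * num_frames + t. It is linked to the same frame t
    in every other channel (spatial edges) and to frames t - 1 and t + 1 of its
    own channel (temporal edges).
    """
    n = num_channels * num_frames
    adj = np.zeros((n, n))
    for c in range(num_channels):
        for t in range(num_frames):
            i = c * num_frames + t
            for c2 in range(num_channels):
                if c2 != c:
                    adj[i, c2 * num_frames + t] = 1.0
            if t + 1 < num_frames:
                j = c * num_frames + t + 1
                adj[i, j] = adj[j, i] = 1.0
    return adj


def normalized_adjacency(adj):
    """Add self-loops and apply symmetric normalization D^{-1/2} (A + I) D^{-1/2}."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(features, adj_norm, weight):
    """One graph convolution: ReLU(A_norm @ X @ W)."""
    return np.maximum(adj_norm @ features @ weight, 0.0)


def select_channels(channel_scores, keep):
    """Placeholder selection rule: return sorted indices of the `keep` highest scores."""
    top = np.argsort(channel_scores)[::-1][:keep]
    return np.sort(top)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    num_channels, num_frames, dim_in, dim_out = 4, 5, 8, 6

    # Random stand-ins for per-channel, per-frame speaker features.
    x = rng.standard_normal((num_channels * num_frames, dim_in))
    w = rng.standard_normal((dim_in, dim_out))

    adj_norm = normalized_adjacency(spatial_temporal_adjacency(num_channels, num_frames))
    h = gcn_layer(x, adj_norm, w).reshape(num_channels, num_frames, dim_out)

    # Hypothetical per-channel quality scores; a real system would learn these.
    scores = rng.random(num_channels)
    kept = select_channels(scores, keep=2)

    # Pool the kept channels over channels and frames into one embedding.
    embedding = h[kept].mean(axis=(0, 1))
    print(kept, embedding.shape)
```

In this sketch the graph structure is the only place where prior knowledge enters, so swapping in a different adjacency matrix changes how features are fused without touching the layer itself.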
Related papers
- Multi-channel Speaker Verification For Single And Multi-talker Speech (2020) 0.00
- Graph Attention Networks For Speaker Verification (2020) 9.23
- Speaker Verification Using Attentive Multi-scale Convolutional Recurrent Network (2023) 0.00
- Multi-task Network For Noise-robust Keyword Spotting And Speaker Verification Using CTC-based Soft VAD And Global Query Attention (2020) 9.41
- How To Leverage DNN-based Speech Enhancement For Multi-channel Speaker Verification? (2022) 0.00
- Single Channel Far Field Feature Enhancement For Speaker Verification In The Wild (2020) 0.00
- Multi-stream Convolutional Neural Network With Frequency Selection For Robust Speaker Verification (2020) 3.58
- Frequency And Multi-scale Selective Kernel Attention For Speaker Verification (2022) 10.07