Non-local Convolutional Neural Networks (NLCNN) For Speaker Recognition
2020 · Haici Yang, Hongda Mao, Ruirui Li, et al.
Abstract
Speaker recognition is the process of identifying a speaker from their voice. The technology has attracted growing attention with the recent rise in popularity of smart voice assistants such as Amazon Alexa. In the past few years, various convolutional neural network (CNN) based speaker recognition algorithms have been proposed and have achieved satisfactory performance. However, convolutional operations are building blocks that typically operate on one local neighborhood at a time and thus fail to capture global, long-range interactions at the feature level, which are critical for understanding the patterns in a speaker's voice. In this work, we propose to apply Non-local Convolutional Neural Networks (NLCNN) to improve the capability of capturing long-range dependencies at the feature level, thereby improving speaker recognition performance. Specifically, we introduce non-local blocks where the output response of a position is computed as a weighted sum of the input features at all posi
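The non-local operation described in the abstract, where each position's response is a weighted sum of the features at every position, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The abstract as given here does not say how the weights are computed, so the softmax over dot-product similarity is an assumption, as are the names `softmax`, `non_local`, and `frames` and the toy input values. Learned projections and any residual connection are left out.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def non_local(features):
    """Return, for each position, a weighted sum of the features at ALL positions.

    Assumption: weights are a softmax over dot-product similarity between
    the current position and every position. The source does not specify this.
    """
    outputs = []
    for query in features:
        # Similarity of this position to every position, including itself.
        scores = [sum(q * k for q, k in zip(query, key)) for key in features]
        weights = softmax(scores)
        # Weighted sum over all positions, computed channel by channel.
        mixed = [
            sum(w * feat[d] for w, feat in zip(weights, features))
            for d in range(len(query))
        ]
        outputs.append(mixed)
    return outputs


# Toy input: 4 positions with 2 channels each (made-up values).
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
response = non_local(frames)
```

Because the weights at each position sum to one, every output vector is a convex combination of all input vectors. This is how the operation lets distant positions influence one another in a single step, in contrast to a convolution's local window.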
Related papers
- EfficientTDNN: Efficient Architecture Search For Speaker Recognition (2021) · 10.07
- Neural Predictive Coding Using Convolutional Neural Networks Towards Unsupervised Learning Of Speaker Characteristics (2018) · 11.85
- Multi-speaker Localization Using Convolutional Neural Network Trained With Noise (2017) · 0.00
- Frequency And Temporal Convolutional Attention For Text-independent Speaker Recognition (2019) · 0.00
- Towards Speaker Identification With Minimal Dataset And Constrained Resources Using 1D-Convolution Neural Network (2024) · 1.40
- SpeakerNet: 1D Depth-wise Separable Convolutional Network For Text-independent Speaker Recognition And Verification (2020) · 0.00
- Speaker Representation Learning Using Global Context Guided Channel And Time-frequency Transformations (2020) · 6.34
- ContextNet: Improving Convolutional Neural Networks For Automatic Speech Recognition With Global Context (2020) · 17.24