PF-Net: Personalized Filter For Speaker Recognition From Raw Waveform
2021 · Wencheng Li, Zhenhua Tan, Jingyu Ning, et al.
Abstract
Speaker recognition based on i-vectors has largely been replaced by approaches based on deep learning. Convolutional Neural Networks (CNNs) that learn low-level speech representations directly from raw waveforms have been widely used in recent years. Building on this, the SincNet architecture introduces a convolutional layer whose kernels are constrained to be band-pass filters. Unlike a standard CNN, which learns every element of each filter, SincNet learns only the low and high cut-off frequencies of each filter. This paper proposes an improved CNN architecture called PF-Net, which encourages the first convolutional layer to implement more personalized filters than SincNet. PF-Net parameterizes the shape of each filter in the frequency domain and realizes band-pass filters by learning a set of deformation points in the frequency domain. Compared with a standard CNN, PF-Net can learn the characteristics of each filter. Compared with SincNet, PF-Net can learn more characteristic parameters, rather than only the low and high cut-off frequencies.
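To make the SincNet baseline mentioned in the abstract concrete, the following is a minimal NumPy sketch of a windowed-sinc band-pass kernel that is fully determined by its low and high cut-off frequencies. The function names, the kernel length of 251 taps, the 16 kHz sample rate, and the Hamming window are illustrative assumptions, not values taken from this page. The sketch does not implement PF-Net's deformation points, which the abstract does not specify in enough detail to reproduce.

```python
import numpy as np


def sinc_bandpass(f_low_hz, f_high_hz, kernel_size=251, sample_rate=16000):
    """Band-pass FIR kernel defined only by two cut-off frequencies.

    Hypothetical helper: kernel_size, sample_rate and the Hamming window
    are assumptions made for this illustration.
    """
    # Cut-offs normalised to cycles per sample.
    f1 = f_low_hz / sample_rate
    f2 = f_high_hz / sample_rate
    # Sample positions centred on zero so the kernel is symmetric.
    n = np.arange(kernel_size) - (kernel_size - 1) / 2
    # np.sinc(x) is sin(pi*x)/(pi*x); a band-pass filter is the
    # difference of two ideal low-pass filters.
    kernel = 2 * f2 * np.sinc(2 * f2 * n) - 2 * f1 * np.sinc(2 * f1 * n)
    # Window the truncated kernel to reduce ripple in the response.
    return kernel * np.hamming(kernel_size)


def magnitude_response(kernel, sample_rate=16000, n_fft=4096):
    """Return (frequencies in Hz, magnitude of the frequency response)."""
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    return freqs, np.abs(np.fft.rfft(kernel, n=n_fft))


if __name__ == "__main__":
    kernel = sinc_bandpass(300.0, 3400.0)
    freqs, mag = magnitude_response(kernel)
    for target_hz in (0.0, 1500.0, 6000.0):
        idx = np.argmin(np.abs(freqs - target_hz))
        print(f"{freqs[idx]:7.1f} Hz -> gain {mag[idx]:.4f}")
```

In a trainable layer the two cut-offs would be the only learnable quantities per filter, which is the restriction PF-Net is described as relaxing by learning additional frequency-domain shape parameters.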
Related papers
- Speaker Recognition From Raw Waveform With SincNet (2018) 20.65
- Speech And Speaker Recognition From Raw Waveform With SincNet (2018) 0.00
- Curricular SincNet: Towards Robust Deep Speaker Recognition By Emphasizing Hard Samples In Latent Space (2021) 4.52
- SpeakerNet: 1D Depth-wise Separable Convolutional Network For Text-independent Speaker Recognition And Verification (2020) 0.00
- Detection Of Doctored Speech: Towards An End-to-end Parametric Learn-able Filter Approach (2022) 0.00
- DeepVOX: Discovering Features From Raw Audio For Speaker Recognition In Non-ideal Audio Signals (2020) 0.00
- FDN: Finite Difference Network With Hierarchical Convolutional Features For Text-independent Speaker Verification (2021) 0.00
- A Lightweight Dual-stage Framework For Personalized Speech Enhancement Based On DeepFilterNet2 (2024) 2.26