Phase Reconstruction From Amplitude Spectrograms Based On von-Mises-Distribution Deep Neural Network
2018 · Shinnosuke Takamichi, Yuki Saito, Norihiro Takamune, et al.
Abstract
This paper presents a deep neural network (DNN)-based method for phase reconstruction from amplitude spectrograms. In audio signal and speech processing, the amplitude spectrogram is often the representation that is processed, and the corresponding phase spectrogram is then reconstructed from it using the Griffin-Lim method. However, the Griffin-Lim method causes unnatural artifacts in synthetic speech. To address this problem, we introduce the von-Mises-distribution DNN for phase reconstruction. The DNN is a generative model based on the von Mises distribution, which can model the distribution of a periodic variable such as a phase, and the model parameters of the DNN are estimated using the maximum likelihood criterion. Furthermore, we propose a group-delay loss for DNN training that brings the predicted group delay close to the natural group delay. The experimental results demonstrate that 1) the trained DNN can predict group delay more accurately than the phases themselves, and 2) our pha
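The two training criteria named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the fixed concentration `kappa`, the array layout (frames by frequency bins), and the finite-difference definition of group delay are all assumptions made for illustration. The first function is the von Mises negative log-likelihood of a true phase given a predicted mean phase; the second compares group delays (negative phase differences along frequency) through a cosine, so that both terms respect the 2π periodicity of phase.

```python
import numpy as np


def von_mises_nll(phase_true, phase_pred, kappa=1.0):
    """Mean negative log-likelihood of phase_true under a von Mises
    distribution with mean phase_pred and concentration kappa.

    Assumption: kappa is a fixed scalar here; only the mean is predicted.
    """
    # log of the normalizer 2*pi*I0(kappa), I0 = modified Bessel function
    log_norm = np.log(2.0 * np.pi * np.i0(kappa))
    return float(np.mean(log_norm - kappa * np.cos(phase_true - phase_pred)))


def group_delay(phase):
    """Approximate group delay as the negative finite difference of phase
    along the frequency axis (assumed to be the last axis)."""
    return -np.diff(phase, axis=-1)


def group_delay_loss(phase_true, phase_pred):
    """Mean negative cosine of the group-delay error.

    Reaches its minimum of -1 when predicted and true group delays agree
    up to multiples of 2*pi.
    """
    err = group_delay(phase_true) - group_delay(phase_pred)
    return float(np.mean(-np.cos(err)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical data: 4 frames, 8 frequency bins, phases in [-pi, pi)
    true_phase = rng.uniform(-np.pi, np.pi, size=(4, 8))
    pred_phase = true_phase + 0.1 * rng.standard_normal((4, 8))
    print("von Mises NLL:", von_mises_nll(true_phase, pred_phase))
    print("group-delay loss:", group_delay_loss(true_phase, pred_phase))
```

In this sketch, a constant phase offset applied to every frequency bin changes the von Mises term but leaves the group-delay term untouched, which is one way to see why the two criteria carry different information about the predicted phase.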
Related papers
- Phase Reconstruction Based On Recurrent Phase Unwrapping With Deep Neural Networks (2020)
- Deep Griffin-Lim Iteration (2019)
- PHASEN: A Phase-and-harmonics-aware Speech Enhancement Network (2019)
- Generative Adversarial Network-based Approach To Signal Reconstruction From Magnitude Spectrograms (2018)
- Deep Learning Based Phase Reconstruction For Speaker Separation: A Trigonometric Perspective (2018)
- DCCRN: Deep Complex Convolution Recurrent Network For Phase-aware Speech Enhancement (2020)
- Consistency-aware Multi-channel Speech Enhancement Using Deep Neural Networks (2020)
- Neural Speech Phase Prediction Based On Parallel Estimation Architecture And Anti-wrapping Losses (2022)