Detection Of Glottal Closure Instants From Raw Speech Using Convolutional Neural Networks
2018 · Mohit Goyal, Varun Srivastava, Prathosh A. P
Abstract
Glottal Closure Instants (GCIs) correspond to the temporal locations of significant excitation to the vocal tract occurring during the production of voiced speech. GCI detection from speech signals is a well-studied problem given its importance in speech processing. Most existing approaches to GCI detection adopt a two-stage approach: (i) transformation of the speech signal into a representative signal in which GCIs are better localized, and (ii) extraction of GCIs from the representative signal obtained in the first stage. The former stage is accomplished using signal processing techniques based on the principles of speech production, and the latter with heuristic algorithms such as dynamic programming and peak-picking. These methods are thus task-specific and rely on the methods used for representative signal extraction. However, in this paper, we formulate the GCI detection problem from a representation learning perspective where an appropriate representation is implicitly learned from the raw
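The conventional two-stage pipeline described in the abstract can be illustrated with a minimal sketch. This is not the paper's CNN method, and it is not any specific published GCI detector: stage (i) is replaced by a crude half-wave rectified first difference, stage (ii) by simple thresholded peak-picking, and the input is a synthetic toy signal. All function names and parameters (`synth_voiced`, `representative_signal`, `pick_peaks`, `rel_threshold`, `min_distance`) are hypothetical and chosen only for illustration.

```python
import math


def synth_voiced(length, gci_positions, decay=20.0, period=16.0):
    """Toy voiced signal (assumption): a damped cosine launched at every GCI."""
    signal = [0.0] * length
    for g in gci_positions:
        for n in range(g, length):
            k = n - g
            signal[n] += math.exp(-k / decay) * math.cos(2 * math.pi * k / period)
    return signal


def representative_signal(speech):
    """Stage (i) stand-in: half-wave rectified first difference.

    Real systems use speech-production-based processing here; this crude
    substitute only emphasizes the abrupt onsets in the toy signal.
    """
    rep = [0.0] * len(speech)
    for n in range(1, len(speech)):
        rep[n] = max(0.0, speech[n] - speech[n - 1])
    return rep


def pick_peaks(rep, rel_threshold=0.5, min_distance=20):
    """Stage (ii): keep local maxima above a relative threshold.

    Peaks closer than min_distance samples are merged, keeping the larger one.
    """
    if not rep:
        return []
    threshold = rel_threshold * max(rep)
    peaks = []
    for n in range(1, len(rep) - 1):
        is_local_max = rep[n] > rep[n - 1] and rep[n] >= rep[n + 1]
        if not (is_local_max and rep[n] >= threshold):
            continue
        if peaks and n - peaks[-1] < min_distance:
            if rep[n] > rep[peaks[-1]]:
                peaks[-1] = n  # replace the weaker nearby peak
        else:
            peaks.append(n)
    return peaks


true_gcis = [40, 120, 200, 280, 360]
speech = synth_voiced(400, true_gcis)
detected = pick_peaks(representative_signal(speech))
print(detected)
```

Both stages depend on hand-set choices (the transform, the threshold, the minimum spacing), which is the task-specific dependence the abstract contrasts with learning the representation from the raw signal.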
Related papers
- GCI Detection From Raw Speech Using A Fully-convolutional Network (2019)
- Detection Of Glottal Closure Instants From Speech Signals: A Quantitative Review (2019)
- Furcanet: An End-to-end Deep Gated Convolutional, Long Short-term Memory, Deep Neural Networks For Single Channel Speech Separation (2019)
- Frame-based Overlapping Speech Detection Using Convolutional Neural Networks (2020)
- Reconstructing Speech From Real-time Articulatory MRI Using Neural Vocoders (2021)
- Investigating Deep Neural Structures And Their Interpretability In The Domain Of Voice Conversion (2021)
- Glottal Source Estimation Robustness: A Comparison Of Sensitivity Of Voice Source Estimation Techniques (2020)
- Formant Tracking Using Dilated Convolutional Networks Through Dense Connection With Gating Mechanism (2020)