Complex Frequency Domain Linear Prediction: A Tool To Compute Modulation Spectrum Of Speech
2022 · Samik Sadhu, Hynek Hermansky
Abstract
The conventional Frequency Domain Linear Prediction (FDLP) technique models the squared Hilbert envelope of speech with varying degrees of approximation; the modeled envelope can be sampled at the required frame rate and used as features for Automatic Speech Recognition (ASR). Although the complex cepstrum of the conventional FDLP model has previously been used as compact frame-wise speech features, it has lacked interpretability in the context of the Hilbert envelope. In this paper, we propose a modification of the conventional FDLP model that allows the complex cepstrum to be interpreted easily as temporal modulations in an all-pole model approximation of the power of the speech signal. Additionally, our "complex" FDLP yields significant speed-ups over conventional FDLP for the same degree of approximation.
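For orientation, the conventional FDLP idea mentioned in the abstract can be sketched as follows: linear prediction is applied to the DCT of a signal segment, and the resulting all-pole "spectrum" is read along the time axis as a smooth approximation of the squared Hilbert envelope. This is a minimal sketch of the conventional technique only, not the complex FDLP proposed in the paper. The function name `fdlp_envelope`, its parameters, and the small regularization constant are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from scipy.fft import dct
from scipy.linalg import solve_toeplitz


def fdlp_envelope(x, order=20, n_points=None):
    """All-pole approximation of the squared Hilbert envelope of x.

    Hypothetical helper sketching conventional FDLP (autocorrelation
    method of linear prediction applied to the DCT of the signal).
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    n_points = n if n_points is None else n_points

    # Step 1: move to the frequency domain with a type-II DCT.
    c = dct(x, type=2, norm="ortho")

    # Step 2: autocorrelation of the DCT sequence for lags 0..order.
    r = np.array([np.dot(c[: n - k], c[k:]) for k in range(order + 1)])
    r[0] *= 1.0 + 1e-9  # assumed tiny regularization for numerical stability

    # Step 3: solve the Toeplitz normal equations for a_1..a_p,
    # with A(z) = 1 + sum_k a_k z^{-k}.
    a = solve_toeplitz(r[:order], -r[1 : order + 1])
    gain = r[0] + np.dot(a, r[1 : order + 1])  # prediction error power

    # Step 4: evaluate gain / |A|^2 on [0, pi]; for the type-II DCT,
    # angle pi * (t + 0.5) / n_points corresponds to time index t.
    theta = np.pi * (np.arange(n_points) + 0.5) / n_points
    poly = np.concatenate(([1.0], a))
    A = np.exp(-1j * np.outer(theta, np.arange(order + 1))) @ poly
    return gain / np.abs(A) ** 2
```

A lower `order` gives a smoother envelope and a higher one follows finer temporal detail, which is the "degree of approximation" the abstract refers to. Passing a smaller `n_points` samples the same model at a coarser rate, as one would when extracting frame-rate features.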
Related papers
- Forknet: Simultaneous Time And Time-frequency Domain Modeling For Speech Enhancement (2023) 0.00
- Improved Frequency Modulation Features For Multichannel Distant Speech Recognition (2018) 6.77
- A Robust Frame-based Nonlinear Prediction System For Automatic Speech Coding (2016) 0.00
- Long-frame-shift Neural Speech Phase Prediction With Spectral Continuity Enhancement And Interpolation Error Compensation (2023) 0.00
- Deep Factorization For Speech Signal (2018) 8.82
- Complex-valued Restricted Boltzmann Machine For Direct Speech Parameterization From Complex Spectra (2018) 5.24
- Noisy Speech Based Temporal Decomposition To Improve Fundamental Frequency Estimation (2021) 5.24
- Phase-aware Single-channel Speech Enhancement With Modulation-domain Kalman Filtering (2017) 0.00