End-to-end LPCNet: A Neural Vocoder With Fully-differentiable LPC Estimation
2022 · Krishna Subramani, Jean-Marc Valin, Umut Isik, et al.
Abstract
Neural vocoders have recently demonstrated high quality speech synthesis, but typically require a high computational complexity. LPCNet was proposed as a way to reduce the complexity of neural synthesis by using linear prediction (LP) to assist an autoregressive model. At inference time, LPCNet relies on the LP coefficients being explicitly computed from the input acoustic features. That makes the design of LPCNet-based systems more complicated, while adding the constraint that the input features must represent a clean speech spectrum. We propose an end-to-end version of LPCNet that lifts these limitations by learning to infer the LP coefficients from the input features in the frame rate network. Results show that the proposed end-to-end approach equals or exceeds the quality of the original LPCNet model, but without explicit LP analysis. Our open-source end-to-end model still benefits from LPCNet's low complexity, while allowing for any type of conditioning features.
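For readers unfamiliar with the explicit LP analysis that the original LPCNet relies on, the following is a minimal sketch of classical linear prediction on one frame: autocorrelation, the Levinson-Durbin recursion, and the prediction residual (excitation). It is not the paper's code. The function names (`autocorrelation`, `levinson_durbin`, `lp_residual`) and the toy signal are assumptions chosen for illustration, and the end-to-end model described above replaces this explicit step with coefficients inferred by the network.

```python
def autocorrelation(frame, order):
    """Return autocorrelation lags r[0..order] of one frame."""
    n = len(frame)
    return [sum(frame[t] * frame[t - k] for t in range(k, n))
            for k in range(order + 1)]


def levinson_durbin(r, order):
    """Solve the LP normal equations with the Levinson-Durbin recursion.

    Convention (an assumption of this sketch): the prediction is
    p[t] = sum_k a[k] * s[t - k], so the residual is s[t] - p[t].
    Returns (coefficients a[1..order], final prediction error energy).
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate frame (e.g. silence): stop early
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient for this stage
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


def lp_residual(signal, coeffs):
    """Subtract the linear prediction from each sample (the excitation)."""
    residual = []
    for t in range(len(signal)):
        pred = sum(c * signal[t - 1 - k]
                   for k, c in enumerate(coeffs) if t - 1 - k >= 0)
        residual.append(signal[t] - pred)
    return residual


# Toy signal: a decaying exponential, which a first-order predictor fits well.
signal = [0.9 ** t for t in range(200)]
coeffs, err = levinson_durbin(autocorrelation(signal, 1), 1)
residual = lp_residual(signal, coeffs)
```

On this toy signal the single coefficient should come out close to 0.9, and the residual carries far less energy than the signal itself. That reduction is the reason a predictor can lighten the load on the autoregressive network, which then only has to model the excitation.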
Related papers
- A Real-time Wideband Neural Vocoder At 1.6 kb/s Using LPCNet (2019)
- LPCNet: Improving Neural Speech Synthesis Through Linear Prediction (2018)
- Improving LPCNet-based Text-to-speech With Linear Prediction-structured Mixture Density Network (2020)
- Neural Speech Synthesis On A Shoestring: Improving The Efficiency Of LPCNet (2022)
- LP-WaveNet: Linear Prediction-based WaveNet Speech Synthesis (2018)
- High Quality, Lightweight And Adaptable TTS Using LPCNet (2019)
- Controllable Sequence-to-sequence Neural TTS With LPCNet Backend For Real-time Speech Synthesis On CPU (2020)
- FeatherWave: An Efficient High-fidelity Neural Vocoder With Multi-band Linear Prediction (2020)