Acoustic-to-articulatory Inversion Based On Speech Decomposition And Auxiliary Feature
2022 · Jianrong Wang, Jinyu Liu, Longxuan Zhao, et al.
Abstract
Acoustic-to-articulatory inversion (AAI) aims to recover the movement of the articulators from speech signals. Speaker-independent AAI remains a challenge given the limited data available. Moreover, most current work uses only audio speech as input, which creates an inevitable performance bottleneck. To address these problems, we first pre-train a speech decomposition network that decomposes audio speech into a speaker embedding and a content embedding; these serve as new personalized speech features that adapt to the speaker-independent case. Second, to further improve AAI, we propose a novel auxiliary feature network that estimates lip auxiliary features from these personalized speech features. Experimental results on three public datasets show that, compared with the state of the art using only the audio speech feature, the proposed method reduces the average RMSE by 0.25 and increases the average correlation coefficient by 2.0% in the speaker-dependent case. More importantly, the average
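The abstract reports results as average RMSE and average correlation coefficient. As a rough illustration of how such figures are commonly computed for AAI, the sketch below scores each articulatory channel separately and then averages across channels. It is not taken from the paper: the function names, the per-channel averaging scheme, and the toy trajectories are all assumptions made for this example.

```python
import math


def rmse(pred, target):
    """Root-mean-square error between two equal-length trajectories."""
    if len(pred) != len(target) or not pred:
        raise ValueError("trajectories must be non-empty and equal in length")
    squared = sum((p - t) ** 2 for p, t in zip(pred, target))
    return math.sqrt(squared / len(pred))


def pearson(pred, target):
    """Pearson correlation coefficient between two equal-length trajectories."""
    if len(pred) != len(target) or len(pred) < 2:
        raise ValueError("need at least two paired samples")
    n = len(pred)
    mean_p = sum(pred) / n
    mean_t = sum(target) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in target)
    if var_p == 0 or var_t == 0:
        raise ValueError("correlation is undefined for a constant trajectory")
    return cov / math.sqrt(var_p * var_t)


def average_over_channels(metric, pred_channels, target_channels):
    """Apply `metric` to each channel pair and return the mean score.

    Assumption: each channel is one articulator coordinate over time,
    and the reported average weights all channels equally.
    """
    scores = [metric(p, t) for p, t in zip(pred_channels, target_channels)]
    return sum(scores) / len(scores)


# Toy data (hypothetical): two channels, four frames each.
target_channels = [[0.0, 1.0, 2.0, 3.0], [1.0, 0.5, 0.0, -0.5]]
pred_channels = [[0.1, 0.9, 2.2, 2.8], [0.9, 0.6, 0.1, -0.4]]

avg_rmse = average_over_channels(rmse, pred_channels, target_channels)
avg_corr = average_over_channels(pearson, pred_channels, target_channels)
print(f"average RMSE: {avg_rmse:.3f}, average correlation: {avg_corr:.3f}")
```

Lower RMSE and higher correlation both indicate predicted trajectories closer to the measured ones. RMSE carries the units of the articulatory measurements, whereas the correlation coefficient is unitless.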
Related papers
- Speaker-independent Acoustic-to-articulatory Inversion Through Multi-channel Attention Discriminator (2024) · 0.00
- Exploiting Cross Domain Acoustic-to-articulatory Inverted Features For Disordered Speech Recognition (2022) · 8.09
- Speaker- And Text-independent Estimation Of Articulatory Movements And Phoneme Alignments From Speech (2024) · 2.26
- Independent And Automatic Evaluation Of Acoustic-to-articulatory Inversion Models (2019) · 0.00
- Audio Data Augmentation For Acoustic-to-articulatory Speech Inversion Using Bidirectional Gated RNNs (2022) · 0.00
- ARTI-6: Towards Six-dimensional Articulatory Speech Encoding (2025) · 0.00
- Articulatory-WaveNet: Autoregressive Model For Acoustic-to-articulatory Inversion (2020) · 0.00
- A Study Of Incorporating Articulatory Movement Information In Speech Enhancement (2020) · 0.00