Variational Auto-encoder Based Variability Encoding For Dysarthric Speech Recognition
2022 · Xurong Xie, Rukiye Ruzi, Xunying Liu, et al.
Abstract
Dysarthric speech recognition is a challenging task due to acoustic variability and the limited amount of available data. The diverse conditions of dysarthric speakers account for the acoustic variability, which makes it difficult to model precisely. This paper presents a variational auto-encoder based variability encoder (VAEVE) to explicitly encode such variability for dysarthric speech. The VAEVE uses both phoneme information and a low-dimensional latent variable to reconstruct the input acoustic features, so that the latent variable is forced to encode the phoneme-independent variability. The stochastic gradient variational Bayes algorithm is applied to model the distribution for generating variability encodings, which are further used as auxiliary features for DNN acoustic modeling. Experimental results on the UASpeech corpus show that the VAEVE based variability encodings have a complementary effect to the learning hidden unit contributions (LHUC) speaker adaptat
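The encoding scheme described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the single linear layers, the feature, phoneme, and latent dimensions, the squared-error reconstruction term, the one-hot phoneme input, and the choice of the posterior mean as the variability encoding are all assumptions made for illustration, and the names `init_params` and `vaeve_loss` are hypothetical. The sketch only shows the general idea: the decoder receives the phoneme information directly, so the latent variable is left to carry whatever else is needed to reconstruct the acoustic features, and the loss is a single-sample stochastic gradient variational Bayes estimate using the reparameterization trick.

```python
import numpy as np


def init_params(rng, feat_dim, phone_dim, latent_dim):
    """Hypothetical single-layer linear encoder/decoder weights (assumption)."""
    scale = 0.1
    return {
        "W_mu": scale * rng.standard_normal((feat_dim, latent_dim)),
        "W_logvar": scale * rng.standard_normal((feat_dim, latent_dim)),
        "W_dec": scale * rng.standard_normal((latent_dim + phone_dim, feat_dim)),
    }


def vaeve_loss(params, x, phone_onehot, rng):
    """Single-sample negative ELBO for a phoneme-conditioned VAE (sketch)."""
    # Encoder: acoustic features -> diagonal Gaussian posterior over z.
    mu = x @ params["W_mu"]
    logvar = x @ params["W_logvar"]

    # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I).
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(0.5 * logvar) * eps

    # Decoder: reconstruct the features from z AND the phoneme information.
    dec_in = np.concatenate([z, phone_onehot], axis=1)
    x_hat = dec_in @ params["W_dec"]

    # Squared-error reconstruction term (assumed Gaussian likelihood).
    recon = 0.5 * np.sum((x - x_hat) ** 2, axis=1)
    # Closed-form KL divergence between N(mu, sigma^2) and N(0, I).
    kl = -0.5 * np.sum(1.0 + logvar - mu ** 2 - np.exp(logvar), axis=1)

    loss = float(np.mean(recon + kl))
    # The posterior mean stands in for the variability encoding (assumption).
    return loss, mu


# Toy usage with made-up sizes: 4 frames, 40-dim features, 10 phoneme classes.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 40))
phones = np.eye(10)[rng.integers(0, 10, size=4)]
params = init_params(rng, feat_dim=40, phone_dim=10, latent_dim=8)
loss, encoding = vaeve_loss(params, x, phones, rng)
```

In a real system the weights would be trained by minimizing this loss with a gradient-based optimizer, and the resulting encodings would be appended to the acoustic features as auxiliary inputs to the DNN acoustic model, as the abstract describes.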