Abstract
This study proposes integrating laser Doppler vibrometer (LDV) sensing with a voice conversion system based on Variational Inference with adversarial learning for Text-to-Speech (VITS) to enhance automatic speech recognition (ASR) for individuals with dysarthria in noisy environments. The proposed framework combines the noise robustness of LDV sensing with the generative modeling capabilities of VITS to transform dysarthric speech into intelligible acoustic outputs. Experimental results demonstrated significant gains in ASR accuracy compared with conventional acoustic methods, even at low signal-to-noise ratios. These findings establish a foundation for future clinical applications of augmentative and alternative communication systems.