Optimal Transport-based Adaptation In Dysarthric Speech Tasks
2021 · Rosanna Turrisi, Leonardo Badino
Abstract
In many real-world applications, the mismatch between the distributions of training data (source) and test data (target) significantly degrades the performance of machine learning algorithms. In speech data, this mismatch arises from differences in acoustic environments or speaker characteristics. In this paper, we address this issue in the challenging context of dysarthric speech, by multi-source domain/speaker adaptation (MSDA/MSSA). Specifically, we propose the use of an optimal-transport based approach, called MSDA via Weighted Joint Optimal Transport (MSDA-WJDOT). We first address the mismatch problem in dysarthria detection, where the proposed approach outperforms both the baseline and state-of-the-art MSDA models, improving detection accuracy by 0.9% over the best competing method. We then employ MSDA-WJDOT for dysarthric speaker adaptation in command speech recognition. This yields a relative Command Error Rate reduction of 16% and 7% over the baseline and the best compet
Related papers
- Interpretable Dysarthric Speaker Adaptation Based On Optimal-transport (2022)
- Unsupervised Neural Adaptation Model Based On Optimal Transport For Spoken Language Identification (2020)
- Channel Adaptation For Speaker Verification Using Optimal Transport With Pseudo Label (2024)
- Unsupervised Noise Adaptive Speech Enhancement By Discriminator-constrained Optimal Transport (2021)
- Speaker Adaptation Using Spectro-temporal Deep Features For Dysarthric And Elderly Speech Recognition (2022)
- Enhancing Dysarthric Speech Recognition For Unseen Speakers Via Prototype-based Adaptation (2024)
- Neural Domain Alignment For Spoken Language Recognition Based On Optimal Transport (2023)
- On-the-fly Feature Based Rapid Speaker Adaptation For Dysarthric And Elderly Speech Recognition (2022)