Interpretable Dysarthric Speaker Adaptation Based On Optimal-transport
2022 · Rosanna Turrisi, Leonardo Badino
Abstract
This work addresses the mismatch between the distributions of training data (source) and testing data (target) in the challenging context of dysarthric speech recognition. We focus on Speaker Adaptation (SA) in command speech recognition, where data from multiple sources (i.e., multiple speakers) are available. Specifically, we propose an unsupervised Multi-Source Domain Adaptation (MSDA) algorithm based on optimal transport, called MSDA via Weighted Joint Optimal Transport (MSDA-WJDOT). We achieve a relative Command Error Rate reduction of 16% over the speaker-independent model and 7% over the best competitor method. The strength of the proposed approach is that, unlike any other existing SA method, it offers an interpretable model that can also be exploited, in this context, to diagnose dysarthria without any specific training. Indeed, it provides a closeness measure between the target and the source speakers, reflecting their similarity in terms of spe
Related papers
- Optimal Transport-based Adaptation In Dysarthric Speech Tasks (2021) · 0.00
- Channel Adaptation For Speaker Verification Using Optimal Transport With Pseudo Label (2024) · 0.00
- Enhancing Dysarthric Speech Recognition For Unseen Speakers Via Prototype-based Adaptation (2024) · 9.45
- Unsupervised Neural Adaptation Model Based On Optimal Transport For Spoken Language Identification (2020) · 8.82
- Speaker Identity Preservation In Dysarthric Speech Reconstruction By Adversarial Speaker Adaptation (2022) · 0.00
- Unsupervised Noise Adaptive Speech Enhancement By Discriminator-constrained Optimal Transport (2021) · 0.00
- On-the-fly Feature Based Rapid Speaker Adaptation For Dysarthric And Elderly Speech Recognition (2022) · 6.34
- Speaker Adaptation Using Spectro-temporal Deep Features For Dysarthric And Elderly Speech Recognition (2022) · 12.02