Mixup-Breakdown: A Consistency Training Method For Improving Generalization Of Speech Separation Models
2019 · Max W. Y. Lam, Jun Wang, Dan Su, et al.
Abstract
Deep-learning-based speech separation models generalize poorly: even state-of-the-art models can fail abruptly when evaluated under mismatched conditions. To address this problem, we propose an easy-to-implement yet effective consistency-based semi-supervised learning (SSL) approach, namely Mixup-Breakdown training (MBT). It learns a teacher model to "break down" unlabeled inputs, and the estimated separations are interpolated to produce more useful pseudo "mixup" input-output pairs, to which consistency regularization can be applied for learning a student model. In our experiments, we evaluate MBT under various conditions with ascending degrees of mismatch, including unseen interfering speech, noise, and music, and compare MBT's generalization capability against state-of-the-art supervised learning and SSL approaches. The results indicate that MBT significantly outperforms several strong baselines, with up to 13.77% relative SI-SNRi improvement. Moreover, M
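The teacher-student step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes two sources, a uniformly drawn interpolation weight `lam`, a negative SI-SNR consistency loss, and no permutation handling; none of these are stated in the abstract. The names `si_snr`, `breakdown_mixup`, `consistency_loss`, and `toy_separator` are hypothetical.

```python
import numpy as np


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two 1-D signals."""
    estimate = estimate - estimate.mean()
    target = target - target.mean()
    # Project the estimate onto the target; the residual counts as noise.
    scale = np.dot(estimate, target) / (np.dot(target, target) + eps)
    projection = scale * target
    noise = estimate - projection
    ratio = (np.dot(projection, projection) + eps) / (np.dot(noise, noise) + eps)
    return 10.0 * np.log10(ratio)


def breakdown_mixup(teacher, mixture, rng):
    """Teacher 'breaks down' an unlabeled mixture; estimates are re-mixed.

    Assumption: the pseudo pair is built by weighting the two teacher
    estimates with lam and (1 - lam) and summing them.
    """
    s1, s2 = teacher(mixture)
    lam = rng.uniform(0.0, 1.0)
    pseudo_sources = (lam * s1, (1.0 - lam) * s2)
    pseudo_mixture = pseudo_sources[0] + pseudo_sources[1]
    return pseudo_mixture, pseudo_sources, lam


def consistency_loss(student, pseudo_mixture, pseudo_sources):
    """Negative mean SI-SNR between student outputs and pseudo targets.

    Assumption: outputs are already aligned with targets (no permutation search).
    """
    estimates = student(pseudo_mixture)
    return -np.mean([si_snr(e, t) for e, t in zip(estimates, pseudo_sources)])


def toy_separator(x):
    """Hypothetical stand-in for a separation network: even vs. odd samples."""
    even = np.zeros_like(x)
    odd = np.zeros_like(x)
    even[0::2] = x[0::2]
    odd[1::2] = x[1::2]
    return even, odd


rng = np.random.default_rng(0)
unlabeled_mixture = rng.standard_normal(1600)
pseudo_mix, pseudo_src, lam = breakdown_mixup(toy_separator, unlabeled_mixture, rng)
loss = consistency_loss(toy_separator, pseudo_mix, pseudo_src)
print(f"lam={lam:.3f}, consistency loss={loss:.2f} dB")
```

In a real training loop the student would be a trainable network updated by gradient descent on this loss (typically alongside a supervised loss on labeled data), and the teacher would be derived from the student or trained separately; those details are outside what the abstract specifies.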
Related papers
- Heterogeneous Separation Consistency Training For Adaptation Of Unsupervised Speech Separation (2022) · 5.24
- Single-channel Speech Enhancement Using Learnable Loss Mixup (2023) · 0.00
- Teacher-student MixIT For Unsupervised And Semi-supervised Speech Separation (2021) · 9.03
- MCR-Data2vec 2.0: Improving Self-supervised Speech Pre-training Via Model-level Consistency Regularization (2023) · 3.58
- Investigating Self-supervised Learning For Speech Enhancement And Separation (2022) · 13.44
- Remix-cycle-consistent Learning On Adversarially Learned Separator For Accurate And Stable Unsupervised Speech Separation (2022) · 3.58
- Unsupervised Multi-channel Separation And Adaptation (2023) · 4.52
- Multi-variant Consistency Based Self-supervised Learning For Robust Automatic Speech Recognition (2021) · 0.00