Unsupervised Multi-channel Separation And Adaptation
2023 · Cong Han, Kevin Wilson, Scott Wisdom, et al.
Abstract
A key challenge in machine learning is to generalize from training data to an application domain of interest. This work generalizes the recently proposed mixture invariant training (MixIT) algorithm to perform unsupervised learning in the multi-channel setting. We use MixIT to train a model on far-field microphone array recordings of overlapping reverberant and noisy speech from the AMI Corpus. The models are trained on both supervised and unsupervised training data, and are tested on real AMI recordings containing overlapping speech. To objectively evaluate our models, we also use a synthetic multi-channel AMI test set. Holding network architectures constant, we find that a fine-tuned semi-supervised model yields the largest improvement to SI-SNR and to human listening ratings across synthetic and real datasets, outperforming supervised models trained on well-matched synthetic data. Our results demonstrate that unsupervised learning through MixIT enables model adaptation on both singl
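The two quantities the abstract relies on, SI-SNR and the MixIT objective, can be illustrated with a minimal single-channel sketch. It is not the authors' implementation. The function names `si_snr` and `mixit_loss` are hypothetical. The use of negative SI-SNR as the training loss is an assumption made here for brevity (the original MixIT formulation uses a thresholded SNR loss), and the multi-channel extension described in the paper is not shown. The idea it demonstrates: a model separates a mixture of two mixtures into several estimated sources, and the loss is taken under the best assignment of those sources back to the two input mixtures.

```python
import itertools
import math


def si_snr(estimate, reference, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length 1-D signals."""
    n = len(reference)
    # Remove the mean so the measure ignores any DC offset.
    est_mean = sum(estimate) / n
    ref_mean = sum(reference) / n
    est = [e - est_mean for e in estimate]
    ref = [r - ref_mean for r in reference]
    # Project the estimate onto the reference to obtain the scaled target.
    scale = sum(e * r for e, r in zip(est, ref)) / (sum(r * r for r in ref) + eps)
    target = [scale * r for r in ref]
    noise = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(x * x for x in noise)
    return 10.0 * math.log10((target_energy + eps) / (noise_energy + eps))


def mixit_loss(mix1, mix2, est_sources):
    """Return (loss, assignment) for the best source-to-mixture assignment.

    Assumption: the loss is the negative sum of SI-SNRs of the two remixes.
    assignment[i] is 0 or 1, the mixture that estimated source i is added to.
    """
    best = None
    # Brute force over all binary assignments; fine for a handful of sources.
    for assignment in itertools.product((0, 1), repeat=len(est_sources)):
        remix = [[0.0] * len(mix1), [0.0] * len(mix2)]
        for which, src in zip(assignment, est_sources):
            for t, value in enumerate(src):
                remix[which][t] += value
        loss = -(si_snr(remix[0], mix1) + si_snr(remix[1], mix2))
        if best is None or loss < best[0]:
            best = (loss, assignment)
    return best
```

Because no reference sources appear in `mixit_loss`, only the two mixtures, the objective can be evaluated on real recordings for which isolated speech is unavailable. That is what makes it usable for unsupervised adaptation.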
Related papers
- Efficient Integration Of Multi-channel Information For Speaker-independent Speech Separation (2020) 0.00
- AC-Mix: Self-supervised Adaptation For Low-resource Automatic Speech Recognition Using Agnostic Contrastive Mixup (2024) 2.26
- MixCycle: Unsupervised Speech Separation Via Cyclic Mixture Permutation Invariant Training (2022) 6.34
- Teacher-student MixIT For Unsupervised And Semi-supervised Speech Separation (2021) 9.03
- Mixup-breakdown: A Consistency Training Method For Improving Generalization Of Speech Separation Models (2019) 0.00
- On Monoaural Speech Enhancement For Automatic Recognition Of Real Noisy Speech Using Mixture Invariant Training (2022) 4.52
- A Highly Adaptive Acoustic Model For Accurate Multi-dialect Speech Recognition (2022) 10.85
- Multi-channel Target Speech Extraction With Channel Decorrelation And Target Speaker Adaptation (2020) 0.00