Speaker Separation Using Speaker Inventories And Estimated Speech
2020 · Peidong Wang, Zhuo Chen, Deliang Wang, et al.
Abstract
We propose speaker separation using speaker inventories and estimated speech (SSUSIES), a framework that leverages speaker profiles and estimated speech for speaker separation. SSUSIES contains two methods: speaker separation using speaker inventories (SSUSI) and speaker separation using estimated speech (SSUES). SSUSI performs speaker separation with the help of a speaker inventory. By combining the advantages of permutation invariant training (PIT) and speech extraction, SSUSI significantly outperforms conventional approaches. SSUES is a widely applicable technique that can substantially improve speaker separation performance by using the output of a first-pass separation. We evaluate the models on both speaker separation and speech recognition metrics.
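For readers unfamiliar with permutation invariant training, the idea can be sketched as follows: the loss is computed for every assignment of estimated signals to reference signals, and the smallest value is kept, so the model is not penalized for emitting speakers in a different order. This is only a minimal illustration and not the SSUSI model itself. The function name `pit_mse`, the use of mean squared error, and the array shapes are assumptions made for the example; the abstract does not state which objective the authors use.

```python
from itertools import permutations

import numpy as np


def pit_mse(estimates, references):
    """Return (min_loss, best_perm) over all speaker orderings.

    estimates, references: arrays of shape (num_speakers, num_samples).
    best_perm[i] is the index of the reference matched to estimate i.
    Mean squared error is an illustrative choice, not the paper's objective.
    """
    estimates = np.asarray(estimates, dtype=float)
    references = np.asarray(references, dtype=float)
    num_speakers = estimates.shape[0]

    best_loss, best_perm = float("inf"), None
    # Brute-force search over all orderings; fine for two or three speakers.
    for perm in permutations(range(num_speakers)):
        loss = float(np.mean((estimates - references[list(perm)]) ** 2))
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


# Toy check: the estimates are the references in swapped order.
refs = np.array([[1.0, 0.0, 1.0, 0.0],
                 [0.0, 1.0, 0.0, 1.0]])
loss, perm = pit_mse(refs[::-1], refs)
```

In the toy check the swapped ordering is recovered with zero loss. The exhaustive search grows factorially with the number of speakers, so it is only practical for small speaker counts.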
Related papers
- Continuous Speech Separation Using Speaker Inventory For Long Multi-talker Recording (2020)
- Low-latency Speaker-independent Continuous Speech Separation (2019)
- UNSSOR: Unsupervised Neural Speech Separation By Leveraging Over-determined Training Mixtures (2023)
- New Insights On Target Speaker Extraction (2022)
- Recursive Speech Separation For Unknown Number Of Speakers (2019)
- Handling Trade-offs In Speech Separation With Sparsely-gated Mixture Of Experts (2022)
- Sepit: Approaching A Single Channel Speech Separation Bound (2022)
- Directed Speech Separation For Automatic Speech Recognition Of Long Form Conversational Speech (2021)