SepIt: Approaching A Single Channel Speech Separation Bound
2022 · Shahar Lutati, Eliya Nachmani, Lior Wolf
Abstract
We present an upper bound for the Single Channel Speech Separation task, which is based on an assumption regarding the nature of short segments of speech. Using the bound, we show that while recent methods have made significant progress for a few speakers, there is room for improvement for five and ten speakers. We then introduce a deep neural network, SepIt, that iteratively improves the estimates of the different speakers. At test time, SepIt runs a varying number of iterations per test sample, based on a mutual information criterion that arises from our analysis. In an extensive set of experiments, SepIt outperforms state-of-the-art neural networks for 2, 3, 5, and 10 speakers.
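The per-sample iteration count described in the abstract can be illustrated with a generic refine-and-stop loop. This is a minimal sketch under stated assumptions, not the SepIt implementation: the abstract does not specify the network or the mutual information criterion, so `refine_step`, `score`, `min_gain`, and the toy functions below are hypothetical stand-ins, and the stopping rule here is a plain "gain fell below a threshold" test.

```python
from typing import Callable, List, Sequence, Tuple

Estimates = List[List[float]]  # one waveform (list of samples) per speaker


def refine_until_converged(
    mixture: Sequence[float],
    initial: Estimates,
    refine_step: Callable[[Sequence[float], Estimates], Estimates],
    score: Callable[[Sequence[float], Estimates], float],
    max_iters: int = 10,
    min_gain: float = 1e-3,
) -> Tuple[Estimates, int]:
    """Refine estimates until the score gain drops below min_gain.

    Returns the best estimates and the number of accepted iterations.
    `score` is a placeholder for a stopping criterion (higher is better).
    """
    best = initial
    best_score = score(mixture, best)
    for it in range(1, max_iters + 1):
        candidate = refine_step(mixture, best)
        candidate_score = score(mixture, candidate)
        if candidate_score - best_score < min_gain:
            return best, it - 1  # last step did not help enough; stop early
        best, best_score = candidate, candidate_score
    return best, max_iters


# Toy stand-ins (assumptions for illustration only).
def toy_refine(mixture: Sequence[float], estimates: Estimates) -> Estimates:
    # Spread half of the reconstruction residual evenly over the speakers.
    n = len(estimates)
    residual = [m - sum(e[t] for e in estimates) for t, m in enumerate(mixture)]
    return [
        [e[t] + 0.5 * residual[t] / n for t in range(len(mixture))]
        for e in estimates
    ]


def toy_score(mixture: Sequence[float], estimates: Estimates) -> float:
    # Negative squared error between the mixture and the sum of estimates.
    return -sum(
        (m - sum(e[t] for e in estimates)) ** 2 for t, m in enumerate(mixture)
    )


mixture = [1.0, 2.0, 3.0]
initial = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
estimates, n_iters = refine_until_converged(
    mixture, initial, toy_refine, toy_score, max_iters=10, min_gain=1e-3
)
print(f"stopped after {n_iters} iterations")
```

Because the loop stops as soon as a step no longer improves the score enough, easy inputs use few iterations and hard ones use more, which is the behavior the abstract attributes to the test-time criterion.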
Related papers
- Single-channel Speech Separation With Auxiliary Speaker Embeddings (2019) 0.00
- Single-channel Multi-speaker Separation Using Deep Clustering (2016) 0.00
- New Insights On Target Speaker Extraction (2022) 0.00
- Efficient Integration Of Multi-channel Information For Speaker-independent Speech Separation (2020) 0.00
- Multi-channel Speech Separation Using Spatially Selective Deep Non-linear Filters (2023) 10.35
- End-to-end Multi-channel Speech Separation (2019) 0.00
- Recursive Speech Separation For Unknown Number Of Speakers (2019) 12.93
- End-to-end Networks For Supervised Single-channel Speech Separation (2018) 0.00