Efficient Integration Of Multi-channel Information For Speaker-independent Speech Separation
2020 · Yuichiro Koyama, Oluwafemi Azeez, Bhiksha Raj
Abstract
Although deep-learning-based methods have markedly improved the performance of speech separation over the past few years, it remains an open question how to integrate multi-channel signals for speech separation. We propose two methods, namely, early-fusion and late-fusion methods, to integrate multi-channel information based on the time-domain audio separation network, which has been proven effective in single-channel speech separation. We also propose channel-sequential-transfer learning, which is a transfer learning framework that applies the parameters trained for a lower-channel network as the initial values of a higher-channel network. For fair comparison, we evaluated our proposed methods using a spatialized version of the wsj0-2mix dataset, which is open-sourced. It was found that our proposed methods can outperform multi-channel deep clustering and improve the performance proportionally to the number of microphones. It was also proven that the performance of the late-fusion met
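The channel-sequential-transfer idea described in the abstract (parameters trained for a lower-channel network become the initial values of a higher-channel network) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the function name `init_higher_channel_weights`, the nested-list weight layout, and the choice to zero-initialize the kernels of the newly added microphone channel are all hypothetical.

```python
import random


def init_higher_channel_weights(lower_weights, new_channels=1, scale=0.0, seed=0):
    """Build initial weights for a network with more input channels.

    lower_weights: list of filters; each filter is a list of per-channel
        kernels; each kernel is a list of floats (hypothetical layout).
    new_channels: number of microphone channels being added.
    scale: new kernels are drawn uniformly from [-scale, scale];
        scale=0.0 gives zero initialization (an assumption, not the paper's rule).
    """
    rng = random.Random(seed)
    higher_weights = []
    for filt in lower_weights:
        kernel_len = len(filt[0])
        # Reuse the trained kernels for the channels both networks share.
        copied = [list(kernel) for kernel in filt]
        # Create fresh kernels for the added channels.
        extra = [
            [rng.uniform(-scale, scale) for _ in range(kernel_len)]
            for _ in range(new_channels)
        ]
        higher_weights.append(copied + extra)
    return higher_weights


# One filter trained on 2 channels, kernel length 2 (toy values).
two_channel = [[[0.1, 0.2], [0.3, 0.4]]]
three_channel = init_higher_channel_weights(two_channel)
```

With zero-initialized new kernels, a linear layer built from `three_channel` initially produces the same output as the two-channel layer, so training on the extra microphone starts from the lower-channel solution rather than from scratch.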
Related papers
- End-to-end Multi-channel Speech Separation (2019) 0.00
- Spatial And Spectral Deep Attention Fusion For Multi-channel Speech Separation Using Deep Embedding Features (2020) 0.00
- Multi-channel Speech Separation Using Spatially Selective Deep Non-linear Filters (2023) 10.35
- Multi-channel Narrow-band Deep Speech Separation With Full-band Permutation Invariant Training (2021) 9.41
- Single-channel Multi-speaker Separation Using Deep Clustering (2016) 0.00
- A Multi-stage Triple-path Method For Speech Separation In Noisy And Reverberant Environments (2023) 2.26
- Audio-visual Speech Separation Based On Joint Feature Representation With Cross-modal Attention (2022) 0.00
- Improved Speech Separation With Time-and-frequency Cross-domain Joint Embedding And Clustering (2019) 10.74