Beam-Guided TasNet: An Iterative Speech Separation Framework With Multi-channel Output
2021 · Hangting Chen, Yang Yi, Dang Feng, et al.
Abstract
The time-domain audio separation network (TasNet) has achieved remarkable performance in blind source separation (BSS). The classic multi-channel speech processing framework combines signal estimation and beamforming. For example, Beam-TasNet links a multi-channel convolutional TasNet (MC-Conv-TasNet) with minimum variance distortionless response (MVDR) beamforming, which leverages the strong modeling ability of the data-driven network and boosts beamforming performance through accurate estimation of speech statistics. Such an integration can be viewed as a directed acyclic graph that accepts multi-channel input and generates multi-source output. In this paper, we design a "multi-channel input, multi-channel multi-source output" (MIMMO) speech separation system entitled "Beam-Guided TasNet", where MC-Conv-TasNet and MVDR can interact and promote each other more compactly under a directed cyclic flow. Specifically, the first stage uses Beam-TasNet to generate estimated single-speaker signals, whi
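To make the "speech statistics" step concrete, here is a minimal NumPy sketch of how an MVDR beamformer can be built from spatial covariance matrices computed on network-estimated signals. It assumes the common reference-channel formulation w = (Φ_N⁻¹ Φ_S) u / tr(Φ_N⁻¹ Φ_S); the abstract does not say which MVDR variant the paper uses. The function names, array layouts, and the diagonal-loading constant are hypothetical choices for illustration only.

```python
import numpy as np


def spatial_covariance(stft):
    """Estimate per-frequency spatial covariance matrices.

    stft: complex array of shape (channels, frames, freqs).
    Returns an array of shape (freqs, channels, channels).
    """
    # Average the outer product x x^H over time frames for each frequency bin.
    return np.einsum("ctf,dtf->fcd", stft, stft.conj()) / stft.shape[1]


def mvdr_weights(phi_s, phi_n, ref=0, eps=1e-8):
    """Reference-channel MVDR weights (an assumed formulation, see lead-in).

    phi_s, phi_n: target and interference covariances, shape (freqs, C, C).
    Returns weights of shape (freqs, C).
    """
    n_ch = phi_n.shape[-1]
    # Diagonal loading keeps the linear solve numerically stable (assumed value).
    phi_n = phi_n + eps * np.eye(n_ch)
    numerator = np.linalg.solve(phi_n, phi_s)  # Phi_N^{-1} Phi_S per frequency
    trace = np.trace(numerator, axis1=-2, axis2=-1)[:, None]
    # Selecting column `ref` is the same as multiplying by the one-hot vector u.
    return numerator[..., ref] / (trace + eps)


def apply_beamformer(weights, mixture_stft):
    """Compute y[t, f] = w[f]^H x[t, f] for a (channels, frames, freqs) mixture."""
    return np.einsum("fc,ctf->tf", weights.conj(), mixture_stft)
```

In a pipeline like the one the abstract outlines, `phi_s` would be computed from the network's estimate of one speaker and `phi_n` from the remaining signal, so the quality of the beamformer depends directly on the quality of the network estimates.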
Related papers
- TasNet: Time-domain Audio Separation Network For Real-time, Single-channel Speech Separation (2017)
- Demystifying TasNet: A Dissecting Approach (2019)
- Speech Separation Based On Multi-stage Elaborated Dual-path Deep BiLSTM With Auxiliary Identity Loss (2020)
- Conv-TasNet: Surpassing Ideal Time-frequency Magnitude Masking For Speech Separation (2018)
- MIMO-DBnet: Multi-channel Input And Multiple Outputs DOA-aware Beamforming Network For Speech Separation (2022)
- Multi-scale Feature Fusion Transformer Network For End-to-end Single Channel Speech Separation (2022)
- Time Domain Audio Visual Speech Separation (2019)
- Locate And Beamform: Two-dimensional Locating All-neural Beamformer For Multi-channel Speech Separation (2023)