A Multiscale Autoencoder (MSAE) Framework For End-to-end Neural Network Speech Enhancement
2023 · Bengt J. Borgstrom, Michael S. Brandstein
Abstract
Neural network approaches to single-channel speech enhancement have received much recent attention. In particular, mask-based architectures have achieved significant performance improvements over conventional methods. This paper proposes a multiscale autoencoder (MSAE) for mask-based end-to-end neural network speech enhancement. The MSAE performs spectral decomposition of an input waveform within separate band-limited branches, each operating with a different rate and scale, to extract a sequence of multiscale embeddings. The proposed framework features intuitive parameterization of the autoencoder, including a flexible spectral band design based on the Constant-Q transform. Additionally, the MSAE is constructed entirely of differentiable operators, allowing it to be implemented within an end-to-end neural network, and be discriminatively trained. The MSAE draws motivation both from recent multiscale network topologies and from traditional multiresolution transforms in speech processin
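The abstract mentions a spectral band design based on the Constant-Q transform, but this excerpt gives no details of it. As a rough illustration of what constant-Q spacing means in general, the sketch below places band edges geometrically so that every band has the same ratio of center frequency to bandwidth (the Q factor). The function names `constant_q_band_edges` and `band_q_factors`, and the parameters `f_min`, `f_max`, and `bands_per_octave`, are hypothetical and are not taken from the paper; the MSAE's actual band layout may differ.

```python
import math


def constant_q_band_edges(f_min, f_max, bands_per_octave):
    """Return geometrically spaced band edges (Hz) from f_min up to f_max.

    Illustrative only: this is generic constant-Q spacing, not the
    band design used in the MSAE paper.
    """
    if not (0 < f_min < f_max) or bands_per_octave < 1:
        raise ValueError("need 0 < f_min < f_max and bands_per_octave >= 1")
    # Adjacent edges differ by a fixed ratio, e.g. 2.0 for one band per octave.
    ratio = 2.0 ** (1.0 / bands_per_octave)
    edges = []
    k = 0
    edge = f_min
    # Small tolerance avoids a sliver band caused by floating-point rounding.
    while edge < f_max * (1.0 - 1e-9):
        edges.append(edge)
        k += 1
        edge = f_min * ratio ** k
    # The last band is clipped at f_max if the range is not a whole number of bands.
    edges.append(f_max)
    return edges


def band_q_factors(edges):
    """Q factor of each band: geometric center frequency over bandwidth."""
    return [math.sqrt(lo * hi) / (hi - lo) for lo, hi in zip(edges, edges[1:])]


if __name__ == "__main__":
    edges = constant_q_band_edges(125.0, 8000.0, bands_per_octave=1)
    for (lo, hi), q in zip(zip(edges, edges[1:]), band_q_factors(edges)):
        print(f"{lo:8.1f} Hz to {hi:8.1f} Hz   Q = {q:.3f}")
```

With this spacing, low-frequency bands are narrow and high-frequency bands are wide, which is one way a set of band-limited branches could end up operating at different rates and scales.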
Related papers
- Masked Autoencoders With Multi-window Local-global Attention Are Better Audio Learners (2023)
- Mp-senet: A Speech Enhancement Model With Parallel Denoising Of Magnitude And Phase Spectra (2023)
- Leveraging Joint Spectral And Spatial Learning With MAMBA For Multichannel Speech Enhancement (2024)
- Semi-supervised Multichannel Speech Enhancement With Variational Autoencoders And Non-negative Matrix Factorization (2018)
- Parallel Gated Neural Network With Attention Mechanism For Speech Enhancement (2022)
- Noise Classification Aided Attention-based Neural Network For Monaural Speech Enhancement (2021)
- Ednet: A Versatile Speech Enhancement Framework With Gating Mamba Mechanism And Phase Shift-invariant Training (2025)
- Incorporating Multi-target In Multi-stage Speech Enhancement Model For Better Generalization (2021)