Deep Neural Network Techniques For Monaural Speech Enhancement: State Of The Art Analysis
2022 · Peter Ochieng
Abstract
Deep neural network (DNN) techniques have become pervasive in domains such as natural language processing and computer vision, where they have achieved great success in tasks such as machine translation and image generation. Due to this success, these data-driven techniques have been applied in the audio domain. More specifically, DNN models have been applied to speech enhancement to achieve denoising, dereverberation, and multi-speaker separation in the monaural setting. In this paper, we review some dominant DNN techniques being employed to achieve speech separation. The review looks at the whole pipeline of speech enhancement: feature extraction, how DNN-based tools model both global and local features of speech, and model training (supervised and unsupervised). We also review the use of pre-trained speech-enhancement models to boost the speech enhancement process. The review is geared towards covering the dominant trends with regards to DNN application in
Related papers
- Multi-modal Hybrid Deep Neural Network For Speech Enhancement (2016) 0.00
- On The Role Of Spatial, Spectral, And Temporal Processing For DNN-based Non-linear Multi-channel Speech Enhancement (2022) 7.81
- Convolutive Prediction For Monaural Speech Dereverberation And Noisy-reverberant Speaker Separation (2021) 11.39
- Rethinking Complex-valued Deep Neural Networks For Monaural Speech Enhancement (2023) 6.77
- Insights Into Deep Non-linear Filters For Improved Multi-channel Speech Enhancement (2022) 13.93
- Consistency-aware Multi-channel Speech Enhancement Using Deep Neural Networks (2020) 0.00
- How To Leverage DNN-based Speech Enhancement For Multi-channel Speaker Verification? (2022) 0.00
- An Overview Of Deep-learning-based Audio-visual Speech Enhancement And Separation (2020) 18.31