Speech Enhancement
50 papers tagged Speech Enhancement (ordered by heat_score)
Papers
- HuBERT: Self-supervised Speech Representation Learning By Masked Prediction Of Hidden Units (2021) | Wei-Ning Hsu, Benjamin Bolte, Yao-Hung Hubert Tsai, et al. | 25.30
- Speech Enhancement And Dereverberation With Diffusion-based Generative Models (2022) | Julius Richter, Simon Welker, Jean-Marie Lemercier, et al. | 23.51
- FRCRN: Boosting Feature Representation Using Frequency Recurrence For Monaural Speech Enhancement (2022) | Shengkui Zhao, Bin Ma, Karn N. Watcharasupat, et al. | 22.16
- SEGAN: Speech Enhancement Generative Adversarial Network (2017) | Santiago Pascual, Antonio Bonafonte, Joan Serrà | 21.85
- DCCRN: Deep Complex Convolution Recurrent Network For Phase-aware Speech Enhancement (2020) | Yanxin Hu, Yun Liu, Shubo Lv, et al. | 20.78
- VQMIVC: Vector Quantization And Mutual Information-based Unsupervised Speech Representation Disentanglement For One-shot Voice Conversion (2021) | Disong Wang, Liqun Deng, Yu Ting Yeung, et al. | 20.31
- Supervised And Unsupervised Speech Enhancement Using Nonnegative Matrix Factorization (2017) | Nasser Mohammadiha, Paris Smaragdis, Arne Leijon | 18.80
- ESPnet-SE++: Speech Enhancement For Robust Speech Recognition, Translation, And Understanding (2022) | Yen-Ju Lu, Xuankai Chang, Chenda Li, et al. | 18.72
- An Overview Of Deep-learning-based Audio-visual Speech Enhancement And Separation (2020) | Daniel Michelsanti, Zheng-Hua Tan, Shi-Xiong Zhang, et al. | 18.31
- The 2020 ESPnet Update: New Features, Broadened Applications, Performance Improvements, And Future Plans (2020) | Shinji Watanabe, Florian Boyer, Xuankai Chang, et al. | 18.20
- PHASEN: A Phase-and-harmonics-aware Speech Enhancement Network (2019) | Dacheng Yin, Chong Luo, Zhiwei Xiong, et al. | 18.20
- End-to-end Waveform Utterance Enhancement For Direct Evaluation Metrics Optimization By Fully Convolutional Neural Networks (2017) | Szu-Wei Fu, Tao-Wei Wang, Yu Tsao, et al. | 18.00
- Audio-visual Speech Enhancement Using Multimodal Deep Convolutional Neural Networks (2017) | Jen-Cheng Hou, Syu-Siang Wang, Ying-Hui Lai, et al. | 17.39
- An Unsupervised Autoregressive Model For Speech Representation Learning (2019) | Yu-An Chung, Wei-Ning Hsu, Hao Tang, et al. | 17.26
- Mockingjay: Unsupervised Speech Representation Learning With Deep Bidirectional Transformer Encoders (2019) | Andy T. Liu, Shu-Wen Yang, Po-Han Chi, et al. | 17.26
- FullSubNet: A Full-band And Sub-band Fusion Model For Real-time Single-channel Speech Enhancement (2020) | Xiang Hao, Xiangdong Su, Radu Horaud, et al. | 17.09
- Layer-wise Analysis Of A Self-supervised Speech Representation Model (2021) | Ankita Pasad, Ju-Chieh Chou, Karen Livescu | 17.07
- TSTNN: Two-stage Transformer Based Neural Network For Speech Enhancement In The Time Domain (2021) | Kai Wang, Bengbeng He, Wei-Ping Zhu | 16.73
- Raw Waveform-based Speech Enhancement By Fully Convolutional Networks (2017) | Szu-Wei Fu, Yu Tsao, Xugang Lu, et al. | 16.63
- Dense CNN With Self-attention For Time-domain Speech Enhancement (2020) | Ashutosh Pandey, DeLiang Wang | 16.59
- Speech Emotion Recognition With Global-aware Fusion On Multi-scale Feature Representation (2022) | Wenjing Zhu, Xiang Li | 16.53
- Speech Resynthesis From Discrete Disentangled Self-supervised Representations (2021) | Adam Polyak, Yossi Adi, Jade Copet, et al. | 16.25
- Speech Emotion Recognition With Co-attention Based Multi-level Acoustic Information (2022) | Heqing Zou, Yuke Si, Chen Chen, et al. | 16.17
- Exploring Speech Enhancement With Generative Adversarial Networks For Robust Speech Recognition (2017) | Chris Donahue, Bo Li, Rohit Prabhavalkar | 16.14
- Multichannel Long-term Streaming Neural Speech Enhancement For Static And Moving Speakers (2024) | Changsheng Quan, Xiaofei Li | 16.05
- Conditional Generative Adversarial Networks For Speech Enhancement And Noise-robust Speaker Verification (2017) | Daniel Michelsanti, Zheng-Hua Tan | 16.03
- T-GSA: Transformer With Gaussian-weighted Self-attention For Speech Enhancement (2019) | Jaeyoung Kim, Mostafa El-Khamy, Jungwon Lee | 15.95
- Speech Emotion Recognition With Dual-sequence LSTM Architecture (2019) | Jianyou Wang, Michael Xue, Ryan Culhane, et al. | 15.78
- Audio-visual Speech Codecs: Rethinking Audio-visual Speech Enhancement By Re-synthesis (2022) | Karren Yang, Dejan Markovic, Steven Krenn, et al. | 15.58
- Complex Spectrogram Enhancement By Convolutional Neural Network With Multi-metrics Learning (2017) | Szu-Wei Fu, Ting-Yao Hu, Yu Tsao, et al. | 15.57
- Learning Problem-agnostic Speech Representations From Multiple Self-supervised Tasks (2019) | Santiago Pascual, Mirco Ravanelli, Joan Serrà, et al. | 15.54
- MP-SENet: A Speech Enhancement Model With Parallel Denoising Of Magnitude And Phase Spectra (2023) | Ye-Xin Lu, Yang Ai, Zhen-Hua Ling | 15.51
- LightHuBERT: Lightweight And Configurable Speech Representation Learning With Once-for-all Hidden-unit BERT (2022) | Rui Wang, Qibing Bai, Junyi Ao, et al. | 15.51
- StoRM: A Diffusion-based Stochastic Regeneration Model For Speech Enhancement And Dereverberation (2022) | Jean-Marie Lemercier, Julius Richter, Simon Welker, et al. | 15.43
- VERSA: A Versatile Evaluation Toolkit For Speech, Audio, And Music (2024) | Jiatong Shi, Hye-Jin Shim, Jinchuan Tian, et al. | 15.28
- Large-scale Self-supervised Speech Representation Learning For Automatic Speaker Verification (2021) | Zhengyang Chen, Sanyuan Chen, Yu Wu, et al. | 15.25
- CMGAN: Conformer-based Metric GAN For Speech Enhancement (2022) | Ruizhe Cao, Sherif Abdulatif, Bin Yang | 15.13
- FullSubNet+: Channel Attention FullSubNet With Complex Spectrograms For Speech Enhancement (2022) | Jun Chen, Zilin Wang, Deyi Tuo, et al. | 15.10
- DistilHuBERT: Speech Representation Learning By Layer-wise Distillation Of Hidden-unit BERT (2021) | Heng-Jui Chang, Shu-Wen Yang, Hung-Yi Lee | 15.06
- Statistical Speech Enhancement Based On Probabilistic Integration Of Variational Autoencoder And Non-negative Matrix Factorization (2017) | Yoshiaki Bando, Masato Mimura, Katsutoshi Itoyama, et al. | 15.00
- Dual-branch Attention-in-attention Transformer For Single-channel Speech Enhancement (2021) | Guochen Yu, Andong Li, Chengshi Zheng, et al. | 14.83
- CMGAN: Conformer-based Metric-GAN For Monaural Speech Enhancement (2022) | Sherif Abdulatif, Ruizhe Cao, Bin Yang | 14.80
- Time Domain Audio Visual Speech Separation (2019) | Jian Wu, Yong Xu, Shi-Xiong Zhang, et al. | 14.62
- Gated Recurrent Fusion With Joint Training Framework For Robust End-to-end Speech Recognition (2020) | Cunhang Fan, Jiangyan Yi, Jianhua Tao, et al. | 14.55
- Weighted Speech Distortion Losses For Neural-network-based Real-time Speech Enhancement (2020) | Yangyang Xia, Sebastian Braun, Chandan K. A. Reddy, et al. | 14.51
- Contextual Audio-visual Switching For Speech Enhancement In Real-world Environments (2018) | Ahsan Adeel, Mandar Gogate, Amir Hussain | 14.35
- DPCRN: Dual-path Convolution Recurrent Network For Single Channel Speech Enhancement (2021) | Xiaohuai Le, Hongsheng Chen, Kai Chen, et al. | 14.35
- Speech Enhancement Using Multi-stage Self-attentive Temporal Convolutional Networks (2021) | Ju Lin, Adriaan J. van Wijngaarden, Kuang-Ching Wang, et al. | 14.15
- The PartialSpoof Database And Countermeasures For The Detection Of Short Fake Speech Segments Embedded In An Utterance (2022) | Lin Zhang, Xin Wang, Erica Cooper, et al. | 14.06
- WaveCRN: An Efficient Convolutional Recurrent Neural Network For End-to-end Speech Enhancement (2020) | Tsun-An Hsieh, Hsin-Min Wang, Xugang Lu, et al. | 14.02