Rec-RIR: Monaural Blind Room Impulse Response Identification Via DNN-based Reverberant Speech Reconstruction In STFT Domain
2025 · Pengyu Wang, Xiaofei Li
Abstract
This paper presents Rec-RIR for monaural blind room impulse response (RIR) identification. Rec-RIR is built on the convolutive transfer function (CTF) approximation, which models the reverberation effect within narrow-band filter banks in the short-time Fourier transform (STFT) domain. Specifically, we propose a deep neural network (DNN) with cross-band and narrow-band blocks to estimate the CTF filter. The DNN is trained by reconstructing the noise-free reverberant speech spectra, an objective that enables stable and straightforward supervised training. Subsequently, a pseudo intrusive measurement process converts the CTF filter estimate into an RIR by simulating a common intrusive RIR measurement procedure. Experimental results demonstrate that Rec-RIR achieves state-of-the-art performance in both RIR identification and acoustic parameter estimation. Open-source code is available online at https://github.com/Audio-WestlakeU/Rec-RIR.
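As a rough illustration of the CTF approximation mentioned in the abstract, the sketch below filters each STFT frequency band independently along the frame axis, i.e. X[f, t] = sum over l of H[f, l] * S[f, t - l]. This is a minimal toy under stated assumptions, not the authors' implementation: the function names `ctf_convolve` and `ctf_reverberate`, the list-of-lists layout, and the example values are all hypothetical, and no DNN or RIR conversion step is shown.

```python
def ctf_convolve(clean_band, ctf_band):
    """Filter one frequency band of an STFT along the frame axis.

    clean_band: list of complex STFT coefficients S[f, t] for a fixed band f
    ctf_band:   list of complex CTF taps H[f, l] for the same band (assumed layout)
    Returns X[f, t] = sum_l H[f, l] * S[f, t - l], truncated to len(clean_band).
    """
    num_frames = len(clean_band)
    out = [0j] * num_frames
    for t in range(num_frames):
        for l, tap in enumerate(ctf_band):
            if t - l >= 0:  # frames before the start are treated as zero
                out[t] += tap * clean_band[t - l]
    return out


def ctf_reverberate(clean_stft, ctf_filter):
    """Apply the band-wise convolution independently in every frequency band."""
    return [ctf_convolve(s, h) for s, h in zip(clean_stft, ctf_filter)]


# Toy example (made-up values): 2 frequency bands, 4 frames, 2 CTF taps per band.
clean = [[1 + 0j, 2 + 0j, 0j, 0j],
         [0j, 1j, 0j, 0j]]
ctf = [[1 + 0j, 0.5 + 0j],   # band 0: direct path plus one delayed tap
       [1 + 0j, 0j]]         # band 1: identity filter
reverberant = ctf_reverberate(clean, ctf)
```

In this picture, training by reverberant speech reconstruction would amount to comparing such a filtered spectrum against the observed noise-free reverberant spectrum; how the paper actually forms that loss is not specified in the abstract.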