Direction Of Arrival Estimation Of Noisy Speech Using Convolutional Recurrent Neural Networks With Higher-order Ambisonics Signals
2021 · Nils Poschadel, Robert Hupke, Stephan Preihs, et al.
Abstract
Training convolutional recurrent neural networks on first-order Ambisonics signals is a well-known approach to estimating the direction of arrival of speech and other sound signals. In this work, we investigate whether increasing the Ambisonics order up to the fourth order further improves the estimation performance of convolutional recurrent neural networks. While our results on data based on simulated spatial room impulse responses show that higher Ambisonics orders have the potential to provide better localization results, no further improvement was observed from order two onwards on data based on real spatial room impulse responses. Rather, it seems to be crucial to extract meaningful features from the raw data. First-order features derived from the acoustic intensity vector were superior to pure higher-order magnitude and phase features in almost all scenarios.
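As a rough illustration of the two quantities the abstract refers to, the sketch below computes the number of Ambisonics channels for a given order, (N + 1)^2, and a set of first-order intensity-vector features from the STFTs of the four first-order channels. The function names, the channel labels W, X, Y, Z, the plane-wave encoding used in the comments, and the energy normalization are all assumptions made for this example; the abstract does not specify the exact feature definition used in the paper.

```python
import numpy as np


def ambisonics_channel_count(order: int) -> int:
    """Number of spherical-harmonic channels for a full-sphere Ambisonics order."""
    return (order + 1) ** 2


def foa_intensity_features(W, X, Y, Z, eps=1e-12):
    """Intensity-vector features from first-order Ambisonics STFTs.

    W, X, Y, Z: complex arrays of shape (frames, bins).
    Returns a real array of shape (6, frames, bins): three active
    (real-part) and three reactive (imaginary-part) components.
    Assumption: a plane wave from unit direction u is encoded as
    X, Y, Z = u_x * W, u_y * W, u_z * W, so the active part points
    toward the source under this convention.
    """
    xyz = np.stack([X, Y, Z], axis=0)             # (3, frames, bins)
    cross = np.conj(W)[None, ...] * xyz           # W* times each dipole channel
    active = np.real(cross)
    reactive = np.imag(cross)
    # Assumed normalization by local energy; other choices are possible.
    energy = np.abs(W) ** 2 + np.sum(np.abs(xyz) ** 2, axis=0) / 3.0
    return np.concatenate([active, reactive], axis=0) / (energy + eps)
```

With this definition, a fourth-order signal has 25 channels, whereas the intensity features always use only the four first-order channels and yield six feature maps per time-frequency bin.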
Related papers
- Dilated U-net Based Approach For Multichannel Speech Enhancement From First-order Ambisonics Recordings (2020)
- Blind Estimation Of Sub-band Acoustic Parameters From Ambisonics Recordings Using Spectro-spatial Covariance Features (2024)
- Multi-speaker DOA Estimation Using Deep Convolutional Networks Trained With Noise Signals (2018)
- Saladnet: Self-attentive Multisource Localization In The Ambisonics Domain (2021)
- Convolutive Prediction For Monaural Speech Dereverberation And Noisy-reverberant Speaker Separation (2021)
- Efficient Multi-channel Speech Enhancement With Spherical Harmonics Injection For Directional Encoding (2023)
- Audio Inputs For Active Speaker Detection And Localization Via Microphone Array (2023)
- Multi-speaker Localization Using Convolutional Neural Network Trained With Noise (2017)