Deep Learning Based Stage-wise Two-dimensional Speaker Localization With Large Ad-hoc Microphone Arrays
2022 · Shupei Liu, Linfeng Feng, Yijun Gong, et al.
Abstract
While deep-learning-based speaker localization has shown advantages in challenging acoustic environments, it often yields only direction-of-arrival (DOA) cues rather than precise two-dimensional (2D) coordinates. To address this, we propose a novel deep-learning-based 2D speaker localization method leveraging ad-hoc microphone arrays, where an ad-hoc microphone array is composed of randomly distributed microphone nodes, each of which is equipped with a traditional array. Specifically, we first employ convolutional neural networks at each node to estimate speaker directions. Then, we integrate these DOA estimates using triangulation and clustering techniques to get 2D speaker locations. To further boost the estimation accuracy, we introduce a node selection algorithm that strategically filters the most reliable nodes. Extensive experiments on both simulated and real-world data demonstrate that our approach significantly outperforms conventional methods. The proposed node selection furth
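The triangulation step described in the abstract can be illustrated with a minimal sketch. It assumes each node reports a single DOA as an angle in radians in a shared global frame and that node positions are known. The function names `intersect_bearings` and `localize` are hypothetical, and the coordinate-wise median is only a simple stand-in for the clustering step, not the authors' algorithm. The CNN-based DOA estimation and the node selection are not modeled here.

```python
import itertools
import numpy as np

def intersect_bearings(p1, theta1, p2, theta2, eps=1e-9):
    """Intersect two bearing rays (hypothetical helper, not from the paper).

    Each ray starts at a node position and points along its DOA angle.
    Returns the 2D intersection, or None if the rays are (near) parallel
    or if the intersection lies behind either node.
    """
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    d1 = np.array([np.cos(theta1), np.sin(theta1)])
    d2 = np.array([np.cos(theta2), np.sin(theta2)])
    # Solve p1 + t1 * d1 = p2 + t2 * d2 for (t1, t2).
    A = np.column_stack((d1, -d2))
    if abs(np.linalg.det(A)) < eps:
        return None
    t1, t2 = np.linalg.solve(A, p2 - p1)
    if t1 <= 0 or t2 <= 0:
        return None
    return p1 + t1 * d1

def localize(node_positions, doas):
    """Fuse per-node DOAs into one 2D estimate.

    Triangulates every pair of nodes, then takes the coordinate-wise
    median of the candidate points (an assumed stand-in for clustering).
    Returns None if no pair yields a valid intersection.
    """
    candidates = []
    for i, j in itertools.combinations(range(len(node_positions)), 2):
        point = intersect_bearings(node_positions[i], doas[i],
                                   node_positions[j], doas[j])
        if point is not None:
            candidates.append(point)
    if not candidates:
        return None
    return np.median(np.array(candidates), axis=0)
```

With noise-free DOAs every pairwise intersection coincides with the speaker position. With noisy DOAs the candidates scatter, which is where a clustering step and the selection of reliable nodes would matter.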
Related papers
- Deep Ad-hoc Beamforming Based On Speaker Extraction For Target-dependent Speech Separation (2020) 7.50
- Deep Ad-hoc Beamforming (2018) 9.59
- Audio Inputs For Active Speaker Detection And Localization Via Microphone Array (2023) 0.00
- Multi-speaker DOA Estimation Using Deep Convolutional Networks Trained With Noise Signals (2018) 18.46
- Neural Directed Speech Enhancement With Dual Microphone Array In High Noise Scenario (2024) 0.00
- Deep Learning Based Multi-source Localization With Source Splitting And Its Effectiveness In Multi-talker Speech Recognition (2021) 14.23
- Multi-geometry Spatial Acoustic Modeling For Distant Speech Recognition (2019) 6.34
- Leveraging Visual Supervision For Array-based Active Speaker Detection And Localization (2023) 6.77