Abstract

Far-field speech processing is an important and challenging problem. In this paper, we propose deep ad-hoc beamforming, a deep-learning-based multichannel speech enhancement framework built on ad-hoc microphone arrays, to address the problem. It contains three novel components. First, it combines ad-hoc microphone arrays with deep-learning-based multichannel speech enhancement, which significantly reduces the probability that far-field acoustic conditions occur. Second, it groups the microphones around the speech source into a local microphone array using a supervised channel selection framework based on deep neural networks. Third, it develops a simple time synchronization framework to synchronize channels that have different time delays. Beyond these novelties and advantages, the proposed model is trained in a single-channel fashion, so it can easily adopt new developments in speech processing techniques. Its test stage is also flexible
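
The second and third components (selecting the channels near the source, then synchronizing them) can be illustrated with a minimal sketch. It is not the paper's implementation. The per-channel `quality` scores are assumed to be given, standing in for the output of the DNN-based channel selector, and plain cross-correlation is swapped in for the paper's time synchronization framework, whose details the abstract does not state. All function names here are hypothetical.

```python
def estimate_delay(reference, signal, max_lag):
    """Return the lag (in samples) that maximizes the cross-correlation.

    A positive lag means `signal` arrives later than `reference`.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, ref_sample in enumerate(reference):
            j = i + lag
            if 0 <= j < len(signal):
                score += ref_sample * signal[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def shift(signal, lag):
    """Advance `signal` by `lag` samples, zero-padding the exposed edge."""
    n = len(signal)
    return [signal[i + lag] if 0 <= i + lag < n else 0.0 for i in range(n)]


def select_and_align(channels, quality, k, max_lag):
    """Keep the k highest-scoring channels and align them to the best one.

    `quality` is an assumed per-channel score (higher is better); in the
    paper this role is played by a supervised DNN, which is not modeled here.
    """
    order = sorted(range(len(channels)), key=lambda c: quality[c], reverse=True)[:k]
    reference = channels[order[0]]
    aligned = [
        shift(channels[c], estimate_delay(reference, channels[c], max_lag))
        for c in order
    ]
    return order, aligned


# Toy example: channel 1 is channel 0 delayed by two samples.
pulse = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
delayed = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0]
far = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1, 0.1, -0.1, 0.1, -0.1]

picked, aligned = select_and_align(
    [pulse, delayed, far], quality=[0.9, 0.7, 0.1], k=2, max_lag=4
)
```

In this toy case the low-scoring channel is dropped and the delayed channel is shifted back onto the reference, so the selected channels could then be passed to any multichannel enhancement stage.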

Authors

(none)

Tags

  • Speech Enhancement

Stats

  • citations: 18
  • S2 citations: n/a
  • github stars: 0
  • HF likes: 0
  • heat score: 9.59
  • arxiv key: zhang2018deep

Related papers