Deep Learning For Distant Speech Recognition
2017 · Mirco Ravanelli
Abstract
Deep learning is an emerging technology that is considered one of the most promising directions for reaching higher levels of artificial intelligence. Among other achievements, building computers that understand speech represents a crucial leap towards intelligent machines. Despite the great efforts of the past decades, however, natural and robust human-machine speech interaction still appears to be out of reach, especially when users interact with a distant microphone in noisy and reverberant environments. Noise and reverberation severely hamper the intelligibility of a speech signal, making Distant Speech Recognition (DSR) one of the major open challenges in the field. This thesis addresses this scenario and proposes novel techniques, architectures, and algorithms to improve the robustness of distant-talking acoustic models. We first elaborate on methodologies for realistic data contamination, with a particular emphasis on DNN training with simulated data. We then
Related papers
- A Network Of Deep Neural Networks For Distant Speech Recognition (2017)
- Contaminated Speech Training Methods For Robust DNN-HMM Distant Speech Recognition (2017)
- Automatic Speech Recognition Using Advanced Deep Learning Approaches: A Survey (2024)
- Ensemble Of Jointly Trained Deep Neural Network-based Acoustic Models For Reverberant Speech Recognition (2016)
- STC Speaker Recognition Systems For The Voices From A Distance Challenge (2019)
- A Study Of Enhancement, Augmentation, And Autoencoder Methods For Domain Adaptation In Distant Speech Recognition (2018)
- Frequency Domain Multi-channel Acoustic Modeling For Distant Speech Recognition (2019)
- Distributed Training Of Deep Neural Network Acoustic Models For Automatic Speech Recognition (2020)