A Comparison Of Adaptation Techniques And Recurrent Neural Network Architectures
2018 · Jan Vanek, Josef Michalek, Jan Zelinka, et al.
Abstract
Recently, recurrent neural networks (RNNs) have become the state of the art in acoustic modeling for automatic speech recognition. Long short-term memory (LSTM) units are the most popular choice. However, alternative units such as the gated recurrent unit (GRU) and its modifications have outperformed LSTM in some publications. In this paper, we compare five neural network (NN) architectures with various adaptation and feature normalization techniques. We evaluate feature-space maximum likelihood linear regression (fMLLR), five variants of i-vector adaptation, and two variants of cepstral mean normalization. Most adaptation and normalization techniques were developed for feed-forward NNs and, according to the results in this paper, not all of them also work with RNNs. For the experiments, we chose the well-known and available TIMIT phone recognition task. Phone recognition is much more sensitive to the quality of the acoustic model (AM) than a large-vocabulary task with a complex language model. Also, we published the open-so
Related papers
- Cumulative Adaptation For BLSTM Acoustic Models (2019) 0.00
- Multilingual Adaptation Of RNN Based ASR Systems (2017) 7.50
- Memory Visualization For Gated Recurrent Neural Networks In Speech Recognition (2016) 11.76
- Light Gated Recurrent Units For Speech Recognition (2018) 18.90
- High Order Recurrent Neural Networks For Acoustic Modelling (2018) 8.60
- Long Short-term Memory Based Convolutional Recurrent Neural Networks For Large Vocabulary Speech Recognition (2016) 6.77
- Improving Speech Recognition By Revising Gated Recurrent Units (2017) 11.19
- Empirical Evaluation Of Speaker Adaptation On DNN Based Acoustic Model (2018) 5.24