Exponential Moving Average Model In Parallel Speech Recognition Training
2017 · Xu Tian, Jun Zhang, Zejun Ma, et al.
Abstract
With the rapid growth of training data, large-scale parallel training on multi-GPU clusters is now widely used for neural network model learning. We present a new approach that applies the exponential moving average method to large-scale parallel training of neural network models. It is a non-interference strategy: the exponential moving average model is not broadcast to distributed workers to update their local models after model synchronization during training, and it is used as the final model of the training system. Fully-connected feed-forward neural networks (DNNs) and deep unidirectional long short-term memory (LSTM) recurrent neural networks (RNNs) are successfully trained with the proposed method for large vocabulary continuous speech recognition on Shenma voice search data in Mandarin. The character error rate (CER) of Mandarin speech recognition is further reduced compared with state-of-the-art parallel training approaches.
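The non-interference idea can be illustrated with a toy sketch. This is not the authors' implementation: the synchronization rule (plain model averaging), the quadratic toy loss, the decay value 0.9, and all function names are assumptions chosen only to show where the exponential moving average (EMA) sits in the loop. The point to notice is that workers always restart from the synchronized model, while the EMA model is updated on the side and only returned at the end.

```python
import random


def local_sgd_step(weights, target, lr, rng):
    """One noisy gradient step on a toy loss 0.5 * (w - target)^2 (assumed objective)."""
    return [w - lr * ((w - t) + rng.gauss(0.0, 0.1)) for w, t in zip(weights, target)]


def average_models(models):
    """Synchronize workers by plain model averaging (an assumption for this sketch)."""
    n = len(models)
    return [sum(ws) / n for ws in zip(*models)]


def ema_update(ema, synced, decay):
    """EMA rule: ema <- decay * ema + (1 - decay) * synced."""
    return [decay * e + (1.0 - decay) * s for e, s in zip(ema, synced)]


def train(num_workers=4, sync_rounds=50, local_steps=5, lr=0.1, decay=0.9, seed=0):
    rng = random.Random(seed)
    target = [1.0, -2.0]          # optimum of the toy loss
    global_model = [0.0, 0.0]     # model that is broadcast to workers
    ema_model = list(global_model)  # kept aside, never broadcast

    for _ in range(sync_rounds):
        local_models = []
        for _ in range(num_workers):
            # Each worker starts from the synchronized model, not from the EMA model.
            w = list(global_model)
            for _ in range(local_steps):
                w = local_sgd_step(w, target, lr, rng)
            local_models.append(w)

        # Model synchronization across workers.
        global_model = average_models(local_models)
        # Update the EMA after synchronization; it does not feed back into training.
        ema_model = ema_update(ema_model, global_model, decay)

    # The EMA model is what would be kept as the final model.
    return global_model, ema_model


if __name__ == "__main__":
    synced, ema = train()
    print("synchronized model:", synced)
    print("EMA model:", ema)
```

Because the EMA is computed from the synchronized model but never written back to it, removing the `ema_update` line leaves the training trajectory unchanged, which is what makes the strategy non-interfering in this sketch.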