Knowledge Distillation For Small-footprint Highway Networks
2016 · Liang Lu, Michelle Guo, Steve Renals
Abstract
Deep learning has significantly advanced the state of the art in speech recognition in the past few years. However, compared to conventional Gaussian mixture acoustic models, neural network models are usually much larger, and are therefore harder to deploy on embedded devices. Previously, we investigated a compact highway deep neural network (HDNN) for acoustic modelling, which is a type of depth-gated feedforward neural network. We have shown that HDNN-based acoustic models can achieve comparable recognition accuracy with a much smaller number of model parameters than plain deep neural network (DNN) acoustic models. In this paper, we push the boundary further by leveraging the knowledge distillation technique, also known as teacher-student training, i.e., we train the compact HDNN model with the supervision of a high-accuracy cumbersome model. Furthermore, we also investigate sequence training and adaptation in the context of teacher-student training. Our experim
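The teacher-student idea named in the abstract can be illustrated with a minimal sketch. It assumes a frame-level cross-entropy between the teacher's and the student's softmax outputs, with an optional softmax temperature; the abstract does not state the exact loss, so this choice, the function names `softmax` and `distillation_loss`, and the toy logits are all illustrative assumptions rather than the paper's method.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over one frame's logits.

    The temperature parameter is an assumption of this sketch; the
    abstract does not say whether temperature scaling is used.
    """
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=1.0):
    """Average cross-entropy between teacher and student posteriors.

    Both arguments are lists of frames; each frame is a list of
    per-class logits. The teacher posteriors act as soft targets.
    """
    total = 0.0
    for s_frame, t_frame in zip(student_logits, teacher_logits):
        p_teacher = softmax(t_frame, temperature)
        p_student = softmax(s_frame, temperature)
        total -= sum(p * math.log(q) for p, q in zip(p_teacher, p_student))
    return total / len(student_logits)


# Toy logits (hypothetical values): two frames, three output classes.
teacher_logits = [[2.0, 0.5, -1.0], [0.1, 1.5, 0.3]]
student_logits = [[1.0, 1.0, 0.0], [0.0, 0.5, 0.5]]

loss_mismatch = distillation_loss(student_logits, teacher_logits)
loss_match = distillation_loss(teacher_logits, teacher_logits)
print(loss_mismatch, loss_match)
```

The loss is smallest when the student reproduces the teacher's posteriors exactly, which is why `loss_match` is lower than `loss_mismatch` here; in training, the student's parameters would be updated to reduce this quantity.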