Developing Far-field Speaker System Via Teacher-student Learning
2018 · Jinyu Li, Rui Zhao, Zhuo Chen, et al.
Abstract
In this study, we develop the keyword spotting (KWS) and acoustic model (AM) components of a far-field speaker system. Specifically, we use teacher-student (T/S) learning to adapt a well-trained close-talk production AM to far-field conditions, using parallel close-talk and simulated far-field data. We also use T/S learning to compress a large KWS model into a small one that fits the device's computational budget. Because it requires no transcriptions, T/S learning can exploit untranscribed data to improve model performance in both AM adaptation and KWS model compression. We further optimize the models with sequence discriminative training and live data to reach the best system performance. The adapted AM achieves relative word error rate reductions of 72.60% and 57.16% over the baseline on play-back and live test data, respectively. The final KWS model is 27 times smaller than the large KWS model, with no loss of accuracy.
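The T/S objective described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the common formulation in which the student, fed far-field frames, is trained to match the frame-level posteriors that the teacher produces on the time-aligned close-talk frames, measured by KL divergence. The function names (`log_softmax`, `ts_kl_loss`) and the random logits standing in for model outputs are hypothetical.

```python
import numpy as np


def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def ts_kl_loss(teacher_logits, student_logits):
    """Frame-averaged KL(teacher || student) over output classes.

    teacher_logits: (frames, classes), teacher output on close-talk frames.
    student_logits: (frames, classes), student output on the parallel
                    far-field frames.
    No transcription is used: the teacher posteriors act as soft labels.
    """
    teacher_logp = log_softmax(teacher_logits)
    student_logp = log_softmax(student_logits)
    teacher_post = np.exp(teacher_logp)
    kl_per_frame = (teacher_post * (teacher_logp - student_logp)).sum(axis=-1)
    return float(kl_per_frame.mean())


if __name__ == "__main__":
    # Hypothetical stand-ins for model outputs: 5 frames, 4 output classes.
    rng = np.random.default_rng(0)
    teacher_logits = rng.normal(size=(5, 4))
    student_logits = rng.normal(size=(5, 4))
    print("T/S loss:", ts_kl_loss(teacher_logits, student_logits))
```

Since the teacher is held fixed, minimizing this KL divergence with respect to the student is equivalent to minimizing the cross-entropy between teacher and student posteriors. The same loss shape applies to model compression, where a large teacher and a small student see the same input.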
Related papers
- Distilling Knowledge Using Parallel Data For Far-field Speech Recognition (2018)
- Large-scale Domain Adaptation Via Teacher-student Learning (2017)
- Fully Learnable Front-end For Multi-channel Acoustic Modeling Using Semi-supervised Learning (2020)
- Improving Curriculum Learning For Target Speaker Extraction With Synthetic Speakers (2024)
- Multi-level Transfer Learning From Near-field To Far-field Speaker Verification (2021)
- Distance-based Weight Transfer From Near-field To Far-field Speaker Verification (2023)
- LLM-Synth4KWS: Scalable Automatic Generation And Synthesis Of Confusable Data For Custom Keyword Spotting (2025)
- Single Channel Far Field Feature Enhancement For Speaker Verification In The Wild (2020)