Sequence Discriminative Training For Deep Learning Based Acoustic Keyword Spotting
2018 · Zhehuai Chen, Yanmin Qian, Kai Yu
Abstract
Speech recognition is a sequence prediction problem. Besides employing various deep learning approaches for frame-level classification, sequence-level discriminative training has proved indispensable for achieving state-of-the-art performance in large vocabulary continuous speech recognition (LVCSR). However, keyword spotting (KWS), one of the most common speech recognition tasks, benefits almost exclusively from frame-level deep learning because of the difficulty of obtaining competing sequence hypotheses. The few studies on sequence discriminative training for KWS are limited to fixed vocabulary or LVCSR based methods and have not been compared with state-of-the-art deep learning based KWS approaches. In this paper, a sequence discriminative training framework is proposed for both fixed vocabulary and unrestricted acoustic KWS. Sequence discriminative training for both sequence-level generative and discriminative models is systematically investigated. By introducing word-indepen
Related papers
- Exploring Sequence-to-sequence Transformer-transducer Models For Keyword Spotting (2022)
- Contrastive Learning With Audio Discrimination For Customizable Keyword Spotting In Continuous Speech (2024)
- Phoneme-level Contrastive Learning For User-defined Keyword Spotting With Flexible Enrollment (2024)
- Exploring Representation Learning For Small-footprint Keyword Spotting (2023)
- DCCRN-KWS: An Audio Bias Based Model For Noise Robust Small-footprint Keyword Spotting (2023)
- Contrastive Augmentation: An Unsupervised Learning Approach For Keyword Spotting In Speech Technology (2024)
- Streaming Small-footprint Keyword Spotting Using Sequence-to-sequence Models (2017)
- Small-footprint Keyword Spotting Using Deep Neural Network And Connectionist Temporal Classifier (2017)