GE2E-KWS: Generalized End-to-end Training And Evaluation For Zero-shot Keyword Spotting
2024 · Pai Zhu, Jacob W. Bartel, Dhruuv Agarwal, et al.
Abstract
We propose GE2E-KWS, a generalized end-to-end training and evaluation framework for customized keyword spotting. Specifically, enrollment utterances are separated and grouped by keyword from the training batch, and their embedding centroids are compared to all other test utterance embeddings to compute the loss. This simulates the runtime enrollment and verification stages and, by optimizing matrix operations, improves convergence stability and training speed compared to SOTA triplet loss approaches. To benchmark different models reliably, we propose an evaluation process that mimics the production environment and computes metrics that directly measure keyword matching accuracy. Trained with GE2E loss, our 419KB quantized conformer model beats a 7.5GB ASR encoder by 23.6% relative AUC, and beats a same-size triplet loss model by 60.7% AUC. Our KWS models are natively streamable with low memory footprints, and are designed to run continuously on-device with no retraining needed for new keywords.
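The centroid-based loss described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the exact loss formula, so the softmax form, the cosine similarity, and the scale and offset values `w` and `b` are assumptions borrowed from the general GE2E formulation, and the function names and tensor shapes are hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale vectors along the last axis to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def ge2e_style_loss(enroll, test, w=10.0, b=-5.0):
    """Softmax GE2E-style loss (assumed form, not taken from the paper).

    enroll: (K, E, D) enrollment embeddings, E utterances per keyword.
    test:   (K, T, D) test embeddings, T utterances per keyword.
    w, b:   similarity scale and offset (learned in practice; fixed here).
    """
    K, T, D = test.shape

    # One centroid per keyword, built only from the enrollment utterances.
    centroids = l2_normalize(l2_normalize(enroll).mean(axis=1))  # (K, D)

    # Compare every test utterance to every centroid in one matrix product.
    queries = l2_normalize(test).reshape(K * T, D)
    logits = w * (queries @ centroids.T) + b  # (K*T, K)

    # Each test utterance should match the centroid of its own keyword.
    labels = np.repeat(np.arange(K), T)

    # Numerically stable log-softmax followed by cross-entropy.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(K * T), labels].mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    enroll = rng.normal(size=(4, 3, 16))  # 4 keywords, 3 enrollments each
    test = rng.normal(size=(4, 5, 16))    # 4 keywords, 5 test utterances each
    print(ge2e_style_loss(enroll, test))
```

Because all test-to-centroid similarities come from a single matrix product, one batch yields K*T comparisons against K centroids at once, which is the kind of batched matrix operation the abstract contrasts with triplet loss.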
Related papers
- LLM-Synth4KWS: Scalable Automatic Generation And Synthesis Of Confusable Data For Custom Keyword Spotting (2025) 2.26
- PhonMatchNet: Phoneme-guided Zero-shot Keyword Spotting For User-defined Keywords (2023) 13.34
- Exploring Sequence-to-sequence Transformer-transducer Models For Keyword Spotting (2022) 5.24
- CTC-aligned Audio-text Embedding For Streaming Open-vocabulary Keyword Spotting (2024) 3.58
- Query-by-example Keyword Spotting Using Spectral-temporal Graph Attentive Pooling And Multi-task Learning (2024) 0.00
- Online Continual Learning In Keyword Spotting For Low-resource Devices Via Pooling High-order Temporal Statistics (2023) 7.50
- Phoneme-level Contrastive Learning For User-defined Keyword Spotting With Flexible Enrollment (2024) 6.34
- Exploring Representation Learning For Small-footprint Keyword Spotting (2023) 3.58