Predicting Detection Filters For Small Footprint Open-vocabulary Keyword Spotting
2019 · Theodore Bluche, Thibault Gisselbrecht
Abstract
In this paper, we propose a fully neural approach to open-vocabulary keyword spotting that lets users add a customizable voice interface to their device and does not require task-specific data. We present a keyword detection neural network weighing less than 250 KB, in which the topmost layer, which performs keyword detection, is predicted by an auxiliary network that may be run offline to generate a detector for any keyword. We show that the proposed model outperforms acoustic keyword spotting baselines by a large margin on two tasks of detecting keywords in utterances and three tasks of detecting isolated speech commands. We also propose a method to fine-tune the model when specific training data is available for some keywords, which yields performance similar to a standard speech command neural network while preserving the model's ability to be applied to new keywords.
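The central idea in the abstract, a detection layer whose weights are produced by a second network rather than trained per keyword, can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not the authors' architecture: the dimensions, the phoneme-embedding auxiliary network, the random stand-in for acoustic features, and the names `predict_filter` and `detect` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for this illustration.
FEAT_DIM = 16   # size of each acoustic feature frame
N_PHONES = 40   # size of the phoneme inventory
EMB_DIM = 8     # size of each phoneme embedding

# Parameters of the assumed auxiliary network (untrained, random here).
phone_emb = rng.normal(size=(N_PHONES, EMB_DIM))
W_aux = rng.normal(size=(EMB_DIM, FEAT_DIM + 1)) * 0.1


def predict_filter(phone_ids):
    """Map a keyword's phoneme ids to detector weights and a bias.

    This step can run offline, once per keyword.
    """
    summary = phone_emb[np.asarray(phone_ids)].mean(axis=0)  # (EMB_DIM,)
    params = summary @ W_aux                                 # (FEAT_DIM + 1,)
    return params[:-1], params[-1]


def detect(features, weights, bias):
    """Apply the predicted filter to every frame and return the peak probability."""
    logits = features @ weights + bias                       # (T,)
    probs = 1.0 / (1.0 + np.exp(-logits))
    return float(probs.max())


# Hypothetical phoneme ids for one keyword, and random stand-in features.
weights, bias = predict_filter([3, 17, 5, 22])
features = rng.normal(size=(50, FEAT_DIM))
score = detect(features, weights, bias)
```

In this sketch, supporting a new keyword only requires calling `predict_filter` again with a different phoneme sequence; the part that consumes `features` is unchanged, which is the property the abstract describes as open-vocabulary.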
Related papers
- Small-footprint Open-vocabulary Keyword Spotting With Quantized LSTM Networks (2020)
- Neural Architecture Search For Keyword Spotting (2020)
- Deep Residual Learning For Small-footprint Keyword Spotting (2017)
- An End-to-end Architecture For Keyword Spotting And Voice Activity Detection (2016)
- Efficient Keyword Spotting Using Time Delay Neural Networks (2018)
- A Separable Temporal Convolution Neural Network With Attention For Small-footprint Keyword Spotting (2021)
- Small-footprint Keyword Spotting With Graph Convolutional Network (2019)
- Depthwise Separable Convolutional Resnet With Squeeze-and-excitation Blocks For Small-footprint Keyword Spotting (2020)