End-to-end Streaming Keyword Spotting
2018 · Alvarez Raziel, Park Hyun-Jin
Abstract
We present a keyword spotting system that, except for a frontend component for feature generation, is entirely contained in a deep neural network (DNN) model trained "end-to-end" to predict the presence of the keyword in a stream of audio. This work makes two main contributions. The first is an efficient memoized neural network topology that aims to make better use of the parameters and associated computations in the DNN by holding a memory of previous activations distributed over the depth of the DNN. The second is a method to train the DNN, end-to-end, to produce the keyword spotting score. The system significantly outperforms previous approaches in detection quality as well as in size and computation.
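To make the idea of "holding a memory of previous activations" concrete, here is a minimal sketch of a streaming layer that caches its own projected activations. It is not the topology from the paper: the class `MemoizedLayer`, the helper `full_recompute`, the weight names, and all dimensions are assumptions chosen for illustration. The sketch only shows why caching helps in a streaming setting: each incoming frame is projected once, and the output is built from the cached projections instead of reprocessing the whole window.

```python
import numpy as np


class MemoizedLayer:
    """Hypothetical streaming layer that caches its last `memory` projected frames."""

    def __init__(self, input_dim, num_nodes, memory, rng):
        # Weights applied to each incoming frame (feature axis).
        self.feature_w = rng.standard_normal((input_dim, num_nodes)) * 0.1
        # Weights applied across the cached activations (time axis).
        self.time_w = rng.standard_normal((memory, num_nodes)) * 0.1
        # Cache of past projected activations, oldest first; starts empty (zeros).
        self.state = np.zeros((memory, num_nodes))

    def step(self, frame):
        """Consume one frame: project it once, cache it, combine with the cache."""
        projected = frame @ self.feature_w
        self.state = np.roll(self.state, -1, axis=0)  # drop the oldest entry
        self.state[-1] = projected                    # store the newest entry
        return np.maximum(0.0, np.sum(self.state * self.time_w, axis=0))


def full_recompute(layer, frames, t):
    """Reference path without a cache: re-project the whole window ending at frame t."""
    memory = layer.time_w.shape[0]
    window = np.zeros((memory, frames.shape[1]))
    chunk = frames[max(0, t - memory + 1): t + 1]
    window[memory - len(chunk):] = chunk              # zero-pad before the stream start
    projected = window @ layer.feature_w
    return np.maximum(0.0, np.sum(projected * layer.time_w, axis=0))


rng = np.random.default_rng(0)
layer = MemoizedLayer(input_dim=40, num_nodes=8, memory=5, rng=rng)
frames = rng.standard_normal((20, 40))                # 20 frames of 40-dim features

# Streaming pass: one projection per frame, the rest comes from the cache.
streamed = np.stack([layer.step(frame) for frame in frames])
```

In this sketch the streaming pass performs one feature projection per frame, while `full_recompute` performs `memory` projections per frame to reach the same output. Stacking several such layers, each with its own cache, is one way to read "distributed over the depth of the DNN", though the paper's actual layer design may differ.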
Related papers
- Streaming Small-footprint Keyword Spotting Using Sequence-to-sequence Models (2017)
- Heimdal: Highly Efficient Method For Detection And Localization Of Wake-words (2022)
- Efficient Keyword Spotting Using Dilated Convolutions And Gating (2018)
- Improving Vision-inspired Keyword Spotting Using Dynamic Module Skipping In Streaming Conformer Encoder (2023)
- Small-footprint Open-vocabulary Keyword Spotting With Quantized LSTM Networks (2020)
- End-to-end Keyword Spotting Using Neural Architecture Search And Quantization (2021)
- An End-to-end Architecture For Keyword Spotting And Voice Activity Detection (2016)
- Predicting Detection Filters For Small Footprint Open-vocabulary Keyword Spotting (2019)