ConvMixer: Feature Interactive Convolution With Curriculum Learning For Small Footprint And Noisy Far-field Keyword Spotting
2022 · Dianwen Ng, Yunqi Chen, Biao Tian, et al.
Abstract
Building efficient architectures for neural speech processing is paramount to successful keyword spotting deployment. However, it is very challenging for lightweight models to achieve noise robustness with concise neural operations. In real-world applications, the user environment is typically noisy and may also contain reverberation. We propose a novel feature interactive convolutional model with merely 100K parameters to tackle this problem under noisy far-field conditions. The interactive unit replaces the attention module and promotes the flow of information with more efficient computation. Moreover, curriculum-based multi-condition training is adopted to attain better noise robustness. Our model achieves 98.2% top-1 accuracy on Google Speech Commands V2-12 and is competitive against large transformer models under the designed noise condition.
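The abstract does not spell out the curriculum schedule, so the following is only a minimal sketch of one common way to realise curriculum-based multi-condition training: mix noise into clean speech at a target signal-to-noise ratio (SNR) and lower that SNR as training progresses. The function names, the SNR levels, and the stage-based schedule are all assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def power(signal):
    """Mean squared amplitude of a signal given as a list of floats."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech + noise has the requested SNR in dB."""
    # SNR_dB = 10 * log10(P_speech / P_noise), solved for the noise power.
    target_noise_power = power(speech) / (10.0 ** (snr_db / 10.0))
    scale = math.sqrt(target_noise_power / power(noise))
    return [s + scale * n for s, n in zip(speech, noise)]


def curriculum_snr(stage, snr_levels=(20.0, 10.0, 0.0, -5.0)):
    """Hypothetical schedule: easy (high SNR) first, harder (low SNR) later.

    The SNR levels are illustrative assumptions, not values from the paper.
    Stages past the end of the schedule stay at the hardest level.
    """
    return snr_levels[min(stage, len(snr_levels) - 1)]


def measured_snr_db(speech, mixture):
    """Recover the SNR of a mixture, given the clean speech it contains."""
    residual = [m - s for m, s in zip(mixture, speech)]
    return 10.0 * math.log10(power(speech) / power(residual))


# Synthetic stand-ins for one second of 16 kHz speech and noise.
rng = random.Random(0)
speech = [rng.gauss(0.0, 1.0) for _ in range(16000)]
noise = [rng.gauss(0.0, 0.5) for _ in range(16000)]

for stage in range(5):
    snr_db = curriculum_snr(stage)
    noisy = mix_at_snr(speech, noise, snr_db)
    print(stage, snr_db, round(measured_snr_db(speech, noisy), 3))
```

In an actual training loop, `stage` would typically advance once validation accuracy at the current noise level plateaus; that trigger is likewise an assumption here.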
Related papers
- Small Footprint Multi-channel ConvMixer For Keyword Spotting With Centroid Based Awareness (2022)
- Neural Architecture Search For Keyword Spotting (2020)
- Depthwise Separable Convolutional ResNet With Squeeze-and-excitation Blocks For Small-footprint Keyword Spotting (2020)
- AutoKWS: Keyword Spotting With Differentiable Architecture Search (2020)
- A Separable Temporal Convolution Neural Network With Attention For Small-footprint Keyword Spotting (2021)
- A Monaural Speech Enhancement Method For Robust Small-footprint Keyword Spotting (2019)
- Predicting Detection Filters For Small Footprint Open-vocabulary Keyword Spotting (2019)
- Improving Vision-inspired Keyword Spotting Using Dynamic Module Skipping In Streaming Conformer Encoder (2023)