Optimizing Speech Recognition For The Edge
2019 · Yuan Shangguan, Jian Li, Qiao Liang, et al.
Abstract
While most deployed speech recognition systems today still run on servers, we are in the midst of a transition towards deployments on edge devices. This leap to the edge is powered by the progression from traditional speech recognition pipelines to end-to-end (E2E) neural architectures, and by the parallel development of more efficient neural network topologies and optimization techniques. As a result, we are now able to create highly accurate speech recognizers that are both small and fast enough to execute on typical mobile devices. In this paper, we begin with a baseline RNN-Transducer architecture composed of Long Short-Term Memory (LSTM) layers. We then experiment with a variety of more computationally efficient layer types and apply optimization techniques such as neural connection pruning and parameter quantization. The result is a small, high-quality, on-device speech recognizer that is an order of magnitude smaller than the unoptimized baseline system.
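The two optimization techniques named in the abstract, connection pruning and parameter quantization, can be illustrated with a minimal sketch. The abstract does not specify the pruning schedule or the quantization scheme used in the paper, so the code below assumes generic one-shot magnitude pruning and symmetric per-tensor 8-bit quantization. The function names (`magnitude_prune`, `quantize_int8`, `dequantize`) and the 50% sparsity level are hypothetical choices made for illustration only.

```python
import numpy as np


def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of the entries.

    Assumption: one-shot global magnitude pruning, not the paper's method.
    """
    k = int(round(sparsity * weights.size))
    if k == 0:
        return weights.copy()
    # The k-th smallest absolute value is the cut-off; entries at or below it are dropped.
    threshold = np.sort(np.abs(weights), axis=None)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask


def quantize_int8(weights):
    """Symmetric per-tensor quantization of float weights to int8.

    Assumption: a single scale per tensor, chosen from the maximum magnitude.
    """
    max_abs = float(np.max(np.abs(weights)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale


def dequantize(q, scale):
    """Map int8 values back to approximate float32 weights."""
    return q.astype(np.float32) * scale


# Toy weight matrix standing in for one layer of a recognizer.
rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)

pruned = magnitude_prune(w, sparsity=0.5)
q, scale = quantize_int8(pruned)
restored = dequantize(q, scale)

print("fraction of zero weights:", float(np.mean(pruned == 0)))
print("max reconstruction error:", float(np.max(np.abs(restored - pruned))))
```

In this sketch, storing `q` takes one byte per weight instead of four, and the zeroed entries can additionally be stored in a sparse format. How much of the reported order-of-magnitude size reduction comes from each technique is not stated in the abstract.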
Related papers
- A Review Of On-device Fully Neural End-to-end Automatic Speech Recognition Algorithms (2020)
- Improving RNN Transducer Modeling For End-to-end Speech Recognition (2019)
- Deep Learning Models In Speech Recognition: Measuring GPU Energy Consumption, Impact Of Noise And Model Quantization For Edge Deployment (2024)
- Streaming End-to-end Speech Recognition For Mobile Devices (2018)
- Pi-whisper: Designing An Adaptive And Incremental Automatic Speech Recognition System For Edge Devices (2024)
- Personalized Speech Recognition On Mobile Devices (2016)
- Speech Enhancement Deep-learning Architecture For Efficient Edge Processing (2024)
- Tiny-align: Bridging Automatic Speech Recognition And Large Language Model On The Edge (2024)