Quantization Of Acoustic Model Parameters In Automatic Speech Recognition Framework
2020 · Amrutha Prasad, Petr Motlicek, Srikanth Madikeri
Abstract
State-of-the-art hybrid automatic speech recognition (ASR) systems exploit deep neural network (DNN) based acoustic models (AMs) trained with the Lattice-Free Maximum Mutual Information (LF-MMI) criterion and n-gram language models. The AMs typically have millions of parameters and require significant parameter reduction to operate on embedded devices. This paper studies the impact of parameter quantization on overall word recognition performance. The following approaches are presented: (i) an AM trained in the Kaldi framework with the conventional factorized TDNN (TDNN-F) architecture, (ii) the TDNN AM built in Kaldi and loaded into the PyTorch toolkit using a C++ wrapper for post-training quantization, (iii) quantization-aware training in PyTorch for the Kaldi TDNN model, and (iv) quantization-aware training in Kaldi. Results obtained on the standard Librispeech setup provide an overview of recognition accuracy with respect to the applied quantization scheme.
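The abstract does not specify the quantization scheme, so the following is only a minimal sketch of the two ideas it names, assuming symmetric per-tensor 8-bit quantization. The function names (quantize_int8, dequantize, fake_quantize) and the sample weights are hypothetical and are not taken from the paper, Kaldi, or PyTorch. Post-training quantization stores the integer codes and a scale; quantization-aware training typically runs the forward pass on the quantize-then-dequantize ("fake quantized") values so the model can adapt to the rounding error.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization of floats to signed 8-bit codes."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any positive scale works.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the symmetric int8 range.
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate float values."""
    return [c * scale for c in codes]


def fake_quantize(weights):
    """Quantize then dequantize, as a quantization-aware forward pass might."""
    codes, scale = quantize_int8(weights)
    return dequantize(codes, scale)


# Hypothetical weights standing in for one row of an acoustic model layer.
weights = [0.50, -1.27, 0.003, 1.00]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

With rounding to the nearest code, the per-weight reconstruction error stays within half a quantization step (scale / 2), which is why small weights such as 0.003 collapse to zero while larger ones are preserved closely.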
Related papers
- Mixed Precision Of Quantization Of Transformer Language Models For Speech Recognition (2021)
- A Model For Every User And Budget: Label-free And Personalized Mixed-precision Quantization (2023)
- USM-Lite: Quantization And Sparsity Aware Fine-tuning For Speech Recognition With Universal Speech Models (2023)
- Optimization Of DNN-based Speaker Verification Model Through Efficient Quantization Technique (2024)
- Towards Lightweight Speaker Verification Via Adaptive Neural Network Quantization (2024)
- StableQuant: Layer Adaptive Post-training Quantization For Speech Foundation Models (2025)
- DQ-Whisper: Joint Distillation And Quantization For Efficient Multilingual Speech Recognition (2023)
- Residual Adapters For Parameter-efficient ASR Adaptation To Atypical And Accented Speech (2021)