PI-Whisper: Designing An Adaptive And Incremental Automatic Speech Recognition System For Edge Devices
2024 · Amir Nassereldine, Dancheng Liu, Chenhui Xu, et al.
Abstract
Edge-based automatic speech recognition (ASR) technologies are increasingly prevalent in the development of intelligent and personalized assistants. However, resource-constrained ASR models face significant challenges in adaptivity, incrementality, and inclusivity when serving a diverse population. To tackle these challenges, we propose PI-Whisper, a novel ASR system that adaptively enhances recognition capabilities by identifying speakers' characteristics in real time. In this work, we show how the design of PI-Whisper allows for incremental adaptation to new characteristics without repetitive retraining, enhances recognition capabilities, and improves equity and fairness across diverse speaker groups. PI-Whisper demonstrates these advantages by achieving state-of-the-art accuracy, reducing the word error rate (WER) by up to 13.7% relative to baselines while scaling linearly with computing resources.
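For readers unfamiliar with the metric, the following is a minimal sketch of how WER and a relative WER reduction are commonly computed. It uses the standard word-level edit-distance definition of WER; the function names and the example sentences are hypothetical and are not taken from the paper, and the paper's own evaluation pipeline (for instance, its text normalization) may differ.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer improves on baseline_wer (0.137 means 13.7%)."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical transcripts, for illustration only.
    wer = word_error_rate("turn on the kitchen light", "turn on kitchen lights")
    print(f"WER: {wer:.2f}")
```

Note that a relative reduction is measured against the baseline's own WER, so the same relative figure can correspond to very different absolute changes depending on how high the baseline error rate is.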
Related papers
- M2r-whisper: Multi-stage And Multi-scale Retrieval Augmentation For Enhancing Whisper (2024)
- Whisper-lm: Improving ASR Models With Language Models For Low-resource Languages (2025)
- Optimizing Speech Recognition For The Edge (2019)
- Dyn-asr: Compact, Multilingual Speech Recognition Via Spoken Language And Accent Identification (2021)
- Multilingual Distilwhisper: Efficient Distillation Of Multi-task Speech Models Via Language-specific Experts (2023)
- Sparsely Shared Lora On Whisper For Child Speech Recognition (2023)
- Adapting Whisper For Code-switching Through Encoding Refining And Language-aware Decoding (2024)
- Tiny-align: Bridging Automatic Speech Recognition And Large Language Model On The Edge (2024)