Optimizing Dysarthria Wake-up Word Spotting: An End-to-end Approach For SLT 2024 LRDWWS Challenge
2024 · Shuiyun Liu, Yuxiang Kong, Pengcheng Guo, et al.
Abstract
Speech has emerged as a widely embraced user interface across diverse applications. However, for individuals with dysarthria, the inherent variability in their speech poses significant challenges. This paper presents an end-to-end Pretrain-based Dual-filter Dysarthria Wake-up word Spotting (PD-DWS) system for the SLT 2024 Low-Resource Dysarthria Wake-Up Word Spotting Challenge. Specifically, our system improves performance from two key perspectives: audio modeling and dual-filter strategy. For audio modeling, we propose an innovative 2branch-d2v2 model based on the pre-trained data2vec2 (d2v2), which can simultaneously model automatic speech recognition (ASR) and wake-up word spotting (WWS) tasks through a unified multi-task finetuning paradigm. Additionally, a dual-filter strategy is introduced to reduce the false accept rate (FAR) while maintaining the same false reject rate (FRR). Experimental results demonstrate that our PD-DWS system achieves an FAR of 0.00321 and an FRR of 0.005,
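The two error rates named in the abstract can be illustrated with a minimal sketch. It assumes the usual detection definitions (FAR is the fraction of non-wake-word utterances that trigger the system, FRR is the fraction of wake-word utterances that are missed); the challenge's exact scoring protocol is not stated here, and the function name `far_frr` and the toy labels are hypothetical.

```python
def far_frr(labels, decisions):
    """Compute (FAR, FRR) under the standard definitions (an assumption here).

    labels:    1 if the utterance truly contains the wake-up word, else 0.
    decisions: 1 if the system triggered on the utterance, else 0.
    """
    if len(labels) != len(decisions):
        raise ValueError("labels and decisions must have the same length")

    negatives = sum(1 for y in labels if y == 0)
    positives = sum(1 for y in labels if y == 1)

    # False accept: the system fired on an utterance without the wake-up word.
    false_accepts = sum(1 for y, d in zip(labels, decisions) if y == 0 and d == 1)
    # False reject: the system stayed silent on a true wake-up word utterance.
    false_rejects = sum(1 for y, d in zip(labels, decisions) if y == 1 and d == 0)

    far = false_accepts / negatives if negatives else 0.0
    frr = false_rejects / positives if positives else 0.0
    return far, frr


# Toy data: 4 wake-up word utterances and 5 non-wake-up word utterances.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0, 0]
toy_decisions = [1, 1, 1, 0, 0, 0, 1, 0, 0]
toy_far, toy_frr = far_frr(toy_labels, toy_decisions)
```

On this toy data one of the five negatives triggers the system and one of the four positives is missed. A filtering stage that rejects the spurious trigger without rejecting any true wake-up word would lower FAR while leaving FRR unchanged, which is the trade-off the dual-filter strategy targets.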
Related papers
- PB-LRDWWS System For The SLT 2024 Low-resource Dysarthria Wake-up Word Spotting Challenge (2024)
- Dual-attention Neural Transducers For Efficient Wake Word Spotting In Speech Recognition (2023)
- Robust Wake Word Spotting With Frame-level Cross-modal Attention Based Audio-visual Conformer (2024)
- Learning To Detect Dysarthria From Raw Speech (2018)
- Training Data Augmentation For Dysarthric Automatic Speech Recognition By Text-to-dysarthric-speech Synthesis (2024)
- Federated Learning For Keyword Spotting (2018)
- Enhancing Dysarthric Speech Recognition For Unseen Speakers Via Prototype-based Adaptation (2024)
- Training Wake Word Detection With Synthesized Speech Data On Confusion Words (2020)