Multiview Canonical Correlation Analysis For Automatic Pathological Speech Detection
2024 · Yacouba Kaloga, Shakeel A. Sheikh, Ina Kodrasi
Abstract
Recently proposed automatic pathological speech detection approaches rely on spectrogram input representations or wav2vec2 embeddings. These representations may contain pathology-irrelevant, uncorrelated information, such as changing phonetic content or variations in speaking style across time, which can adversely affect classification performance. To address this issue, we propose applying Multiview Canonical Correlation Analysis (MCCA) to these input representations prior to automatic pathological speech detection. Our results demonstrate that, unlike other dimensionality reduction techniques, MCCA considerably improves pathological speech detection performance by eliminating uncorrelated information present in the input representations. Employing MCCA with traditional classifiers yields comparable or higher performance than using sophisticated architectures, while preserving the representation structure and providing interpretability.
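The core idea of the abstract, keeping only the components that are correlated across views, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how views are built from spectrograms or wav2vec2 embeddings, so the views here are synthetic arrays sharing one latent signal. The function names `mcca_fit` and `mcca_transform`, the regularization value `reg`, and the generalized-eigenvalue formulation of MCCA used below are all assumptions chosen for illustration.

```python
import numpy as np


def mcca_fit(views, n_components=1, reg=1e-3):
    """Fit a regularized MCCA (illustrative formulation, not the paper's code).

    views: list of arrays of shape (n_samples, dim_i), one per view.
    Returns per-view projection matrices, per-view means, and eigenvalues.
    """
    means = [v.mean(axis=0) for v in views]
    centered = [v - m for v, m in zip(views, means)]
    n = centered[0].shape[0]
    dims = [v.shape[1] for v in centered]

    # Whitening matrix for each view, from its regularized covariance.
    whiteners = []
    for v in centered:
        cov = v.T @ v / (n - 1) + reg * np.eye(v.shape[1])
        evals, evecs = np.linalg.eigh(cov)
        whiteners.append(evecs @ np.diag(evals ** -0.5) @ evecs.T)

    # Stack the whitened views; directions shared across views
    # get the largest eigenvalues of the stacked covariance.
    whitened = np.hstack([v @ w for v, w in zip(centered, whiteners)])
    total_cov = whitened.T @ whitened / (n - 1)
    evals, evecs = np.linalg.eigh(total_cov)
    order = np.argsort(evals)[::-1][:n_components]
    top = evecs[:, order]

    # Split the stacked eigenvectors back into one block per view.
    blocks = np.split(top, np.cumsum(dims)[:-1], axis=0)
    projections = [w @ b for w, b in zip(whiteners, blocks)]
    return projections, means, evals[order]


def mcca_transform(views, projections, means):
    """Project each view onto its canonical directions."""
    return [(v - m) @ p for v, m, p in zip(views, means, projections)]


# Synthetic example: three views share one latent signal plus view-specific noise.
rng = np.random.default_rng(0)
n_samples = 500
shared = rng.normal(size=(n_samples, 1))
views = []
for dim in (5, 6, 4):
    loadings = rng.uniform(1.0, 2.0, size=(1, dim))
    views.append(shared @ loadings + 0.5 * rng.normal(size=(n_samples, dim)))

projections, means, eigvals = mcca_fit(views, n_components=1)
projected = mcca_transform(views, projections, means)
corr_01 = np.corrcoef(projected[0][:, 0], projected[1][:, 0])[0, 1]
print(f"correlation between projected views 0 and 1: {corr_01:.3f}")
```

In a pipeline like the one the abstract describes, the projected views would then be passed to a traditional classifier; that step is omitted here because the abstract does not specify the classifier or how features are aggregated.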