Multi-teacher Language-aware Knowledge Distillation For Multilingual Speech Emotion Recognition
2025 · Mehedi Hasan Bijoy, Dejan Porjazovski, Tamás Grósz, et al.
Abstract
Speech Emotion Recognition (SER) is crucial for improving human-computer interaction. Despite progress in monolingual SER, extending it to a multilingual system remains challenging. Our goal is to train a single model capable of multilingual SER by distilling knowledge from multiple teacher models. To this end, we introduce a novel language-aware multi-teacher knowledge distillation method to advance SER in English, Finnish, and French. It uses Wav2Vec2.0 as the foundation of the monolingual teacher models and then distills their knowledge into a single multilingual student model. The student model achieves state-of-the-art performance, with a weighted recall of 72.9 on the English dataset and an unweighted recall of 63.4 on the Finnish dataset, surpassing fine-tuning and knowledge distillation baselines. Our method is particularly effective at improving recall for sad and neutral emotions, although it still struggles to recognize anger and happiness.
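The abstract does not spell out the distillation objective, so the following is only a minimal sketch of what a language-aware multi-teacher loss could look like, not the authors' formulation. It assumes that each utterance is matched to the teacher of its own language, that the soft-target term is a temperature-scaled KL divergence, and that it is mixed with a cross-entropy term on the true label. The function names, the `temperature` and `alpha` parameters, and the four-class logits are all hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def language_aware_kd_loss(student_logits, teacher_logits_by_lang, lang,
                           label, temperature=2.0, alpha=0.5):
    """Assumed loss: alpha * soft-target KL + (1 - alpha) * hard-label CE.

    The "language-aware" step here is simply selecting the teacher whose
    language matches the utterance; this routing rule is an assumption.
    """
    teacher_logits = teacher_logits_by_lang[lang]
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # T^2 scaling keeps soft-target gradients comparable across temperatures.
    distill = kl_divergence(p_teacher, p_student) * temperature ** 2
    # Cross-entropy of the student against the ground-truth emotion label.
    hard = -math.log(softmax(student_logits)[label] + 1e-12)
    return alpha * distill + (1 - alpha) * hard


if __name__ == "__main__":
    # Hypothetical logits over four emotion classes for one English utterance.
    teachers = {
        "en": [2.0, 0.5, -1.0, 0.1],
        "fi": [0.0, 3.0, 0.0, 0.0],
        "fr": [0.3, 0.2, 1.5, -0.4],
    }
    student = [1.5, 0.7, -0.8, 0.0]
    print(language_aware_kd_loss(student, teachers, lang="en", label=0))
```

Under these assumptions the distillation term vanishes when the student reproduces the selected teacher's logits exactly, so the loss then reduces to the weighted cross-entropy term alone.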
Related papers
- Multi-level Knowledge Distillation For Speech Emotion Recognition In Noisy Conditions (2023) · 7.81
- Hierarchical Network With Decoupled Knowledge Distillation For Speech Emotion Recognition (2023) · 6.77
- Speech Emotion Recognition With Distilled Prosodic And Linguistic Affect Representations (2023) · 5.24
- Multilingual Speech Emotion Recognition With Multi-gating Mechanism And Neural Architecture Search (2022) · 2.26
- Exploring Multilingual Unseen Speaker Emotion Recognition: Leveraging Co-attention Cues In Multitask Learning (2024) · 0.00
- Speech Emotion: Investigating Model Representations, Multi-task Learning And Knowledge Distillation (2022) · 6.34
- Continuous Metric Learning For Transferable Speech Emotion Recognition And Embedding Across Low-resource Languages (2022) · 0.00
- SpeechEQ: Speech Emotion Recognition Based On Multi-scale Unified Datasets And Multitask Learning (2022) · 5.84