Improving Speaker-independent Speech Emotion Recognition Using Dynamic Joint Distribution Adaptation
2024 · Cheng Lu, Yuan Zong, Hailun Lian, et al.
Abstract
In speaker-independent speech emotion recognition, the training and testing samples are collected from different speakers, which leads to a multi-domain shift across the feature distributions of data from different speakers. Consequently, when the trained model is confronted with data from new speakers, its performance tends to degrade. To address this issue, we propose a Dynamic Joint Distribution Adaptation (DJDA) method under the framework of multi-source domain adaptation. DJDA first uses joint distribution adaptation (JDA), which comprises marginal distribution adaptation (MDA) and conditional distribution adaptation (CDA), to measure the multi-domain distribution shifts caused by different speakers more precisely. This helps eliminate speaker bias in emotion features, allowing the model to learn discriminative and speaker-invariant speech emotion features from the coarse level to the fine level. Furthermore, we quantify the adaptation contributions of MDA and CDA within JDA by using a dynam
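The split of JDA into a marginal term (MDA) and a conditional term (CDA) can be illustrated with a small sketch. This is not the authors' implementation: it assumes a linear-kernel mean discrepancy as the distance, treats the target labels as given (in practice they would typically be pseudo-labels), and uses a fixed, hypothetical weight `mu` in place of the paper's dynamic weighting, which the abstract above does not spell out. All function names are illustrative.

```python
from collections import defaultdict


def mean_vector(rows):
    """Column-wise mean of a non-empty list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def squared_distance(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def marginal_discrepancy(source, target):
    """Marginal term (assumed form): distance between overall feature means."""
    return squared_distance(mean_vector(source), mean_vector(target))


def conditional_discrepancy(source, source_labels, target, target_labels):
    """Conditional term (assumed form): per-class mean distance,
    averaged over the classes that occur in both domains."""
    by_class_s, by_class_t = defaultdict(list), defaultdict(list)
    for x, y in zip(source, source_labels):
        by_class_s[y].append(x)
    for x, y in zip(target, target_labels):
        by_class_t[y].append(x)
    shared = sorted(set(by_class_s) & set(by_class_t))
    if not shared:
        return 0.0
    total = sum(marginal_discrepancy(by_class_s[c], by_class_t[c]) for c in shared)
    return total / len(shared)


def joint_discrepancy(source, source_labels, target, target_labels, mu=0.5):
    """Weighted sum of both terms; `mu` is a hypothetical fixed weight."""
    m = marginal_discrepancy(source, target)
    c = conditional_discrepancy(source, source_labels, target, target_labels)
    return (1 - mu) * m + mu * c


# Toy data: two speakers whose overall feature means coincide,
# but whose per-emotion means are swapped.
source = [[0.0, 0.0], [2.0, 0.0]]
source_labels = ["neutral", "happy"]
target = [[0.0, 0.0], [2.0, 0.0]]
target_labels = ["happy", "neutral"]

marginal = marginal_discrepancy(source, target)
conditional = conditional_discrepancy(source, source_labels, target, target_labels)
joint = joint_discrepancy(source, source_labels, target, target_labels, mu=0.5)
```

In this toy case the marginal term is zero while the conditional term is not, which is the kind of class-level mismatch that a marginal-only measure would miss.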
Related papers
- Transferable Positive/negative Speech Emotion Recognition Via Class-wise Adversarial Domain Adaptation (2018)
- Domain Adversarial Learning For Emotion Recognition (2019)
- Adversarial Training For Multi-domain Speaker Recognition (2020)
- Self Supervised Adversarial Domain Adaptation For Cross-corpus And Cross-language Speech Emotion Recognition (2022)
- Layer-adapted Implicit Distribution Alignment Networks For Cross-corpus Speech Emotion Recognition (2023)
- DNN-HMM Based Speaker Adaptive Emotion Recognition Using Proposed Epoch And MFCC Features (2018)
- Optimal Transport-based Adaptation In Dysarthric Speech Tasks (2021)
- Acted Vs. Improvised: Domain Adaptation For Elicitation Approaches In Audio-visual Emotion Recognition (2021)