Three-stage Speaker Verification Architecture In Emotional Talking Environments
2018 · Ismail Shahin, Ali Bou Nassif
Abstract
Speaker verification performance is usually high in a neutral talking environment but degrades sharply in emotional talking environments. This degradation stems from the mismatch between training in a neutral environment and testing in emotional environments. This work proposes a three-stage speaker verification architecture to enhance speaker verification performance in emotional environments. The architecture comprises three cascaded stages: a gender identification stage, followed by an emotion identification stage, followed by a speaker verification stage. The proposed framework has been evaluated on two distinct and independent emotional speech datasets: an in-house dataset and the Emotional Prosody Speech and Transcripts dataset. Our results show that speaker verification based on both gender information and emotion information is superior to each of speaker verification based on gender information only, emotion info
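The cascade described in the abstract (gender identification, then emotion identification, then speaker verification) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract only states the order of the stages, so the scorer callables, the gender-dependent emotion models, the per-(gender, emotion) claimed-speaker models, and the threshold decision rule are all assumptions introduced here, and every name in the block is hypothetical.

```python
from typing import Callable, Dict, Sequence, Tuple

Features = Sequence[float]
# Assumption: a scorer maps an utterance's features to a score,
# where a higher score means a better match to that model.
Scorer = Callable[[Features], float]


def argmax_label(scorers: Dict[str, Scorer], features: Features) -> str:
    """Return the label whose scorer gives the highest score."""
    return max(scorers, key=lambda label: scorers[label](features))


def verify_speaker(
    features: Features,
    gender_scorers: Dict[str, Scorer],
    emotion_scorers: Dict[str, Dict[str, Scorer]],
    claimed_scorers: Dict[Tuple[str, str], Scorer],
    threshold: float = 0.0,
) -> Tuple[str, str, bool]:
    """Run the three cascaded stages on one utterance.

    Returns (gender, emotion, accepted). All model containers are
    hypothetical placeholders for whatever classifiers are used.
    """
    # Stage 1: gender identification.
    gender = argmax_label(gender_scorers, features)

    # Stage 2: emotion identification. Using emotion models that depend
    # on the gender found in stage 1 is an assumption of this sketch.
    emotion = argmax_label(emotion_scorers[gender], features)

    # Stage 3: speaker verification. The claimed speaker's model for the
    # identified (gender, emotion) pair scores the utterance; accepting
    # when the score reaches a threshold is an assumed decision rule.
    score = claimed_scorers[(gender, emotion)](features)
    return gender, emotion, score >= threshold
```

The point of the structure is that each stage narrows the set of models the next stage consults, so the final verification step compares the utterance against a model matched to the identified gender and emotion rather than against a neutral-only model.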
Related papers
- Speaker Verification In Emotional Talking Environments Based On Three-stage Framework (2018) 6.34
- Speaker Verification In Emotional Talking Environments Based On Third-order Circular Suprasegmental Hidden Markov Model (2019) 4.52
- Improving Speaker Verification Robustness With Synthetic Emotional Utterances (2024) 0.00
- Identifying Speakers Using Their Emotion Cues (2018) 10.85
- Novel Hybrid DNN Approaches For Speaker Verification In Emotional And Stressful Talking Environments (2021) 10.85
- Novel Cascaded Gaussian Mixture Model-deep Neural Network Classifier For Speaker Identification In Emotional Talking Environments (2018) 12.74
- Analysis Of Speech Separation Performance Degradation On Emotional Speech Mixtures (2023) 0.00
- Emotion Invariant Speaker Embeddings For Speaker Identification With Emotional Speech (2020) 0.00