Speaker Verification In Emotional Talking Environments Based On Three-stage Framework
2018 · Ismail Shahin
Abstract
This work introduces, implements, and evaluates a three-stage speaker verification framework intended to improve the degraded speaker verification performance in emotional talking environments. The framework comprises three cascaded stages: a gender identification stage, followed by an emotion identification stage, followed by a speaker verification stage. The proposed framework has been assessed on two distinct and independent emotional speech datasets: our collected dataset and the Emotional Prosody Speech and Transcripts dataset. Our results demonstrate that speaker verification based on both gender cues and emotion cues outperforms speaker verification based on gender cues only, on emotion cues only, and on neither gender cues nor emotion cues. The average speaker verification performance achieved with the proposed framework is very similar to that attained in subjective assessment by human listeners.
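The cascade described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the models or the decision rule, so everything here is an assumption made for illustration: each model is a hypothetical callable returning a log-likelihood-style score, the emotion models are selected by the detected gender, and the final accept/reject decision compares a score difference against a hypothetical `threshold`. The names `argmax_label`, `verify`, and `centroid_scorer` are invented for this example.

```python
from typing import Callable, Dict, Sequence, Tuple

Features = Sequence[float]
Scorer = Callable[[Features], float]  # higher score = better match (assumed)


def argmax_label(scorers: Dict[str, Scorer], x: Features) -> str:
    """Return the label whose scorer gives the highest score for x."""
    return max(scorers, key=lambda label: scorers[label](x))


def centroid_scorer(center: float) -> Scorer:
    """Toy stand-in for a trained model: negative squared distance to a centroid."""
    return lambda x: -sum((v - center) ** 2 for v in x)


def verify(
    x: Features,
    gender_models: Dict[str, Scorer],
    emotion_models: Dict[str, Dict[str, Scorer]],
    claimed_models: Dict[Tuple[str, str], Scorer],
    background_models: Dict[Tuple[str, str], Scorer],
    threshold: float = 0.0,
) -> Tuple[str, str, bool]:
    # Stage 1: identify the gender of the unknown speaker.
    gender = argmax_label(gender_models, x)
    # Stage 2: identify the emotion using the models for that gender.
    emotion = argmax_label(emotion_models[gender], x)
    # Stage 3: verify the claimed identity with the gender- and
    # emotion-specific models (score difference vs. an assumed threshold).
    key = (gender, emotion)
    score = claimed_models[key](x) - background_models[key](x)
    return gender, emotion, score >= threshold
```

In this sketch an error in an early stage propagates to the later ones, because stage 3 only consults the models keyed by the gender and emotion chosen in stages 1 and 2. That is a property of the toy cascade shown here, not a claim about the paper's measured behaviour.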
Related papers
- Three-stage Speaker Verification Architecture In Emotional Talking Environments (2018)
- Speaker Verification In Emotional Talking Environments Based On Third-order Circular Suprasegmental Hidden Markov Model (2019)
- Improving Speaker Verification Robustness With Synthetic Emotional Utterances (2024)
- Identifying Speakers Using Their Emotion Cues (2018)
- An Ensemble Framework Of Voice-based Emotion Recognition System For Films And TV Programs (2018)
- Novel Hybrid DNN Approaches For Speaker Verification In Emotional And Stressful Talking Environments (2021)
- Gender-dependent Emotion Recognition Based On HMMs And SPHMMs (2018)
- Two-stage Framework For Robust Speech Emotion Recognition Using Target Speaker Extraction In Human Speech Noise Conditions (2024)