EmoFormer: A Text-independent Speech Emotion Recognition Using A Hybrid Transformer-CNN Model
2025 · Rashedul Hasan, Meher Nigar, Nursadul Mamun, et al.
Abstract
Speech Emotion Recognition is a crucial area of research in human-computer interaction. While significant work has been done in this field, many state-of-the-art networks struggle to accurately recognize emotions in speech when the data is both speech- and speaker-independent. To address this limitation, this study proposes EmoFormer, a hybrid model combining convolutional neural networks (CNNs) with Transformer encoders to capture emotion patterns in speech data for such independent datasets. The EmoFormer network was trained and tested using the Expressive Anechoic Recordings of Speech (EARS) dataset, recently released by META. We experimented with two feature extraction techniques: Mel-frequency cepstral coefficients (MFCCs) and x-vectors. The model was evaluated on different emotion sets comprising 5, 7, 10, and 23 distinct categories. The results demonstrate that the model achieved its best performance with five emotions, attaining an accuracy of 90%, a precision of 0.92, and a recall and F1-score of 0.91. However, performance decreased as th
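The abstract reports accuracy, precision, recall, and F1-score for a multi-class emotion classifier. As a minimal sketch of how such figures can be computed from predicted and true labels, the following uses only the Python standard library. The abstract does not say how the per-class scores were averaged, so macro averaging is an assumption here, and the function name `macro_scores` and the emotion labels in the usage note are hypothetical.

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return (accuracy, macro precision, macro recall, macro F1).

    Macro averaging (unweighted mean over classes) is an assumption;
    the abstract does not state which averaging the authors used.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class was wrong
            fn[truth] += 1   # true class was missed

    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    accuracy = sum(tp.values()) / len(y_true)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n
```

For example, calling `macro_scores(["anger", "anger", "joy", "joy", "sad"], ["anger", "joy", "joy", "joy", "sad"])` scores a toy five-utterance set with one misclassified sample. With weighted averaging instead of macro averaging, the per-class values would be weighted by class frequency, which can shift the reported numbers on imbalanced emotion sets.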
Related papers
- EmoTech: A Multi-modal Speech Emotion Recognition Using Multi-source Low-level Information With Hybrid Recurrent Network (2025)
- Multi-modal Emotion Recognition By Text, Speech And Video Using Pretrained Transformers (2024)
- EmoDiarize: Speaker Diarization And Emotion Identification From Speech Signals Using Convolutional Neural Networks (2023)
- Key-sparse Transformer For Multimodal Speech Emotion Recognition (2021)
- Decoding Emotions: A Comprehensive Multilingual Study Of Speech Models For Speech Emotion Recognition (2023)
- Speech Emotion Recognition Via CNN-Transformer And Multidimensional Attention Mechanism (2024)
- Attention Based Fully Convolutional Network For Speech Emotion Recognition (2018)
- Light-SERNet: A Lightweight Fully Convolutional Neural Network For Speech Emotion Recognition (2021)