Evaluating Raw Waveforms With Deep Learning Frameworks For Speech Emotion Recognition
2023 · Zeynep Hilal Kilimci, Ulku Bayraktar, Ayhan Kucukmanisa
Abstract
Speech emotion recognition is a challenging task in the speech processing field. For this reason, the feature extraction process is crucial for representing and processing speech signals. In this work, we present a model that feeds raw audio files directly into deep neural networks, without any feature extraction stage, to recognize emotions on six data sets: EMO-DB, RAVDESS, TESS, CREMA, SAVEE, and TESS+RAVDESS. To demonstrate the contribution of the proposed model, traditional feature extraction techniques, namely the mel-scale spectrogram and mel-frequency cepstral coefficients, are combined with machine learning algorithms, ensemble learning methods, and deep and hybrid deep learning techniques. Support vector machine, decision tree, naive Bayes, and random forest models are evaluated as machine learning algorithms, while majority voting and stacking are assessed as ensemble learning techniques. Moreover, convolutional neural networks, lo
Related papers
- Emotion Recognition From Speech (2019)
- Emodiarize: Speaker Diarization And Emotion Identification From Speech Signals Using Convolutional Neural Networks (2023)
- Sigwavnet: Learning Multiresolution Signal Wavelet Network For Speech Emotion Recognition (2025)
- Real-time Speech Emotion Recognition Based On Syllable-level Feature Extraction (2022)
- Direct Modelling Of Speech Emotion From Raw Speech (2019)
- Feature Selection Enhancement And Feature Space Visualization For Speech-based Emotion Recognition (2022)
- An Analysis Of Large Speech Models-based Representations For Speech Emotion Recognition (2023)
- Deep Learning Based Emotion Recognition System Using Speech Features And Transcriptions (2019)