Leveraged Mel Spectrograms Using Harmonic And Percussive Components In Speech Emotion Recognition
2023 · David Hason Rudd, Huan Huo, Guandong Xu
Abstract
Speech Emotion Recognition (SER) affective technology enables intelligent embedded devices to interact with sensitivity. Similarly, call centre employees recognise customers' emotions from their pitch, energy, and tone of voice and modify their own speech to achieve a high-quality interaction. This work explores, for the first time, the effects of the harmonic and percussive components of Mel spectrograms in SER. We attempt to leverage the Mel spectrogram by decomposing distinguishable acoustic features for exploitation in our proposed architecture, which includes a novel feature map generator algorithm, a CNN-based network feature extractor and a multi-layer perceptron (MLP) classifier. This study specifically focuses on effective data augmentation techniques for building an enriched hybrid-based feature map. This process results in a function that outputs a 2D image so that it can be used as input data for a pre-trained CNN-VGG16 feature extractor. Furthermore, we also i
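The abstract does not say how the harmonic and percussive components are obtained or how the feature map is assembled, so the following is only a minimal sketch under stated assumptions. It uses median-filtering harmonic/percussive separation, a common technique that may differ from the authors' method, on a toy array standing in for a Mel spectrogram. The function names `median_filter_1d`, `hpss`, and `to_feature_map`, the kernel size, and the three-channel stacking are all hypothetical choices made for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def median_filter_1d(S, size, axis):
    """Median-filter a 2D array along one axis (odd `size`, reflect padding)."""
    pad = size // 2
    pad_width = [(0, 0), (0, 0)]
    pad_width[axis] = (pad, pad)
    padded = np.pad(S, pad_width, mode="reflect")
    # The window dimension is appended as the last axis.
    windows = sliding_window_view(padded, size, axis=axis)
    return np.median(windows, axis=-1)


def hpss(S, size=9, eps=1e-10):
    """Split a magnitude spectrogram (freq x time) into harmonic and percussive parts.

    Assumption: median-filtering separation with soft masks, not necessarily
    the decomposition used in the paper.
    """
    harm_enh = median_filter_1d(S, size, axis=1)  # smooth across time: keeps steady tones
    perc_enh = median_filter_1d(S, size, axis=0)  # smooth across frequency: keeps clicks
    total = harm_enh**2 + perc_enh**2 + eps
    return S * (harm_enh**2 / total), S * (perc_enh**2 / total)


def to_feature_map(S, harmonic, percussive):
    """Stack the three spectrograms into an H x W x 3 image scaled to [0, 1].

    Assumption: a hypothetical layout for feeding a 3-channel CNN such as VGG16.
    """
    def norm(x):
        x = np.log1p(x)
        return (x - x.min()) / (x.max() - x.min() + 1e-10)

    return np.stack([norm(S), norm(harmonic), norm(percussive)], axis=-1)


# Toy stand-in for a Mel spectrogram: 64 bands x 80 frames.
S = np.full((64, 80), 0.01)
S[20, :] = 1.0   # steady tone: horizontal ridge (harmonic)
S[:, 40] = 1.0   # click: vertical ridge (percussive)

harmonic, percussive = hpss(S)
fmap = to_feature_map(S, harmonic, percussive)
```

In this toy case the horizontal ridge ends up almost entirely in `harmonic` and the vertical ridge in `percussive`. With real audio, `S` would be a Mel spectrogram computed from a waveform, and the resulting image would still need resizing to the input size the pre-trained network expects.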
Related papers
- Improved Speech Emotion Recognition Using Transfer Learning And Spectrogram Augmentation (2021) · 12.74
- Hybrid Data Augmentation And Deep Attention-based Dilated Convolutional-recurrent Neural Networks For Speech Emotion Recognition (2021) · 12.81
- Enhanced Speech Emotion Recognition With Efficient Channel Attention Guided Deep CNN-BiLSTM Framework (2024) · 0.00
- Speech Emotion Recognition With Dual-sequence LSTM Architecture (2019) · 15.78
- Pitch-synchronous Single Frequency Filtering Spectrogram For Speech Emotion Recognition (2019) · 11.19
- Speech Emotion Recognition Using Deep Sparse Auto-encoder Extreme Learning Machine With A New Weighting Scheme And Spectro-temporal Features Along With Classical Feature Selection And A New Quantum-inspired Dimension Reduction Method (2021) · 0.00
- Unsupervised Representations Improve Supervised Learning In Speech Emotion Recognition (2023) · 0.00
- Transforming The Embeddings: A Lightweight Technique For Speech Emotion Recognition Tasks (2023) · 7.50