Continuous Metric Learning For Transferable Speech Emotion Recognition And Embedding Across Low-resource Languages
2022 · Sneha Das, Nicklas Leander Lund, Nicole Nadine Lønfeldt, et al.
Abstract
Speech emotion recognition (SER) refers to the technique of inferring the emotional state of an individual from speech signals. SER systems continue to garner interest due to their wide applicability. Although the domain is mainly founded on signal processing, machine learning, and deep learning, generalizing across languages remains a challenge. However, developing generalizable and transferable models is critical due to a lack of sufficient resources in terms of data and labels for languages beyond the most commonly spoken ones. To improve performance across languages, we propose a denoising autoencoder with semi-supervision using a continuous metric loss based on either activation or valence. The novelty of this work lies in our proposal of continuous metric learning, which, to the best of our knowledge, is among the first proposals on the topic. Furthermore, to address the lack of activation and valence labels in the transfer datasets, we annotate the signal samples with activation
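The abstract does not spell out the form of the continuous metric loss, so the following is only a minimal sketch of one plausible reading: embeddings of two utterances should lie about as far apart as their activation (or valence) scores differ, and this term is added to a denoising reconstruction error for the labeled samples only. The function names, the Euclidean distance, the squared-error penalty, and the `weight` parameter are all assumptions made for illustration, not the authors' formulation.

```python
import numpy as np


def continuous_metric_loss(embeddings, labels):
    """Hypothetical loss: mean squared gap between pairwise embedding
    distances and pairwise differences of a continuous label."""
    z = np.asarray(embeddings, dtype=float)   # shape (n, d)
    y = np.asarray(labels, dtype=float)       # shape (n,)
    # Pairwise Euclidean distances between embeddings, shape (n, n).
    emb_dist = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=-1)
    # Pairwise absolute differences between labels, shape (n, n).
    lab_dist = np.abs(y[:, None] - y[None, :])
    # Count each unordered pair once (strict upper triangle).
    iu = np.triu_indices(len(y), k=1)
    return float(np.mean((emb_dist[iu] - lab_dist[iu]) ** 2))


def semi_supervised_objective(clean, reconstruction, embeddings, labels,
                              labeled_mask, weight=1.0):
    """Hypothetical combined objective: reconstruction error against the
    clean input for all samples, plus the metric term for labeled samples."""
    clean = np.asarray(clean, dtype=float)
    reconstruction = np.asarray(reconstruction, dtype=float)
    mask = np.asarray(labeled_mask, dtype=bool)
    recon = float(np.mean((reconstruction - clean) ** 2))
    metric = 0.0
    if mask.sum() >= 2:  # at least one labeled pair is needed
        metric = continuous_metric_loss(np.asarray(embeddings)[mask],
                                        np.asarray(labels)[mask])
    return recon + weight * metric
```

Under these assumptions, unlabeled samples contribute only through the reconstruction term, which is what makes the setup semi-supervised; the relative weighting of the two terms is a free choice here.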
Related papers
- Towards Interpretable And Transferable Speech Emotion Recognition: Latent Representation Based Analysis Of Features, Methods And Corpora (2021)
- Improved Speech Emotion Recognition Using Transfer Learning And Spectrogram Augmentation (2021)
- On The Use Of Self-supervised Pre-trained Acoustic And Linguistic Features For Continuous Speech Emotion Recognition (2020)
- Semi-supervised Cross-lingual Speech Emotion Recognition (2022)
- EmoNet: A Transfer Learning Framework For Multi-corpus Speech Emotion Recognition (2021)
- Multilingual Speech Emotion Recognition With Multi-gating Mechanism And Neural Architecture Search (2022)
- Multi-task Semi-supervised Adversarial Autoencoding For Speech Emotion Recognition (2019)
- End-to-end Transfer Learning For Speaker-independent Cross-language And Cross-corpus Speech Emotion Recognition (2023)