Burst2Vec: An Adversarial Multi-task Approach for Predicting Emotion, Age, and Origin from Vocal Bursts
2022 · Atijit Anuchitanukul, Lucia Specia
Abstract
We present Burst2Vec, our multi-task learning approach to predict emotion, age, and origin (i.e., native country/language) from vocal bursts. Burst2Vec utilises pre-trained speech representations to capture acoustic information from raw waveforms and incorporates the concept of model debiasing via adversarial training. Our models achieve a 30% relative performance gain over baselines using pre-extracted features and score the highest amongst all participants in the ICML ExVo 2022 Multi-Task Challenge.
Related papers
- Self-supervision and Learnable STRFs for Age, Emotion, and Country Prediction (2022) 0.00
- An Efficient Multitask Learning Architecture for Affective Vocal Burst Analysis (2022) 0.00
- Self-supervised Attention Networks and Uncertainty Loss Weighting for Multi-task Emotion Recognition on Vocal Bursts (2022) 0.00
- Multitask Vocal Burst Modeling with ResNets and Pre-trained Paralinguistic Conformers (2022) 0.00
- Jointly Predicting Emotion, Age, and Country Using Pre-trained Acoustic Embedding (2022) 6.77
- Self-relation Attention and Temporal Awareness for Emotion Recognition via Vocal Burst (2022) 4.18
- Dynamic Restrained Uncertainty Weighting Loss for Multitask Learning of Vocal Expression (2022) 0.00
- Classification of Vocal Bursts for ACII 2022 A-VB-Type Competition Using Convolutional Neural Networks and Deep Acoustic Embeddings (2022) 0.00