EMOVIE: A Mandarin Emotion Speech Dataset With A Simple Emotional Text-to-speech Model
2021 · Chenye Cui, Yi Ren, Jinglin Liu, et al.
Abstract
Recently, there has been increasing interest in neural speech synthesis. While deep neural networks achieve state-of-the-art results in text-to-speech (TTS) tasks, generating more emotional and expressive speech has become a new challenge for researchers, owing to the scarcity of high-quality emotional speech datasets and the lack of advanced emotional TTS models. In this paper, we first briefly introduce and publicly release a Mandarin emotion speech dataset comprising 9,724 samples with audio files and human-labeled emotion annotations. We then propose a simple but efficient architecture for emotional speech synthesis called EMSpeech. Unlike models that require additional reference audio as input, our model predicts emotion labels directly from the input text and generates more expressive speech conditioned on the emotion embedding. In the experiment phase, we first validate the effectiveness of our dataset with an emotion classification task. Then we train
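The abstract describes two steps: predicting an emotion label from the input text, then conditioning synthesis on the corresponding emotion embedding. The toy sketch below illustrates that general idea only. It is not the EMSpeech architecture, whose details are not given here. The function names, the additive conditioning, the embedding table, and all numeric values are assumptions made for illustration.

```python
from typing import List, Sequence

Vector = List[float]


def predict_emotion(logits: Sequence[float]) -> int:
    """Return the index of the highest-scoring emotion class (argmax)."""
    return max(range(len(logits)), key=lambda i: logits[i])


def condition_on_emotion(
    hidden: Sequence[Vector], emotion_id: int, table: Sequence[Vector]
) -> List[Vector]:
    """Add the selected emotion embedding to every text-encoder frame.

    Additive conditioning is an assumption of this sketch, not a detail
    taken from the paper.
    """
    emb = table[emotion_id]
    return [[h + e for h, e in zip(frame, emb)] for frame in hidden]


# Hypothetical embedding table: one 2-dimensional vector per emotion class.
EMOTION_TABLE = [[0.0, 0.0], [1.0, -1.0], [0.5, 0.5]]

# Hypothetical text-encoder outputs (two frames) and emotion classifier logits.
hidden_states = [[0.5, 0.25], [1.0, 2.0]]
emotion_logits = [0.2, 1.5, -0.3]

emotion_id = predict_emotion(emotion_logits)
conditioned = condition_on_emotion(hidden_states, emotion_id, EMOTION_TABLE)
print(emotion_id, conditioned)
```

In a real system the logits would come from a trained text-based emotion classifier and the conditioned frames would feed a decoder. Here both are stand-ins, so no reference audio is needed at inference time.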
Related papers
- Emospeech: A Corpus Of Emotionally Rich And Contextually Detailed Speech Annotations (2024) · 0.00
- Construction And Evaluation Of Mandarin Multimodal Emotional Speech Database (2024) · 0.00
- A Methodology For Controlling The Emotional Expressiveness In Synthetic Speech -- A Deep Learning Approach (2019) · 5.84
- EMOVOME: A Dataset For Emotion Recognition In Spontaneous Real-life Speech (2024) · 0.00
- Msemotts: Multi-scale Emotion Transfer, Prediction, And Control For Emotional Speech Synthesis (2022) · 13.97
- Emotion Controllable Speech Synthesis Using Emotion-unlabeled Dataset With The Assistance Of Cross-domain Speech Emotion Recognition (2020) · 12.93
- EMNS /imz/ Corpus: An Emotive Single-speaker Dataset For Narrative Storytelling In Games, Television And Graphic Novels (2023) · 0.00
- Mscenespeech: A Multi-scene Speech Dataset For Expressive Speech Synthesis (2024) · 0.00