MM-TTS: Multi-modal Prompt Based Style Transfer For Expressive Text-to-speech Synthesis
2023 Β· Wenhao Guan, Yishuang Li, Tao Li, et al.
Abstract
The style transfer task in Text-to-Speech refers to the process of transferring style information into text content to generate corresponding speech with a specific style. However, most existing style transfer approaches are either based on fixed emotional labels or reference speech clips, which cannot achieve flexible style transfer. Recently, some methods have adopted text descriptions to guide style transfer. In this paper, we propose a more flexible multi-modal and style controllable TTS framework named MM-TTS. It can utilize any modality as the prompt in unified multi-modal prompt space, including reference speech, emotional facial images, and text descriptions, to control the style of the generated speech in a system. The challenges of modeling such a multi-modal style controllable TTS mainly lie in two aspects:1)aligning the multi-modal information into a unified style space to enable the input of arbitrary modality as the style prompt in a single system, and 2)efficiently trans
Authors
(none)
Tags
Stats
Related papers
- Promptstyle: Controllable Style Transfer For Text-to-speech With Natural Language Descriptions (2023)10.85
- Text-driven Emotional Style Control And Cross-speaker Style Transfer In Neural TTS (2022)7.81
- Instructtts: Modelling Expressive TTS In Discrete Latent Space With Natural Language Style Prompt (2023)0.00
- Multi-speaker Multi-style Text-to-speech Synthesis With Single-speaker Single-style Training Data Scenarios (2021)6.77
- Stylefusion TTS: Multimodal Style-control And Enhanced Feature Fusion For Zero-shot Text-to-speech Synthesis (2024)6.34
- Styletts: A Style-based Generative Model For Natural And Diverse Text-to-speech Synthesis (2022)10.97
- Msemotts: Multi-scale Emotion Transfer, Prediction, And Control For Emotional Speech Synthesis (2022)13.97
- Boosting Multi-speaker Expressive Speech Synthesis With Semi-supervised Contrastive Learning (2023)5.24