Generating Speakers By Prompting Listener Impressions For Pre-trained Multi-speaker Text-to-speech Systems
2024 · Zhengyang Chen, Xuechen Liu, Erica Cooper, et al.
Abstract
This paper proposes a speech synthesis system that lets users specify and control a speaker's acoustic characteristics through prompts describing the traits of the speaker in the synthesized speech. Unlike previous approaches, our method constructs prompts from listener impressions, which are easier to collect and align more naturally with everyday descriptions of speaker traits. We adopt Low-Rank Adaptation (LoRA) to quickly adapt a pre-trained language model so that it can extract speaker-related traits from the prompt text. In addition, unlike other prompt-driven text-to-speech (TTS) systems, we separate the prompt-to-speaker module from the multi-speaker TTS system, which improves flexibility and compatibility with various pre-trained multi-speaker TTS systems. Moreover, for the prompt-to-speaker characteristic module, we compared a discriminative method with a flow-matching-based generative method, and we found that combinin
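The abstract mentions adapting a pre-trained language model with LoRA. As a rough illustration of the general idea only (not the authors' implementation), the sketch below applies a low-rank update to a single frozen linear layer in pure Python. The class name `LoRALinear`, the helper `matvec`, and the rank, scaling, and layer sizes are all hypothetical choices made for this example.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Frozen weight W (d_out x d_in) plus a low-rank update (alpha / r) * B @ A.

    Hypothetical toy layer: only A (r x d_in) and B (d_out x r) would be trained.
    """

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight              # pre-trained weight, kept frozen
        self.scale = alpha / rank         # common LoRA scaling convention
        # A starts with small random values, B starts at zero, so the
        # adapted layer initially behaves exactly like the frozen layer.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.weight, x)                # W x
        delta = matvec(self.B, matvec(self.A, x))    # B (A x)
        return [b + self.scale * d for b, d in zip(base, delta)]

    def num_trainable(self):
        """Count the parameters in A and B (the only trainable ones)."""
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


# Illustrative sizes: a 4 x 6 frozen weight adapted with rank 2.
W = [[0.1 * (i + j) for j in range(6)] for i in range(4)]
layer = LoRALinear(W, rank=2, alpha=2.0)
x = [1.0, -1.0, 0.5, 2.0, 0.0, 3.0]
y = layer.forward(x)
```

In this toy setting the adapter has 2 * 6 + 4 * 2 = 20 trainable values against 24 in the full weight; the saving grows quickly for the much larger matrices found in a language model.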
Related papers
- PromptTTS++: Controlling Speaker Identity In Prompt-based Text-to-speech Using Natural Language Descriptions (2023)
- Expressive Prompting: Improving Emotion Intensity And Speaker Consistency In Zero-shot TTS (2024)
- Retrieval Augmented Generation In Prompt-based Text-to-speech Synthesis With Context-aware Contrastive Language-audio Pretraining (2024)
- Building Speech Corpus With Diverse Voice Characteristics For Its Prompt-based Representation (2024)
- Speak, Read And Prompt: High-fidelity Text-to-speech With Minimal Supervision (2023)
- PROEMO: Prompt-driven Text-to-speech Synthesis Based On Emotion And Intensity Control (2025)
- Stable-TTS: Stable Speaker-adaptive Text-to-speech Synthesis Via Prosody Prompting (2024)
- SpeechGen: Unlocking The Generative Power Of Speech Language Models With Prompts (2023)