Duration-aware Pause Insertion Using Pre-trained Language Model For Multi-speaker Text-to-speech
2023 · Dong Yang, Tomoki Koriyama, Yuki Saito, et al.
Abstract
Pause insertion, also known as phrase break prediction and phrasing, is an essential part of TTS systems because proper pauses with natural duration significantly enhance the rhythm and intelligibility of synthetic speech. However, conventional phrasing models ignore various speakers' different styles of inserting silent pauses, which can degrade the performance of the model trained on a multi-speaker speech corpus. To this end, we propose more powerful pause insertion frameworks based on a pre-trained language model. Our approach uses bidirectional encoder representations from transformers (BERT) pre-trained on a large-scale text corpus, injecting speaker embedding to capture various speaker characteristics. We also leverage duration-aware pause insertion for more natural multi-speaker TTS. We develop and evaluate two types of models. The first improves conventional phrasing models on the position prediction of respiratory pauses (RPs), i.e., silent pauses at word transitions without
Related papers
- PauseSpeech: Natural Speech Synthesis Via Pre-trained Language Model And Pause-based Prosody Modeling (2023)
- An Investigation Of Phrase Break Prediction In An End-to-end TTS System (2023)
- Assessing Phrase Break Of ESL Speech With Pre-trained Language Models And Large Language Models (2023)
- Leveraging The Interplay Between Syntactic And Acoustic Cues For Optimizing Korean TTS Pause Formation (2024)
- Cross-lingual Transfer Learning For Phrase Break Prediction With Multilingual Language Model (2023)
- Improving Robustness Of Spontaneous Speech Synthesis With Linguistic Speech Regularization And Pseudo-filled-pause Insertion (2022)
- Modeling Prosodic Phrasing With Multi-task Learning In Tacotron-based TTS (2020)
- Simple And Effective Multi-sentence TTS With Expressive And Coherent Prosody (2022)