Sequence To Sequence Neural Speech Synthesis With Prosody Modification Capabilities
2019 · Slava Shechtman, Alex Sorin
Abstract
Modern sequence-to-sequence neural TTS systems provide close-to-natural speech quality. Such systems usually comprise a network that converts a sequence of linguistic/phonetic features into a sequence of acoustic features, cascaded with a neural vocoder. The prosody of the generated speech (i.e., phoneme durations, pitch, and loudness) is implicitly present in the acoustic features, mixed with spectral information. Although the speech sounds natural, its prosody realization is randomly chosen and cannot be easily altered. Prosody control becomes even more difficult if no prosodic labeling is present in the training data. Recently, much progress has been achieved in unsupervised speaking style learning and generation; however, human inspection is still required after training to discover and interpret the speaking styles learned by the system. In this work we introduce a fully automatic method that makes the system aware of the prosody and enables sentence-wise speaking pace a
Related papers
- Controllable Neural Text-to-speech Synthesis Using Intuitive Prosodic Features (2020)
- Applying Syntax–prosody Mapping Hypothesis And Prosodic Well-formedness Constraints To Neural Sequence-to-sequence Speech Synthesis (2022)
- Hierarchical Prosody Modeling And Control In Non-autoregressive Parallel Neural TTS (2021)
- Prosody Learning Mechanism For Speech Synthesis System Without Text Length Limit (2020)
- Prosody-controllable Spontaneous TTS With Neural HMMs (2022)
- Dynamic Prosody Generation For Speech Synthesis Using Linguistics-driven Acoustic Embedding Selection (2019)
- Controllable Sequence-to-sequence Neural TTS With LPCNet Backend For Real-time Speech Synthesis On CPU (2020)
- Investigation Of Learning Abilities On Linguistic Features In Sequence-to-sequence Text-to-speech Synthesis (2020)