DPP-TTS: Diversifying Prosodic Features Of Speech Via Determinantal Point Processes
2023 Β· Seongho Joo, Hyukhun Koh, Kyomin Jung
Abstract
With the rapid advancement in deep generative models, recent neural Text-To-Speech(TTS) models have succeeded in synthesizing human-like speech. There have been some efforts to generate speech with various prosody beyond monotonous prosody patterns. However, previous works have several limitations. First, typical TTS models depend on the scaled sampling temperature for boosting the diversity of prosody. Speech samples generated at high sampling temperatures often lack perceptual prosodic diversity, which can adversely affect the naturalness of the speech. Second, the diversity among samples is neglected since the sampling procedure often focuses on a single speech sample rather than multiple ones. In this paper, we propose DPP-TTS: a text-to-speech model based on Determinantal Point Processes (DPPs) with a prosody diversifying module. Our TTS model is capable of generating speech samples that simultaneously consider perceptual diversity in each sample and among multiple samples. We dem
Authors
(none)
Tags
Stats
Related papers
- Diffgan-tts: High-fidelity And Efficient Text-to-speech With Denoising Diffusion Gans (2022)0.00
- Prodiff: Progressive Fast Diffusion Model For High-quality Text-to-speech (2022)0.00
- Diffstyletts: Diffusion-based Hierarchical Prosody Modeling For Text-to-speech With Diverse And Controllable Styles (2024)0.00
- DPI-TTS: Directional Patch Interaction For Fast-converging And Style Temporal Modeling In Text-to-speech (2024)2.26
- Diffprosody: Diffusion-based Latent Prosody Generation For Expressive Speech Synthesis With Prosody Conditional Adversarial Training (2023)10.07
- Phone-level Prosody Modelling With Gmm-based MDN For Diverse And Controllable Speech Synthesis (2021)9.23
- Controllable Neural Text-to-speech Synthesis Using Intuitive Prosodic Features (2020)11.76
- Styletts: A Style-based Generative Model For Natural And Diverse Text-to-speech Synthesis (2022)10.97