TTS-by-TTS: TTS-driven Data Augmentation For Fast And High-quality Speech Synthesis
2020 · Min-Jae Hwang, Ryuichi Yamamoto, Eunwoo Song, et al.
Abstract
In this paper, we propose a text-to-speech (TTS)-driven data augmentation method for improving the quality of a non-autoregressive (non-AR) TTS system. Recently proposed non-AR models, such as FastSpeech 2, have successfully achieved fast speech synthesis. However, their quality is not satisfactory, especially when the amount of training data is insufficient. To address this problem, we propose an effective data augmentation method using a well-designed autoregressive (AR) TTS system. In this method, large-scale synthetic corpora, consisting of text-waveform pairs with phoneme durations, are generated by the AR TTS system and then used to train the target non-AR model. Perceptual listening test results showed that the proposed method significantly improved the quality of the non-AR TTS system. In particular, we augmented a five-hour training database to a 179-hour synthetic one. Using these databases, our TTS system consisting of a FastSpeech 2 acoustic model with a Parallel WaveGAN vocoder achieve
Related papers
- Improving Accented Speech Recognition Using Data Augmentation Based On Unsupervised Text-to-speech Synthesis (2024) · 0.00
- You Do Not Need More Data: Improving End-to-end Speech Recognition By Text-to-speech Data Augmentation (2020) · 11.49
- Generating Synthetic Audio Data For Attention-based Speech Recognition Systems (2019) · 12.68
- Low-resource Expressive Text-to-speech Using Data Augmentation (2020) · 11.29
- Training Data Augmentation For Dysarthric Automatic Speech Recognition By Text-to-dysarthric-speech Synthesis (2024) · 10.48
- HMM-based Data Augmentation For E2E Systems For Building Conversational Speech Synthesis Systems (2022) · 0.00
- TTS-by-TTS 2: Data-selective Augmentation For Neural Speech Synthesis Using Ranking Support Vector Machine With Variational Autoencoder (2022) · 4.52
- Frustratingly Easy Data Augmentation For Low-resource ASR (2025) · 0.00