Towards Building Text-to-speech Systems For The Next Billion Users
2022 · Gokul Karthik Kumar, Praveen S, Pratyush Kumar, et al.
Abstract
Deep learning based text-to-speech (TTS) systems have been evolving rapidly with advances in model architectures, training methodologies, and generalization across speakers and languages. However, these advances have not been thoroughly investigated for Indian language speech synthesis. Such investigation is computationally expensive given the number and diversity of Indian languages, relatively lower resource availability, and the diverse set of advances in neural TTS that remain untested. In this paper, we evaluate the choice of acoustic models, vocoders, supplementary loss functions, training schedules, and speaker and language diversity for Dravidian and Indo-Aryan languages. Based on this, we identify monolingual models with FastPitch and HiFi-GAN V1, trained jointly on male and female speakers to perform the best. With this setup, we train and evaluate TTS models for 13 languages and find our models to significantly improve upon existing models in all languages as measured by mea
Related papers
- Generic Indic Text-to-speech Synthesisers With Rapid Adaptation In An End-to-end Framework (2020)
- A Unified Framework For Collecting Text-to-speech Synthesis Datasets For 22 Indian Languages (2024)
- Towards Developing State-of-the-art TTS Synthesisers For 13 Indian Languages With Signal Processing Aided Alignments (2022)
- Rapid Speaker Adaptation In Low Resource Text To Speech Systems Using Synthetic Data And Transfer Learning (2023)
- IndicVoices-R: Unlocking A Massive Multilingual Multi-speaker Speech Corpus For Scaling Indian TTS (2024)
- Empowering Global Voices: A Data-efficient, Phoneme-tone Adaptive Approach To High-fidelity Speech Synthesis (2025)
- Fast And Small Footprint Hybrid HMM-HiFiGAN Based System For Speech Synthesis In Indian Languages (2023)
- Enhancing Out-of-vocabulary Performance Of Indian TTS Systems For Practical Applications Through Low-effort Data Strategies (2024)