Universal Adaptor: Converting Mel-spectrograms Between Different Configurations For Speech Synthesis
2022 · Fan-Lin Wang, Po-Chun Hsu, Da-Rong Liu, et al.
Abstract
Most recent speech synthesis systems are composed of a synthesizer and a vocoder. However, existing synthesizers and vocoders can only be matched to acoustic features extracted with a specific configuration. Hence, arbitrary synthesizers and vocoders cannot be combined to form a complete system, let alone paired with a newly developed model. In this paper, we propose Universal Adaptor, which takes a Mel-spectrogram parametrized by a source configuration and converts it into a Mel-spectrogram parametrized by a target configuration, provided that the source and target configurations are supplied as input. Experiments show that the quality of speech synthesized from the output of Universal Adaptor is comparable to that of speech synthesized from ground-truth Mel-spectrograms, in both single-speaker and multi-speaker scenarios. Moreover, Universal Adaptor can be applied to recent TTS and voice conversion systems without degrading quality.
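To make the notion of a "configuration" concrete, the following is a minimal sketch of a naive signal-processing baseline for the same task. It is not the paper's Universal Adaptor, whose details are not given in the abstract. The `MelConfig` fields, the function names, and the two example configurations are hypothetical, and the sketch assumes linear-magnitude (not log-compressed) Mel-spectrograms built with an HTK-style Mel scale and unnormalized triangular filters. Real systems also differ in windowing, normalization, and compression, which this sketch ignores.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class MelConfig:
    """Hypothetical container for the parameters that define a Mel-spectrogram."""
    sample_rate: int
    n_fft: int
    hop_length: int
    n_mels: int
    fmin: float
    fmax: float


def hz_to_mel(f):
    # HTK-style Mel scale (an assumption; some toolkits use the Slaney variant).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(cfg):
    """Triangular filterbank of shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, cfg.sample_rate / 2.0, cfg.n_fft // 2 + 1)
    hz_pts = mel_to_hz(np.linspace(hz_to_mel(cfg.fmin), hz_to_mel(cfg.fmax), cfg.n_mels + 2))
    fb = np.zeros((cfg.n_mels, fft_freqs.size))
    for i in range(cfg.n_mels):
        lo, ctr, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (fft_freqs - lo) / (ctr - lo)
        falling = (hi - fft_freqs) / (hi - ctr)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb


def naive_convert(mel_src, src, tgt):
    """Map a (src.n_mels, T_src) Mel-spectrogram to the target configuration."""
    # 1. Approximate the linear spectrum with the pseudo-inverse of the source filterbank.
    lin = np.maximum(np.linalg.pinv(mel_filterbank(src)) @ mel_src, 0.0)

    # 2. Resample the frequency axis (in Hz) onto the target FFT bins.
    f_src = np.linspace(0.0, src.sample_rate / 2.0, src.n_fft // 2 + 1)
    f_tgt = np.linspace(0.0, tgt.sample_rate / 2.0, tgt.n_fft // 2 + 1)
    lin = np.stack(
        [np.interp(f_tgt, f_src, lin[:, t], right=0.0) for t in range(lin.shape[1])],
        axis=1,
    )

    # 3. Resample the time axis (in seconds) onto the target hop size.
    n_src = lin.shape[1]
    t_src = np.arange(n_src) * src.hop_length / src.sample_rate
    n_tgt = int(round((n_src - 1) * src.hop_length * tgt.sample_rate
                      / (src.sample_rate * tgt.hop_length))) + 1
    t_tgt = np.arange(n_tgt) * tgt.hop_length / tgt.sample_rate
    lin = np.stack([np.interp(t_tgt, t_src, row) for row in lin], axis=0)

    # 4. Apply the target filterbank.
    return mel_filterbank(tgt) @ lin


# Two illustrative configurations (values chosen for the example only).
src_cfg = MelConfig(sample_rate=22050, n_fft=1024, hop_length=256, n_mels=80, fmin=0.0, fmax=8000.0)
tgt_cfg = MelConfig(sample_rate=22050, n_fft=512, hop_length=128, n_mels=40, fmin=0.0, fmax=8000.0)
```

Because the pseudo-inverse and the interpolation both lose detail, a baseline of this kind would be expected to blur the spectrum, which gives some intuition for why a dedicated conversion model is of interest.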
Related papers
- Universal MelGAN: A Robust Neural Vocoder For High-fidelity Waveform Generation In Multiple Domains (2020)
- Multi-SpectroGAN: High-diversity And High-fidelity Spectrogram Generation With Adversarial Style Combination For Speech Synthesis (2020)
- Adapitch: Adaption Multi-speaker Text-to-speech Conditioned On Pitch Disentangling With Untranscribed Data (2022)
- High-quality Speech Synthesis Using Super-resolution Mel-spectrogram (2019)
- A Unified Speaker Adaptation Method For Speech Synthesis Using Transcribed And Untranscribed Speech With Backpropagation (2019)
- ADAPTERMIX: Exploring The Efficacy Of Mixture Of Adapters For Low-resource TTS Adaptation (2023)
- Msdtron: A High-capability Multi-speaker Speech Synthesis System For Diverse Data Using Characteristic Information (2021)
- MelGAN-VC: Voice Conversion And Audio Style Transfer On Arbitrarily Long Samples Using Spectrograms (2019)