Cross-lingual Multi-speaker Text-to-speech Synthesis For Voice Cloning Without Using Parallel Corpus For Unseen Speakers
2019 Β· Zhaoyu Liu, Brian Mak
Abstract
We investigate a novel cross-lingual multi-speaker text-to-speech synthesis approach for generating high-quality native or accented speech for native/foreign seen/unseen speakers in English and Mandarin. The system consists of three separately trained components: an x-vector speaker encoder, a Tacotron-based synthesizer and a WaveNet vocoder. It is conditioned on 3 kinds of embeddings: (1) speaker embedding so that the system can be trained with speech from many speakers will little data from each speaker; (2) language embedding with shared phoneme inputs; (3) stress and tone embedding which improves naturalness of synthesized speech, especially for a tonal language like Mandarin. By adjusting the various embeddings, MOS results show that our method can generate high-quality natural and intelligible native speech for native/foreign seen/unseen speakers. Intelligibility and naturalness of accented speech is low as expected. Speaker similarity is good for native speech from native speake
Authors
(none)
Tags
Stats
Related papers
- Learning To Speak Fluently In A Foreign Language: Multilingual Speech Synthesis And Cross-language Voice Cloning (2019)15.03
- Latent Linguistic Embedding For Cross-lingual Text-to-speech And Voice Conversion (2020)0.00
- The THU-HCSI Multi-speaker Multi-lingual Few-shot Voice Cloning System For LIMMITS'24 Challenge (2024)0.00
- Towards Natural Bilingual And Code-switched Speech Synthesis Based On Mix Of Monolingual Recordings And Cross-lingual Voice Conversion (2020)0.00
- Improve Cross-lingual Voice Cloning Using Low-quality Code-switched Data (2021)0.00
- Transfer Learning From Speaker Verification To Multispeaker Text-to-speech Synthesis (2018)0.00
- Voice Cloning: A Multi-speaker Text-to-speech Synthesis Approach Based On Transfer Learning (2021)0.00
- A Novel Cross-lingual Voice Cloning Approach With A Few Text-free Samples (2019)0.00