DNN-based Cross-lingual Voice Conversion Using Bottleneck Features
2019 · M Kiran Reddy, K Sreenivasa Rao
Abstract
Cross-lingual voice conversion (CLVC) is a challenging task because the source and target speakers speak different languages. This paper proposes a CLVC framework based on bottleneck features and a deep neural network (DNN). In the proposed method, bottleneck features extracted from a deep auto-encoder (DAE) are used to represent speaker-independent features of speech signals from different languages. A DNN model is trained to learn the mapping between the bottleneck features and the corresponding spectral features of the target speaker. The proposed method can capture the speaker-specific characteristics of a target speaker, and hence requires no speech data from the source speaker during training. The performance of the proposed method is evaluated using data from three Indian languages: Telugu, Tamil and Malayalam. The experimental results show that the proposed method outperforms the baseline Gaussian mixture model (GMM)-based CLVC approach.
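The data flow described in the abstract can be sketched roughly as follows. This is only an illustration of the conversion path (spectral frame, then DAE bottleneck, then target-speaker DNN), not the authors' implementation. The layer sizes, the tanh activation, the random untrained weights, and all function names (`make_layer`, `dense`, `extract_bottleneck`, `map_to_target`, `convert`) are assumptions made for the example. Training of both networks is omitted.

```python
import math
import random

random.seed(0)

def make_layer(n_in, n_out):
    """Return (weights, biases) for one dense layer; random, untrained placeholders."""
    weights = [[random.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def dense(layer, x, activation=math.tanh):
    """Apply one dense layer to a single feature vector."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [activation(o) for o in out] if activation else out

# Assumed dimensions; the abstract does not state any sizes.
SPECTRAL_DIM, HIDDEN_DIM, BOTTLENECK_DIM = 40, 64, 16

# Encoder half of a deep auto-encoder: spectral frame -> bottleneck features.
dae_encoder = [make_layer(SPECTRAL_DIM, HIDDEN_DIM),
               make_layer(HIDDEN_DIM, BOTTLENECK_DIM)]

# Target-speaker DNN: bottleneck features -> target spectral features.
target_dnn = [make_layer(BOTTLENECK_DIM, HIDDEN_DIM),
              make_layer(HIDDEN_DIM, SPECTRAL_DIM)]

def extract_bottleneck(frame):
    """Pass one spectral frame through the encoder to get bottleneck features."""
    h = frame
    for layer in dae_encoder:
        h = dense(layer, h)
    return h

def map_to_target(bottleneck):
    """Map bottleneck features to spectral features of the target speaker."""
    h = dense(target_dnn[0], bottleneck)
    return dense(target_dnn[1], h, activation=None)  # linear output layer

def convert(frames):
    """Convert a sequence of source-speaker frames, one frame at a time."""
    return [map_to_target(extract_bottleneck(f)) for f in frames]

# Five synthetic frames standing in for source-speaker spectral features.
source_frames = [[random.gauss(0, 1) for _ in range(SPECTRAL_DIM)] for _ in range(5)]
converted = convert(source_frames)
```

In this sketch only `target_dnn` would need the target speaker's data for training, which mirrors the abstract's claim that no source-speaker speech is required during training.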
Related papers
- Building Bilingual And Code-switched Voice Conversion With Limited Training Data Using Embedding Consistency Loss (2021)
- Building Multi Lingual TTS Using Cross Lingual Voice Conversion (2020)
- Time-contrastive Learning Based Deep Bottleneck Features For Text-dependent Speaker Verification (2019)
- AutoCycle-VC: Towards Bottleneck-independent Zero-shot Cross-lingual Voice Conversion (2023)
- Language Identification With Deep Bottleneck Features (2018)
- Disentangling Content And Fine-grained Prosody Information Via Hybrid ASR Bottleneck Features For Voice Conversion (2022)
- Voice Conversion Based On Cross-domain Features Using Variational Auto Encoders (2018)
- Towards Natural And Controllable Cross-lingual Voice Conversion Based On Neural TTS Model And Phonetic Posteriorgram (2021)